DeepSeek: R1 Distill Llama 8B
Llama 3
Input: text
Output: text
Released: Feb 7, 2025 • Updated: May 2, 2025
DeepSeek R1 Distill Llama 8B is a distilled large language model based on Llama-3.1-8B-Instruct, trained on outputs from DeepSeek R1. It achieves strong performance across multiple benchmarks, including:
- AIME 2024 pass@1: 50.4
- MATH-500 pass@1: 89.1
- CodeForces Rating: 1205
This fine-tuning on DeepSeek R1's outputs lets the 8B model deliver performance competitive with much larger frontier models; a sketch of the general idea follows below.
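The listing does not describe the training pipeline in detail. The sketch below only illustrates the general idea of distillation by supervised fine-tuning, in which a Llama "student" is trained on reasoning traces sampled from the DeepSeek R1 "teacher"; the dataset file, chat formatting, and hyperparameters are hypothetical and are not DeepSeek's actual recipe.

```python
# Minimal, hypothetical sketch of distillation as supervised fine-tuning:
# a Llama "student" is trained on (prompt, response) pairs where the responses
# were sampled from the DeepSeek R1 "teacher". File name, chat formatting and
# hyperparameters are illustrative assumptions, not DeepSeek's actual recipe.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

student = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(student)
tokenizer.pad_token = tokenizer.eos_token          # Llama has no pad token by default
model = AutoModelForCausalLM.from_pretrained(student)

# Hypothetical JSONL file of {"prompt": ..., "response": ...} pairs,
# where each response is a chain-of-thought answer generated by DeepSeek R1.
data = load_dataset("json", data_files="r1_teacher_outputs.jsonl")["train"]

def render(example):
    # Serialize each pair with the student's chat template.
    messages = [{"role": "user", "content": example["prompt"]},
                {"role": "assistant", "content": example["response"]}]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=4096)

data = data.map(render)
data = data.map(tokenize, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="r1-distill-llama-8b-sft",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=16,
                           learning_rate=1e-5,
                           num_train_epochs=2,
                           bf16=True),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```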
Hugging Face: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B
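The weights are openly available, so the model can also be run locally. Below is a minimal sketch using the Hugging Face transformers library; the repository ID matches DeepSeek's public release, while the prompt and generation settings are illustrative assumptions rather than values from this page.

```python
# Minimal sketch: load the open DeepSeek-R1-Distill-Llama-8B weights and generate.
# The prompt and sampling settings are illustrative, not taken from this listing.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16,
                                             device_map="auto")

messages = [{"role": "user", "content": "What is 17 * 24? Think step by step."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                          return_tensors="pt").to(model.device)

# R1-style distills emit their reasoning inside <think>...</think> tags
# before the final answer, so allow plenty of new tokens.
output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.6)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```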
- 32,000 Token Context: Process and analyze large documents and conversations.
- Hybrid Reasoning: Choose between rapid responses and extended, step-by-step processing for complex tasks.
- Advanced Coding: Improved capabilities in front-end development and full-stack tasks.
- Agentic Workflows: Autonomously navigate multi-step processes with improved reliability.
Available On
| Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
|---|---|---|---|---|---|---|---|
| NovitaAI | novitaAi | 32K | 32K | $0.04/M | $0.04/M | 47.0 t/s | 1485 ms |
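Calls to the listed provider typically go through an OpenAI-compatible chat completions API. The sketch below assumes such an endpoint; the base URL, model slug, and environment variable name are placeholders to verify against NovitaAI's documentation.

```python
# Hypothetical call through an OpenAI-compatible endpoint hosted by NovitaAI.
# Base URL, model slug and env var name are assumptions; consult NovitaAI's docs.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.novita.ai/v3/openai",     # assumed endpoint
    api_key=os.environ["NOVITA_API_KEY"],           # assumed environment variable
)

response = client.chat.completions.create(
    model="deepseek/deepseek-r1-distill-llama-8b",  # assumed provider model ID
    messages=[{"role": "user", "content": "Explain pass@1 in one sentence."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```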
Standard Pricing
- Input Tokens: $0.04 per 1M tokens ($0.00000004 per token)
- Output Tokens: $0.04 per 1M tokens ($0.00000004 per token)
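At $0.04 per million tokens on both input and output, per-request costs are easy to estimate; the short sketch below does the arithmetic for a hypothetical request.

```python
# Cost estimate at the listed rate of $0.04 per 1M tokens for input and output
# (equivalently $0.00000004 per token). The token counts are made up.
PRICE_PER_TOKEN = 0.04 / 1_000_000

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single request at the listed NovitaAI rates."""
    return (input_tokens + output_tokens) * PRICE_PER_TOKEN

# Example: a 2,000-token prompt plus a 1,000-token reasoning-heavy completion.
print(f"${request_cost(2_000, 1_000):.6f}")  # $0.000120
```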