DeepSeek: R1 Distill Llama 8B

Llama3
Input: text
Output: text
Released: Feb 7, 2025
Updated: May 2, 2025

DeepSeek R1 Distill Llama 8B is a distilled large language model based on Llama-3.1-8B-Instruct and fine-tuned on outputs from DeepSeek R1. Reported benchmark results include:

  • AIME 2024 pass@1: 50.4
  • MATH-500 pass@1: 89.1
  • CodeForces Rating: 1205

Because it is fine-tuned on DeepSeek R1's outputs, the model is described as competitive with larger frontier models on these benchmarks.

Hugging Face:

32,000 Token Context

Process and analyze large documents and conversations.

Hybrid Reasoning

Choose between rapid responses and extended, step-by-step processing for complex tasks.

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Available On

Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency
NovitaAI | novitaAi | 32K | 32K | $0.04/M | $0.04/M | 47.0 t/s | 1485 ms
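
As a rough illustration of what the listed throughput and latency mean in practice, the sketch below estimates how long a response might take. It assumes the 1485 ms latency is the delay before the first token and that tokens then stream at a steady 47.0 t/s; neither assumption is confirmed by the listing, and the function name `estimate_response_seconds` is hypothetical.

```python
# Figures copied from the provider row; treat them as point-in-time measurements.
THROUGHPUT_TOKENS_PER_S = 47.0  # listed throughput
LATENCY_MS = 1485               # listed latency; ASSUMED to be the delay before the first token


def estimate_response_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate: initial latency plus steady streaming time."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return LATENCY_MS / 1000 + output_tokens / THROUGHPUT_TOKENS_PER_S


if __name__ == "__main__":
    for n in (100, 1000, 8000):
        print(f"{n:>5} output tokens: ~{estimate_response_seconds(n):.1f} s")
```

Real response times will vary with load and prompt length, so this is only useful for order-of-magnitude planning.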
Standard Pricing
Input Tokens
$0.00000004

per token

Output Tokens
$0.00000004

per token
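
To make the pricing concrete, here is a minimal sketch of a per-request cost estimate. It uses the $0.04 per million tokens rate from the provider table for both input and output; the function name `estimate_cost_usd` and the example token counts are hypothetical, and actual billing may differ.

```python
from decimal import Decimal

# Rates taken from the provider table ($0.04/M for both directions).
INPUT_USD_PER_MILLION = Decimal("0.04")
OUTPUT_USD_PER_MILLION = Decimal("0.04")
MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request in US dollars."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / MILLION


if __name__ == "__main__":
    # Hypothetical request: a 20,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost_usd(20_000, 2_000)}")
```

`Decimal` is used instead of floats so that very small per-token amounts do not pick up rounding noise.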