Meta: Llama 3.3 70B Instruct
Llama3
Input: text
Output: text
Released: Dec 6, 2024 • Updated: Mar 28, 2025
Meta Llama 3.3 is a multilingual large language model (LLM): a pretrained and instruction-tuned generative model at the 70B size (text in/text out). The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many available open-source and closed chat models on common industry benchmarks.
Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
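For illustration, here is a minimal sketch of a multilingual chat request to this model, assuming an OpenAI-compatible chat-completions endpoint. The base URL, API key, and model slug below are placeholders, not values documented on this page; substitute whatever your provider specifies.

```python
# Minimal sketch of a multilingual chat request, assuming an
# OpenAI-compatible endpoint. The base_url, api_key, and model slug
# are placeholders; use the values your provider documents.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-instruct",  # assumed slug; check your provider's model ID
    messages=[
        {"role": "system", "content": "You are a concise multilingual assistant."},
        # French is one of the supported languages listed above.
        {"role": "user", "content": "Explique la différence entre latence et débit en une phrase."},
    ],
    max_tokens=256,
)

print(response.choices[0].message.content)
```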
131,000 Token Context
Process and analyze large documents and conversations.
Advanced Coding
Improved capabilities in front-end development and full-stack updates.
Agentic Workflows
Autonomously navigate multi-step processes with improved reliability.
Available On
| Provider | Model ID | Context | Max Output | Input Cost ($/M tokens) | Output Cost ($/M tokens) | Throughput | Latency |
|---|---|---|---|---|---|---|---|
| kluster.ai | klusterAi | 131K | 131K | $0.07/M | $0.33/M | 32.6 t/s | 646 ms |
| DeepInfra | deepInfra | 131K | 16K | $0.08/M | $0.25/M | 33.3 t/s | 264 ms |
| inference.net | inferenceNet | 128K | 16K | $0.10/M | $0.25/M | 17.9 t/s | 1811 ms |
| Lambda | lambda | 131K | 131K | $0.12/M | $0.30/M | 61.5 t/s | 480 ms |
| Phala | phala | 131K | - | $0.12/M | $0.35/M | 30.8 t/s | 682 ms |
| NovitaAI | novitaAi | 131K | - | $0.13/M | $0.39/M | 83.2 t/s | 656 ms |
| Nebius AI Studio | nebiusAiStudio | 131K | - | $0.13/M | $0.40/M | 40.9 t/s | 627 ms |
| Parasail | parasail | 131K | 131K | $0.28/M | $0.78/M | 78.5 t/s | 480 ms |
| Cloudflare | cloudflare | 24K | - | $0.29/M | $2.25/M | 33.9 t/s | 657 ms |
| CentML | centMl | 131K | 131K | $0.35/M | $0.35/M | 71.7 t/s | 645 ms |
| Hyperbolic | hyperbolic | 131K | - | $0.40/M | $0.40/M | 38.4 t/s | 1188 ms |
| Atoma | atoma | 105K | 100K | $0.40/M | $0.40/M | 28.8 t/s | 857 ms |
| Groq | groq | 131K | 33K | $0.59/M | $0.79/M | 353.8 t/s | 269 ms |
| Friendli | friendli | 131K | 131K | $0.60/M | $0.60/M | 100.7 t/s | 753 ms |
| NextBit | nextBit | 33K | - | $0.60/M | $0.75/M | 33.4 t/s | 2122 ms |
| SambaNova | sambaNova | 131K | 3K | $0.60/M | $1.20/M | 323.7 t/s | 645 ms |
| Google Vertex | vertex | 128K | - | $0.72/M | $0.72/M | 69.7 t/s | 781 ms |
| Cerebras | cerebras | 32K | 32K | $0.85/M | $1.20/M | 2910.1 t/s | 151 ms |
| Together | together | 131K | 2K | $0.88/M | $0.88/M | 98.8 t/s | 436 ms |
| Fireworks | fireworks | 131K | - | $0.90/M | $0.90/M | 105.2 t/s | 1810 ms |
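As a rough way to read the throughput and latency columns together, the sketch below estimates end-to-end generation time as latency plus output tokens divided by throughput. The figures are the snapshot values from the table above, and the additive model is a simplification that ignores queueing, batching, and streaming effects.

```python
# Rough generation-time estimate from the table above:
# total_time ≈ latency + output_tokens / throughput.
# Figures are snapshot values; the additive model is an approximation.
providers = {
    # name: (throughput in tokens/s, latency in seconds)
    "Cerebras": (2910.1, 0.151),
    "Groq": (353.8, 0.269),
    "DeepInfra": (33.3, 0.264),
}

output_tokens = 1000
for name, (tps, latency_s) in providers.items():
    total_s = latency_s + output_tokens / tps
    print(f"{name}: ~{total_s:.1f} s for {output_tokens} output tokens")
```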
Standard Pricing
Input Tokens: $0.00000007 per token ($0.07 per 1M tokens)
Output Tokens: $0.00000033 per token ($0.33 per 1M tokens)
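To turn these per-token rates into a per-request cost, multiply each token count by its rate; for example, a 2,000-token prompt with an 800-token reply works out to roughly $0.0004 at the standard rates above.

```python
# Cost estimate at the standard rates above ($0.07/M input, $0.33/M output).
INPUT_RATE = 0.07 / 1_000_000   # dollars per input token
OUTPUT_RATE = 0.33 / 1_000_000  # dollars per output token

input_tokens, output_tokens = 2000, 800
cost = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
print(f"Estimated cost: ${cost:.6f}")  # ~$0.000404
```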