Meta: Llama 3.3 70B Instruct
Input: text
Output: text
Released: Dec 6, 2024 · Updated: Mar 28, 2025
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters (text in, text out). The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many available open-source and closed chat models on common industry benchmarks.
Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
131,072 Token Context
Process and analyze large documents and conversations.
Advanced Coding
Improved capabilities in front-end development and full-stack updates.
Agentic Workflows
Autonomously navigate multi-step processes with improved reliability.
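Most of the providers listed below expose this model through an OpenAI-compatible chat-completions API. As a minimal sketch of what such a request body looks like (the model identifier here is illustrative, not confirmed by this listing; substitute the Model ID your provider actually uses):

```python
import json

# Hypothetical model identifier -- replace with the Model ID exposed by
# your chosen provider (see the table below).
DEFAULT_MODEL = "meta-llama/llama-3.3-70b-instruct"

def build_chat_request(prompt: str, model: str = DEFAULT_MODEL,
                       max_tokens: int = 512) -> str:
    """Build a JSON body for an OpenAI-compatible /chat/completions call."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return json.dumps(payload)

# The serialized body would then be POSTed to the provider's endpoint
# with your API key in the Authorization header.
print(build_chat_request("Summarize this document in French."))
```

The payload shape is the same across compatible providers; only the base URL, API key, and model identifier change.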
Available On
Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
---|---|---|---|---|---|---|---|
DeepInfra | deepInfraTurbo | 131K | 16K | $0.07/M | $0.25/M | 31.0 t/s | 286 ms |
Kluster | klusterAi | 131K | 131K | $0.07/M | $0.33/M | 25.6 t/s | 699 ms |
Lambda | lambda | 131K | 131K | $0.12/M | $0.30/M | 60.8 t/s | 334 ms |
Phala | phala | 131K | - | $0.12/M | $0.35/M | 25.2 t/s | 705 ms |
Novita | novitaAi | 131K | 120K | $0.13/M | $0.39/M | 41.2 t/s | 761 ms |
Crusoe | crusoe | 131K | 2K | $0.13/M | $0.40/M | 25.1 t/s | 1309 ms |
Nebius | nebiusAiStudio | 131K | - | $0.13/M | $0.40/M | 35.9 t/s | 566 ms |
DeepInfra | deepInfra | 131K | - | $0.23/M | $0.40/M | 26.4 t/s | 586 ms |
Parasail | parasail | 131K | 131K | $0.28/M | $0.78/M | 65.3 t/s | 567 ms |
NextBit | nextBit | 33K | - | $0.28/M | $2.20/M | 25.7 t/s | 2337 ms |
Cloudflare | cloudflare | 24K | - | $0.29/M | $2.25/M | 24.8 t/s | 625 ms |
Cent-ML | centMl | 131K | 131K | $0.35/M | $0.35/M | 90.7 t/s | 509 ms |
InoCloud | inoCloud | 131K | 131K | $0.35/M | $0.50/M | 27.7 t/s | 1030 ms |
Hyperbolic | hyperbolic | 131K | - | $0.40/M | $0.40/M | 30.4 t/s | 1270 ms |
Atoma | atoma | 105K | 100K | $0.40/M | $0.40/M | 31.6 t/s | 765 ms |
Groq | groq | 131K | 33K | $0.59/M | $0.79/M | 379.8 t/s | 314 ms |
Friendli | friendli | 131K | 131K | $0.60/M | $0.60/M | 113.4 t/s | 610 ms |
SambaNova | sambaNova | 131K | 3K | $0.60/M | $1.20/M | 346.6 t/s | 697 ms |
Google Vertex | vertex | 128K | - | $0.72/M | $0.72/M | 77.9 t/s | 221 ms |
Cerebras | cerebras | 32K | 32K | $0.85/M | $1.20/M | 2254.1 t/s | 256 ms |
Together | together | 131K | 2K | $0.88/M | $0.88/M | 129.1 t/s | 517 ms |
Fireworks | fireworks | 131K | - | $0.90/M | $0.90/M | 104.4 t/s | 707 ms |
InferenceNet | inferenceNet | 128K | 16K | $0.10/M | $0.25/M | 18.1 t/s | 1692 ms |
Standard Pricing
Input Tokens
$0.00000007
per token ($0.07/M tokens)
Output Tokens
$0.00000025
per token ($0.25/M tokens)
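The per-token figures are simply the per-million rates divided by 1,000,000. A small sketch of estimating a request's cost at the standard rates above (the 100K-in / 1K-out example is illustrative):

```python
# Standard Pricing rates in USD per 1M tokens ($0.07 in, $0.25 out).
INPUT_RATE_PER_M = 0.07
OUTPUT_RATE_PER_M = 0.25

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the standard rates."""
    return (input_tokens * INPUT_RATE_PER_M
            + output_tokens * OUTPUT_RATE_PER_M) / 1_000_000

# e.g. a 100K-token document summarized into 1K tokens of output:
# 100_000 * 0.07/1e6 + 1_000 * 0.25/1e6 = 0.007 + 0.00025 = $0.00725
print(f"${request_cost(100_000, 1_000):.5f}")
```

Per-provider costs differ; substitute the rates from the provider table to compare.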