Meta: Llama 3.3 70B Instruct

Llama3
Input: text
Output: text
Released: Dec 6, 2024
Updated: Mar 28, 2025

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters (text in, text out). The instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many of the available open-source and closed chat models on common industry benchmarks.

Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

Model Card

131,072 Token Context

Process and analyze large documents and conversations.
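As a rough illustration of working within the 131,072-token window, the sketch below computes how many tokens remain for the response after a prompt of a given size. It assumes that prompt and completion tokens share one context window, which is typical for this class of model but is not stated on this page; the function name `remaining_output_budget` is hypothetical, and token counts are taken as inputs rather than computed with a tokenizer.

```python
CONTEXT_WINDOW = 131_072  # tokens, as listed for this model


def remaining_output_budget(prompt_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens are left for the completion.

    Assumption: prompt and completion tokens count against the same window.
    Returns 0 when the prompt alone already fills or exceeds the window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


if __name__ == "__main__":
    # Hypothetical example: a 120,000-token document leaves room for a short answer.
    print(remaining_output_budget(120_000))
```

Individual providers may impose a smaller context or a separate maximum output length, so the provider's own limits should be checked as well.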

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Available On

Provider      Model ID        Context  Max Output  Input Cost  Output Cost  Throughput  Latency
DeepInfra     deepInfraTurbo  131K     16K         $0.07/M     $0.25/M      31.0 t/s    286 ms
Kluster       klusterAi       131K     131K        $0.07/M     $0.33/M      25.6 t/s    699 ms
Lambda        lambda          131K     131K        $0.12/M     $0.30/M      60.8 t/s    334 ms
Phala         phala           131K     -           $0.12/M     $0.35/M      25.2 t/s    705 ms
Novita        novitaAi        131K     120K        $0.13/M     $0.39/M      41.2 t/s    761 ms
Crusoe        crusoe          131K     2K          $0.13/M     $0.40/M      25.1 t/s    1309 ms
Nebius        nebiusAiStudio  131K     -           $0.13/M     $0.40/M      35.9 t/s    566 ms
DeepInfra     deepInfra       131K     -           $0.23/M     $0.40/M      26.4 t/s    586 ms
Parasail      parasail        131K     131K        $0.28/M     $0.78/M      65.3 t/s    567 ms
NextBit       nextBit         33K      -           $0.28/M     $2.20/M      25.7 t/s    2337 ms
Cloudflare    cloudflare      24K      -           $0.29/M     $2.25/M      24.8 t/s    625 ms
Cent-ML       centMl          131K     131K        $0.35/M     $0.35/M      90.7 t/s    509 ms
InoCloud      inoCloud        131K     131K        $0.35/M     $0.50/M      27.7 t/s    1030 ms
Hyperbolic    hyperbolic      131K     -           $0.40/M     $0.40/M      30.4 t/s    1270 ms
Atoma         atoma           105K     100K        $0.40/M     $0.40/M      31.6 t/s    765 ms
Groq          groq            131K     33K         $0.59/M     $0.79/M      379.8 t/s   314 ms
Friendli      friendli        131K     131K        $0.60/M     $0.60/M      113.4 t/s   610 ms
SambaNova     sambaNova       131K     3K          $0.60/M     $1.20/M      346.6 t/s   697 ms
Google        vertex          128K     -           $0.72/M     $0.72/M      77.9 t/s    221 ms
Cerebras      cerebras        32K      32K         $0.85/M     $1.20/M      2254.1 t/s  256 ms
Together      together        131K     2K          $0.88/M     $0.88/M      129.1 t/s   517 ms
Fireworks     fireworks       131K     -           $0.90/M     $0.90/M      104.4 t/s   707 ms
InferenceNet  inferenceNet    128K     16K         $0.10/M     $0.25/M      18.1 t/s    1692 ms
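
The provider figures above trade cost against speed, and one way to reason about that trade-off is sketched below. It uses four rows copied from the table and picks the cheapest provider whose estimated response time fits a deadline. The time estimate assumes that Latency means time to first token and that Throughput means output tokens per second; the page does not define either column, so treat both as assumptions. The `Provider` class and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Provider:
    model_id: str
    input_usd_per_m: float   # USD per million input tokens
    output_usd_per_m: float  # USD per million output tokens
    throughput_tps: float    # assumed: output tokens per second
    latency_ms: float        # assumed: time to first token, in milliseconds


# A subset of rows copied from the provider table.
PROVIDERS = [
    Provider("deepInfraTurbo", 0.07, 0.25, 31.0, 286),
    Provider("lambda", 0.12, 0.30, 60.8, 334),
    Provider("groq", 0.59, 0.79, 379.8, 314),
    Provider("cerebras", 0.85, 1.20, 2254.1, 256),
]


def request_cost(p: Provider, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the provider's listed rates."""
    return (input_tokens * p.input_usd_per_m + output_tokens * p.output_usd_per_m) / 1_000_000


def estimated_seconds(p: Provider, output_tokens: int) -> float:
    """Rough wall-clock estimate: latency plus generation time."""
    return p.latency_ms / 1000 + output_tokens / p.throughput_tps


def cheapest_within(
    providers: Sequence[Provider],
    input_tokens: int,
    output_tokens: int,
    max_seconds: float,
) -> Optional[Provider]:
    """Return the cheapest provider that meets the deadline, or None if none does."""
    candidates = [p for p in providers if estimated_seconds(p, output_tokens) <= max_seconds]
    return min(
        candidates,
        key=lambda p: request_cost(p, input_tokens, output_tokens),
        default=None,
    )


if __name__ == "__main__":
    # Hypothetical workload: 2,000 prompt tokens, 1,000 completion tokens, 5-second deadline.
    choice = cheapest_within(PROVIDERS, 2_000, 1_000, max_seconds=5.0)
    print(choice.model_id if choice else "no provider meets the deadline")
```

The listed throughput and latency are point-in-time measurements, so a real selection would likely need fresh numbers and a check against each provider's context and maximum output limits.
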

Standard Pricing
Input Tokens
$0.00000007

per token ($0.07 per 1M tokens)

Output Tokens
$0.00000025

per token ($0.25 per 1M tokens)
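
To make the pricing concrete, here is a minimal sketch of a per-request cost estimate. It assumes the standard rates correspond to $0.07 per million input tokens and $0.25 per million output tokens, matching the lowest rates in the provider table; the function name `estimate_cost` and the example workload are hypothetical. `Decimal` is used so that very small per-token amounts are not distorted by binary floating point.

```python
from decimal import Decimal

# Assumed standard rates in USD per million tokens (lowest rates in the provider table).
INPUT_USD_PER_M = Decimal("0.07")
OUTPUT_USD_PER_M = Decimal("0.25")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request at the assumed standard rates."""
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_M / MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_M / MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 100,000 prompt tokens and 20,000 completion tokens.
    print(estimate_cost(100_000, 20_000))
```

Actual billing depends on the provider that serves the request, so the result is an estimate rather than an invoice figure.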