Meta: Llama 3.3 70B Instruct

Llama3
Input: text
Output: text
Released: Dec 6, 2024
Updated: Mar 28, 2025

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters (text in, text out). The instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many of the available open-source and closed chat models on common industry benchmarks.

Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

Model Card

131,072 Token Context

Process and analyze large documents and conversations.

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Available On

Provider      Model ID        Context  Max Output  Input Cost  Output Cost  Throughput  Latency
DeepInfra     deepInfraTurbo  131K     16K         $0.07/M     $0.25/M      28.6 t/s    263 ms
Kluster       klusterAi       131K     131K        $0.07/M     $0.33/M      35.7 t/s    826 ms
Lambda        lambda          131K     131K        $0.12/M     $0.30/M      61.9 t/s    379 ms
Phala         phala           131K     -           $0.12/M     $0.35/M      34.7 t/s    562 ms
Novita        novitaAi        131K     120K        $0.13/M     $0.39/M      73.2 t/s    698 ms
Crusoe        crusoe          131K     2K          $0.13/M     $0.40/M      31.2 t/s    1030 ms
Nebius        nebiusAiStudio  131K     -           $0.13/M     $0.40/M      34.2 t/s    649 ms
DeepInfra     deepInfra       131K     -           $0.23/M     $0.40/M      25.4 t/s    404 ms
Parasail      parasail        131K     131K        $0.28/M     $0.78/M      85.3 t/s    459 ms
NextBit       nextBit         33K      -           $0.28/M     $2.20/M      27.0 t/s    2967 ms
Cloudflare    cloudflare      24K      -           $0.29/M     $2.25/M      33.6 t/s    487 ms
Cent-ML       centMl          131K     131K        $0.35/M     $0.35/M      71.0 t/s    629 ms
InoCloud      inoCloud        131K     131K        $0.35/M     $0.50/M      30.3 t/s    1144 ms
Hyperbolic    hyperbolic      131K     -           $0.40/M     $0.40/M      36.9 t/s    1145 ms
Groq          groq            131K     33K         $0.59/M     $0.79/M      383.9 t/s   444 ms
Friendli      friendli        131K     131K        $0.60/M     $0.60/M      107.5 t/s   749 ms
SambaNova     sambaNova       131K     3K          $0.60/M     $1.20/M      347.1 t/s   790 ms
Google        vertex          128K     -           $0.72/M     $0.72/M      81.0 t/s    380 ms
Cerebras      cerebras        32K      32K         $0.85/M     $1.20/M      2926.8 t/s  268 ms
Together      together        131K     2K          $0.88/M     $0.88/M      52.1 t/s    1121 ms
Fireworks     fireworks       131K     -           $0.90/M     $0.90/M      74.7 t/s    705 ms
InferenceNet  inferenceNet    128K     16K         $0.10/M     $0.25/M      -           -
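
The listing involves a trade-off between price, context window, and throughput. As a rough illustration only, the sketch below picks the cheapest offer that meets a minimum context and throughput requirement. The `Offer` class, the `cheapest` helper, and the 3:1 input-to-output blending ratio are assumptions made for this example, not part of any provider's API, and only a handful of rows from the listing are copied in.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Offer:
    """One row of the provider listing (hypothetical structure for this sketch)."""
    provider: str
    model_id: str
    context_k: int                    # context window, in thousands of tokens
    input_usd_per_m: float            # USD per 1M input tokens
    output_usd_per_m: float           # USD per 1M output tokens
    throughput_tps: Optional[float]   # tokens/second; None where the listing shows "-"


# A few rows copied from the listing; not the full table.
OFFERS = [
    Offer("DeepInfra", "deepInfraTurbo", 131, 0.07, 0.25, 28.6),
    Offer("Lambda", "lambda", 131, 0.12, 0.30, 61.9),
    Offer("Groq", "groq", 131, 0.59, 0.79, 383.9),
    Offer("Cerebras", "cerebras", 32, 0.85, 1.20, 2926.8),
    Offer("InferenceNet", "inferenceNet", 128, 0.10, 0.25, None),
]


def cheapest(offers: List[Offer], min_context_k: int,
             min_tps: float = 0.0, input_share: float = 0.75) -> Optional[Offer]:
    """Return the eligible offer with the lowest blended price, or None.

    The blended price weights input and output cost by `input_share`
    (0.75 means an assumed 3:1 input-to-output token mix).
    """
    def blended(o: Offer) -> float:
        return input_share * o.input_usd_per_m + (1 - input_share) * o.output_usd_per_m

    eligible = [
        o for o in offers
        if o.context_k >= min_context_k
        # Offers with unknown throughput only qualify when no minimum is requested.
        and (o.throughput_tps or 0.0) >= min_tps
    ]
    return min(eligible, key=blended) if eligible else None


if __name__ == "__main__":
    best = cheapest(OFFERS, min_context_k=100, min_tps=50)
    print(best.provider if best else "no offer meets the constraints")
```

Throughput and latency figures change over time, so a selection like this is only as current as the numbers fed into it.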

Standard Pricing

Input Tokens
$0.00000007 per token ($0.07 per 1M tokens)

Output Tokens
$0.00000025 per token ($0.25 per 1M tokens)
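
To make the pricing concrete, here is a minimal sketch of the cost arithmetic for a single request. It uses the lowest rates in the provider listing ($0.07 per 1M input tokens, $0.25 per 1M output tokens) as defaults. The function name `request_cost_usd` and the token counts in the example are made up for illustration.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_usd_per_m: float = 0.07,
                     output_usd_per_m: float = 0.25) -> float:
    """Estimate the cost of one request from per-million-token rates."""
    total_micro_usd = input_tokens * input_usd_per_m + output_tokens * output_usd_per_m
    return total_micro_usd / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 10,000 prompt tokens and 2,000 generated tokens.
    print(f"${request_cost_usd(10_000, 2_000):.6f}")
```

Passing another provider's rates as `input_usd_per_m` and `output_usd_per_m` gives the equivalent estimate for that provider.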