
Meta: Llama 3.3 70B Instruct

Llama3
Input: text
Output: text
Released: Dec 6, 2024
Updated: Mar 28, 2025

Meta Llama 3.3 is a multilingual large language model (LLM): a pretrained and instruction-tuned generative model with 70B parameters (text in, text out). The instruction-tuned, text-only model is optimized for multilingual dialogue and outperforms many of the available open-source and closed chat models on common industry benchmarks.

Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.


131,000 Token Context

Process and analyze large documents and conversations.

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Available On

Provider          Model ID        Context  Max Output  Input Cost  Output Cost  Throughput  Latency
kluster.ai        klusterAi       131K     131K        $0.07/M     $0.33/M      32.6 t/s    646 ms
DeepInfra         deepInfra       131K     16K         $0.08/M     $0.25/M      33.3 t/s    264 ms
Lambda            lambda          131K     131K        $0.12/M     $0.30/M      61.5 t/s    480 ms
Phala             phala           131K     -           $0.12/M     $0.35/M      30.8 t/s    682 ms
NovitaAI          novitaAi        131K     -           $0.13/M     $0.39/M      83.2 t/s    656 ms
Nebius AI Studio  nebiusAiStudio  131K     -           $0.13/M     $0.40/M      40.9 t/s    627 ms
Parasail          parasail        131K     131K        $0.28/M     $0.78/M      78.5 t/s    480 ms
Cloudflare        cloudflare      24K      -           $0.29/M     $2.25/M      33.9 t/s    657 ms
CentML            centMl          131K     131K        $0.35/M     $0.35/M      71.7 t/s    645 ms
Hyperbolic        hyperbolic      131K     -           $0.40/M     $0.40/M      38.4 t/s    1188 ms
Atoma             atoma           105K     100K        $0.40/M     $0.40/M      28.8 t/s    857 ms
Groq              groq            131K     33K         $0.59/M     $0.79/M      353.8 t/s   269 ms
Friendli          friendli        131K     131K        $0.60/M     $0.60/M      100.7 t/s   753 ms
NextBit           nextBit         33K      -           $0.60/M     $0.75/M      33.4 t/s    2122 ms
SambaNova         sambaNova       131K     3K          $0.60/M     $1.20/M      323.7 t/s   645 ms
Google Vertex     vertex          128K     -           $0.72/M     $0.72/M      69.7 t/s    781 ms
Cerebras          cerebras        32K      32K         $0.85/M     $1.20/M      2910.1 t/s  151 ms
Together          together        131K     2K          $0.88/M     $0.88/M      98.8 t/s    436 ms
Fireworks         fireworks       131K     -           $0.90/M     $0.90/M      105.2 t/s   1810 ms
inference.net     inferenceNet    128K     16K         $0.10/M     $0.25/M      17.9 t/s    1811 ms
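
One way to read the provider table is to filter on the constraints that matter for a workload and then rank what remains by price. The sketch below is illustrative only: it copies five rows from the table, and the `Provider` class, the `cheapest` helper, and the 75/25 input-to-output token split used for the blended price are all assumptions made for this example, not part of any provider's API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Provider:
    name: str
    context_k: int               # context window, in thousands of tokens
    max_output_k: Optional[int]  # None where the table shows "-"
    input_per_m: float           # USD per 1M input tokens
    output_per_m: float          # USD per 1M output tokens
    throughput_tps: float        # tokens per second
    latency_ms: int


# A subset of rows copied from the table above.
PROVIDERS = [
    Provider("kluster.ai", 131, 131, 0.07, 0.33, 32.6, 646),
    Provider("DeepInfra", 131, 16, 0.08, 0.25, 33.3, 264),
    Provider("Lambda", 131, 131, 0.12, 0.30, 61.5, 480),
    Provider("Groq", 131, 33, 0.59, 0.79, 353.8, 269),
    Provider("Cerebras", 32, 32, 0.85, 1.20, 2910.1, 151),
]


def cheapest(
    providers: Sequence[Provider],
    min_context_k: int,
    min_throughput_tps: float,
    input_share: float = 0.75,  # assumed share of input tokens in the workload
) -> Optional[Provider]:
    """Return the eligible provider with the lowest blended price, or None."""

    def blended(p: Provider) -> float:
        # Weighted average of the input and output rates (USD per 1M tokens).
        return input_share * p.input_per_m + (1 - input_share) * p.output_per_m

    eligible = [
        p
        for p in providers
        if p.context_k >= min_context_k and p.throughput_tps >= min_throughput_tps
    ]
    return min(eligible, key=blended) if eligible else None
```

For example, `cheapest(PROVIDERS, min_context_k=100, min_throughput_tps=50)` picks Lambda from this subset, since only Lambda and Groq satisfy both constraints and Lambda has the lower blended rate. The figures are a snapshot and will drift as providers change pricing.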
Standard Pricing

Input Tokens
$0.00000007

per token ($0.07 per 1M tokens)

Output Tokens
$0.00000033

per token ($0.33 per 1M tokens)
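
To make the pricing concrete, here is a minimal sketch of how per-million-token rates such as $0.07/M (input) and $0.33/M (output) convert into the cost of a single request. The function name `request_cost_usd` and the token counts are hypothetical; rates are passed as strings so that `Decimal` avoids floating-point rounding on very small amounts.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_per_m: str,
    output_per_m: str,
) -> Decimal:
    """Cost in USD of one request, given rates in USD per 1M tokens."""
    input_cost = Decimal(input_tokens) * Decimal(input_per_m)
    output_cost = Decimal(output_tokens) * Decimal(output_per_m)
    return (input_cost + output_cost) / TOKENS_PER_MILLION


# Hypothetical request: a 100,000-token prompt with a 2,000-token reply.
cost = request_cost_usd(100_000, 2_000, "0.07", "0.33")
print(f"${cost}")
```

At these rates the example request costs $0.00766: $0.007 for the input tokens plus $0.00066 for the output tokens.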
