Meta: Llama 3.3 70B Instruct
Llama3
Input: text
Output: text
Released: Dec 6, 2024 • Updated: Mar 28, 2025
Meta Llama 3.3 is a multilingual large language model (LLM): a pretrained and instruction-tuned generative model at the 70B size (text in/text out). The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many available open-source and closed chat models on common industry benchmarks.
Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
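For illustration, here is a minimal sketch of a multilingual chat request to this model, assuming an OpenAI-compatible chat-completions endpoint. The base URL, API key, and model slug below are placeholders, not values documented on this page; substitute whatever your provider specifies.

```python
# Minimal sketch of a multilingual chat request, assuming an
# OpenAI-compatible endpoint. The base_url, api_key, and model slug
# are placeholders; use the values your provider documents.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-instruct",  # assumed slug; check your provider's model ID
    messages=[
        {"role": "system", "content": "You are a concise multilingual assistant."},
        # French is one of the supported languages listed above.
        {"role": "user", "content": "Explique la différence entre latence et débit en une phrase."},
    ],
    max_tokens=256,
)

print(response.choices[0].message.content)
```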
131,000 Token Context
Process and analyze large documents and conversations.
Advanced Coding
Improved capabilities in front-end development and full-stack updates.
Agentic Workflows
Autonomously navigate multi-step processes with improved reliability.
Available On
| Provider | Model ID | Context | Max Output | Input Cost ($/M tokens) | Output Cost ($/M tokens) | Throughput | Latency |
|---|---|---|---|---|---|---|---|
| kluster.ai | klusterAi | 131K | 131K | $0.07/M | $0.33/M | 32.6 t/s | 646 ms |
| DeepInfra | deepInfra | 131K | 16K | $0.08/M | $0.25/M | 33.3 t/s | 264 ms |
| inference.net | inferenceNet | 128K | 16K | $0.10/M | $0.25/M | 17.9 t/s | 1811 ms |
| Lambda | lambda | 131K | 131K | $0.12/M | $0.30/M | 61.5 t/s | 480 ms |
| Phala | phala | 131K | - | $0.12/M | $0.35/M | 30.8 t/s | 682 ms |
| NovitaAI | novitaAi | 131K | - | $0.13/M | $0.39/M | 83.2 t/s | 656 ms |
| Nebius AI Studio | nebiusAiStudio | 131K | - | $0.13/M | $0.40/M | 40.9 t/s | 627 ms |
| Parasail | parasail | 131K | 131K | $0.28/M | $0.78/M | 78.5 t/s | 480 ms |
| Cloudflare | cloudflare | 24K | - | $0.29/M | $2.25/M | 33.9 t/s | 657 ms |
| CentML | centMl | 131K | 131K | $0.35/M | $0.35/M | 71.7 t/s | 645 ms |
| Hyperbolic | hyperbolic | 131K | - | $0.40/M | $0.40/M | 38.4 t/s | 1188 ms |
| Atoma | atoma | 105K | 100K | $0.40/M | $0.40/M | 28.8 t/s | 857 ms |
| Groq | groq | 131K | 33K | $0.59/M | $0.79/M | 353.8 t/s | 269 ms |
| Friendli | friendli | 131K | 131K | $0.60/M | $0.60/M | 100.7 t/s | 753 ms |
| NextBit | nextBit | 33K | - | $0.60/M | $0.75/M | 33.4 t/s | 2122 ms |
| SambaNova | sambaNova | 131K | 3K | $0.60/M | $1.20/M | 323.7 t/s | 645 ms |
| Google Vertex | vertex | 128K | - | $0.72/M | $0.72/M | 69.7 t/s | 781 ms |
| Cerebras | cerebras | 32K | 32K | $0.85/M | $1.20/M | 2910.1 t/s | 151 ms |
| Together | together | 131K | 2K | $0.88/M | $0.88/M | 98.8 t/s | 436 ms |
| Fireworks | fireworks | 131K | - | $0.90/M | $0.90/M | 105.2 t/s | 1810 ms |
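As a rough way to read the throughput and latency columns together, the sketch below estimates end-to-end generation time as latency plus output tokens divided by throughput. The figures are the snapshot values from the table above, and the additive model is a simplification that ignores queueing, batching, and streaming effects.

```python
# Rough generation-time estimate from the table above:
# total_time ≈ latency + output_tokens / throughput.
# Figures are snapshot values; the additive model is an approximation.
providers = {
    # name: (throughput in tokens/s, latency in seconds)
    "Cerebras": (2910.1, 0.151),
    "Groq": (353.8, 0.269),
    "DeepInfra": (33.3, 0.264),
}

output_tokens = 1000
for name, (tps, latency_s) in providers.items():
    total_s = latency_s + output_tokens / tps
    print(f"{name}: ~{total_s:.1f} s for {output_tokens} output tokens")
```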
Standard Pricing
Input Tokens: $0.00000007 per token ($0.07 per 1M tokens)
Output Tokens: $0.00000033 per token ($0.33 per 1M tokens)
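To turn these per-token rates into a per-request cost, multiply each token count by its rate; for example, a 2,000-token prompt with an 800-token reply works out to roughly $0.0004 at the standard rates above.

```python
# Cost estimate at the standard rates above ($0.07/M input, $0.33/M output).
INPUT_RATE = 0.07 / 1_000_000   # dollars per input token
OUTPUT_RATE = 0.33 / 1_000_000  # dollars per output token

input_tokens, output_tokens = 2000, 800
cost = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
print(f"Estimated cost: ${cost:.6f}")  # ~$0.000404
```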