Meta: Llama 3.2 3B Instruct

Llama3

Input: text

Output: text

Released: Sep 25, 2024 • Updated: Mar 28, 2025

Llama 3.2 3B is a 3-billion-parameter multilingual large language model optimized for natural language processing tasks such as dialogue generation, reasoning, and summarization. Built on a transformer architecture, it supports eight languages, including English, Spanish, and Hindi, and can be adapted to additional languages.

Trained on 9 trillion tokens, the Llama 3.2 3B model is tuned for instruction following, complex reasoning, and tool use. Its balance of accuracy and efficiency suits multilingual text-generation applications where both matter.

See Meta's original model card for full details.

Usage of this model is subject to Meta's Acceptable Use Policy.

131,072 Token Context

Process and analyze large documents and conversations.

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Available On

Provider	Model ID	Context	Max Output	Input Cost	Output Cost	Throughput	Latency
DeepInfra	deepInfra	131K	16K	$0.01/M	$0.02/M	101.3 t/s	454 ms
Lambda	lambda	131K	131K	$0.01/M	$0.02/M	242.7 t/s	280 ms
InferenceNet	inferenceNet	16K	16K	$0.02/M	$0.02/M	111.7 t/s	824 ms
Novita	novitaAi	33K	32K	$0.03/M	$0.05/M	115.7 t/s	654 ms
Cloudflare	cloudflare	128K	-	$0.05/M	$0.34/M	180.0 t/s	315 ms
Together	together	131K	16K	$0.06/M	$0.06/M	164.0 t/s	292 ms
SambaNova	sambaNova	4K	4K	$0.08/M	$0.16/M	3139.5 t/s	295 ms
Hyperbolic	hyperbolic	131K	-	$0.10/M	$0.10/M	112.6 t/s	1244 ms
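
The throughput and latency columns above can be combined into a rough response-time estimate. The following is a minimal sketch, not a benchmark: it assumes the latency figure is the time to the first token and the throughput figure is a steady generation rate, neither of which the table states. The `Provider` class and the `estimated_seconds` and `fastest` helpers are hypothetical names introduced for illustration, and context and output limits are kept in the rounded thousands shown in the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provider:
    name: str
    context_k: int               # context window in thousands of tokens, as listed
    max_output_k: Optional[int]  # None where the table shows "-"
    throughput: float            # tokens per second
    latency_ms: float            # milliseconds (assumed: time to first token)


# Values copied from the provider table.
PROVIDERS = [
    Provider("DeepInfra", 131, 16, 101.3, 454),
    Provider("Lambda", 131, 131, 242.7, 280),
    Provider("InferenceNet", 16, 16, 111.7, 824),
    Provider("Novita", 33, 32, 115.7, 654),
    Provider("Cloudflare", 128, None, 180.0, 315),
    Provider("Together", 131, 16, 164.0, 292),
    Provider("SambaNova", 4, 4, 3139.5, 295),
    Provider("Hyperbolic", 131, None, 112.6, 1244),
]


def estimated_seconds(p: Provider, output_tokens: int) -> float:
    """Latency plus generation time at the listed throughput."""
    return p.latency_ms / 1000 + output_tokens / p.throughput


def fastest(min_context_k: int, output_tokens: int) -> Provider:
    """Fastest provider whose listed limits fit the request.

    A missing max-output value is treated as "unknown" and not filtered out.
    """
    candidates = [
        p for p in PROVIDERS
        if p.context_k >= min_context_k
        and (p.max_output_k is None or output_tokens <= p.max_output_k * 1000)
    ]
    if not candidates:
        raise ValueError("no listed provider satisfies the requested limits")
    return min(candidates, key=lambda p: estimated_seconds(p, output_tokens))


if __name__ == "__main__":
    choice = fastest(min_context_k=100, output_tokens=500)
    print(choice.name, round(estimated_seconds(choice, 500), 2), "s")
```

Under these assumptions, a short reply favors the high-throughput provider when the prompt fits its small context window, while long-context requests narrow the choice to providers listing roughly 128K or more.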

Standard Pricing

Input Tokens: $0.01 per 1M tokens

Output Tokens: $0.02 per 1M tokens
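
As a worked example of the per-token arithmetic, the sketch below computes the cost of a single request at the standard rates listed here ($0.01 per 1M input tokens, $0.02 per 1M output tokens). The function name `request_cost_usd` is a hypothetical helper, and the token counts are illustrative; actual billing may round or meter differently.

```python
from decimal import Decimal

# Standard rates from the listing, in USD per 1M tokens.
INPUT_USD_PER_M = Decimal("0.01")
OUTPUT_USD_PER_M = Decimal("0.02")
TOKENS_PER_M = Decimal(1_000_000)


def request_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Cost of one request: each token count scaled by its per-million rate."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Illustrative request: a 100,000-token prompt and a 2,000-token reply.
    # (100,000 * 0.01 + 2,000 * 0.02) / 1,000,000 = 0.00104 USD
    print(request_cost_usd(100_000, 2_000))
```

`Decimal` is used instead of floats so that very small per-request amounts add up exactly when summed over many requests.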
