Models.do

Llama 3.2 1B is a 1-billion-parameter language model designed for natural language tasks such as summarization, dialogue, and multilingual text analysis. Its small size lets it run in low-resource environments while maintaining strong task performance.

Llama 3.2 1B supports eight core languages and can be fine-tuned for more, which suits businesses or developers who need a lightweight model for multilingual settings without the computational demands of larger models.

Usage of this model is subject to Meta's Acceptable Use Policy.

Provider	Model ID	Context	Max Output	Input Cost	Output Cost	Throughput	Latency
DeepInfra	deepInfra	131K	16K	$0.01/M	$0.01/M	117.5 t/s	558 ms
InferenceNet	inferenceNet	16K	16K	$0.01/M	$0.01/M	225.4 t/s	786 ms
Cloudflare	cloudflare	60K	-	$0.03/M	$0.20/M	66.3 t/s	411 ms
SambaNova	sambaNova	16K	4K	$0.04/M	$0.08/M	3386.9 t/s	330 ms
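
The pricing and speed columns above can be turned into rough per-request estimates. The following is a minimal sketch, not an official client: it assumes "/M" means US dollars per million tokens, that "Latency" is the delay before the first output token, and that "Throughput" is the steady output rate. The names `PROVIDERS`, `estimate_cost`, and `estimate_seconds` are hypothetical and exist only for this illustration.

```python
# Figures copied from the provider table above.
# Tuple layout: (input $/M tokens, output $/M tokens, throughput t/s, latency ms)
PROVIDERS = {
    "deepInfra": (0.01, 0.01, 117.5, 558),
    "inferenceNet": (0.01, 0.01, 225.4, 786),
    "cloudflare": (0.03, 0.20, 66.3, 411),
    "sambaNova": (0.04, 0.08, 3386.9, 330),
}


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD, assuming prices are per million tokens."""
    input_price, output_price, _, _ = PROVIDERS[model_id]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def estimate_seconds(model_id: str, output_tokens: int) -> float:
    """Estimated wall-clock time, assuming latency is time to first token
    and throughput is the output token rate after that."""
    _, _, throughput, latency_ms = PROVIDERS[model_id]
    return latency_ms / 1000 + output_tokens / throughput


if __name__ == "__main__":
    for model_id in PROVIDERS:
        cost = estimate_cost(model_id, input_tokens=2_000, output_tokens=500)
        seconds = estimate_seconds(model_id, output_tokens=500)
        print(f"{model_id}: ${cost:.6f}, about {seconds:.2f} s")
```

Under these assumptions, the cheapest provider and the fastest provider for a given request need not be the same, so it can help to compute both figures for a typical request size before choosing.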

Meta: Llama 3.2 1B Instruct

131,072 Token Context

Advanced Coding

Agentic Workflows
