Meta: Llama 3.2 3B Instruct
Llama 3.2 3B Instruct is a 3-billion-parameter multilingual large language model tuned for dialogue generation, reasoning, and summarization. Built on an optimized transformer architecture, it officially supports eight languages, including English, Spanish, and Hindi, and can be adapted to additional languages.
Trained on up to 9 trillion tokens, the model targets instruction following, complex reasoning, and tool use. Its small size and balanced performance suit applications that need both accuracy and efficiency in multilingual text generation.
See Meta's original model card for full details.
Usage of this model is subject to Meta's Acceptable Use Policy.
131,072 Token Context
Process and analyze large documents and conversations.
Advanced Coding
Improved capabilities in front-end development and full-stack updates.
Agentic Workflows
Autonomously navigate multi-step processes with improved reliability.
Available On
Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
---|---|---|---|---|---|---|---|
DeepInfra | deepInfra | 131K | 16K | $0.01/M | $0.02/M | 104.1 t/s | 222 ms |
Nebius AI Studio | nebiusAiStudio | 131K | - | $0.01/M | $0.02/M | 101.3 t/s | 194 ms |
Lambda | lambda | 131K | 131K | $0.01/M | $0.02/M | 283.3 t/s | 175 ms |
inference.net | inferenceNet | 16K | 16K | $0.02/M | $0.02/M | 84.9 t/s | 883 ms |
NovitaAI | novitaAi | 33K | - | $0.03/M | $0.05/M | 100.9 t/s | 527 ms |
Cloudflare | cloudflare | 128K | - | $0.05/M | $0.34/M | 171.3 t/s | 398 ms |
Together | together | 131K | 16K | $0.06/M | $0.06/M | 169.0 t/s | 418 ms |
SambaNova | sambaNova | 4K | 4K | $0.08/M | $0.16/M | 1978.0 t/s | 421 ms |
Hyperbolic | hyperbolic | 131K | - | $0.10/M | $0.10/M | 99.6 t/s | 985 ms |
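
As a rough illustration of how the table can be used, the sketch below estimates the cost of one request and picks the cheapest provider whose context window is large enough. It is only a sketch under stated assumptions: the `Provider` class and the helper names are hypothetical, "K" in the Context column is read as 1,000 tokens, the context window is assumed to cover input plus output tokens, prices are read as USD per million tokens (the "/M" suffix), and the Max Output column is ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Provider:
    """One row of the provider table (hypothetical helper type)."""
    name: str
    context_tokens: int      # "Context" column, with K read as 1,000 tokens (assumption)
    input_usd_per_m: float   # "Input Cost" column, USD per million tokens
    output_usd_per_m: float  # "Output Cost" column, USD per million tokens


# Values copied from the table above.
PROVIDERS: List[Provider] = [
    Provider("DeepInfra", 131_000, 0.01, 0.02),
    Provider("Nebius AI Studio", 131_000, 0.01, 0.02),
    Provider("Lambda", 131_000, 0.01, 0.02),
    Provider("inference.net", 16_000, 0.02, 0.02),
    Provider("NovitaAI", 33_000, 0.03, 0.05),
    Provider("Cloudflare", 128_000, 0.05, 0.34),
    Provider("Together", 131_000, 0.06, 0.06),
    Provider("SambaNova", 4_000, 0.08, 0.16),
    Provider("Hyperbolic", 131_000, 0.10, 0.10),
]


def request_cost(p: Provider, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD of a single request at the listed rates."""
    total = input_tokens * p.input_usd_per_m + output_tokens * p.output_usd_per_m
    return total / 1_000_000


def cheapest_provider(
    providers: List[Provider], input_tokens: int, output_tokens: int
) -> Optional[Provider]:
    """Cheapest provider whose context fits the request, or None if none fits.

    Assumes the context window must hold input and output tokens together.
    Ties are resolved in table order.
    """
    fits = [p for p in providers if input_tokens + output_tokens <= p.context_tokens]
    if not fits:
        return None
    return min(fits, key=lambda p: request_cost(p, input_tokens, output_tokens))
```

For example, a request with 100,000 input tokens and 2,000 output tokens at the lowest listed rate ($0.01/M input, $0.02/M output) comes to roughly $0.00104, and only providers listing a context of 128K or more can serve it. Throughput and latency are left out of the selection here; a real choice would likely weigh them as well.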