Meta: Llama 4 Scout

Llama4
Input: text
Input: image
Output: text
Released: Apr 5, 2025
Updated: May 16, 2025

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta. It activates 17 billion of its 109 billion total parameters on each forward pass and is built from 16 experts (the "16E" in its name). It accepts native multimodal input (text and image) and produces multilingual output (text and code) across 12 supported languages. Designed for assistant-style interaction and visual reasoning, Scout has a stated context length of 10 million tokens, although the providers listed below serve at most about 1.3 million, and it was trained on a corpus of roughly 40 trillion tokens.

Built for efficient local or commercial deployment, Llama 4 Scout uses early fusion to integrate text and image inputs in a single model. It is instruction-tuned for multilingual chat, captioning, and image understanding tasks. It is released under the Llama 4 Community License, its training data extends to August 2024, and it launched publicly on April 5, 2025.

1,048,576 Token Context

Process and analyze large documents and conversations.

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Vision Capabilities

Process and understand images alongside text inputs.

Available On

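| Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
|---|---|---|---|---|---|---|---|
| Lambda | lambda | 1049K | 1049K | $0.08/M | $0.30/M | 95.2 t/s | 359 ms |
| DeepInfra | deepInfra | 328K | 16K | $0.08/M | $0.30/M | 30.5 t/s | 497 ms |
| Kluster | klusterAi | 131K | 131K | $0.08/M | $0.45/M | 80.5 t/s | 761 ms |
| GMICloud | gmiCloud | 1049K | - | $0.08/M | $0.50/M | 107.4 t/s | 655 ms |
| Parasail | parasail | 158K | 158K | $0.09/M | $0.48/M | 106.2 t/s | 453 ms |
| Cent-ML | centMl | 1049K | 1049K | $0.10/M | $0.10/M | 85.6 t/s | 332 ms |
| Novita | novitaAi | 131K | 131K | $0.10/M | $0.50/M | 45.3 t/s | 900 ms |
| Groq | groq | 131K | 8K | $0.11/M | $0.34/M | 846.7 t/s | 279 ms |
| BaseTen | baseten | 1000K | 131K | $0.13/M | $0.50/M | 120.1 t/s | 265 ms |
| Fireworks | fireworks | 1049K | - | $0.15/M | $0.60/M | 80.2 t/s | 595 ms |
| Together | together | 1049K | - | $0.18/M | $0.59/M | 97.1 t/s | 570 ms |
| Google | vertex | 1311K | - | $0.25/M | $0.70/M | 116.6 t/s | 1731 ms |
| SambaNova | sambaNova | 8K | 4K | $0.40/M | $0.70/M | 676.7 t/s | 2179 ms |
| Cerebras | cerebras | 32K | 32K | $0.65/M | $0.85/M | 2103.0 t/s | 278 ms |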
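Because the providers differ in context window, price, and throughput, picking one is a small filtering problem. The following is a minimal sketch, assuming a handful of rows copied by hand from the provider listing; the `Provider` type and the `cheapest_provider` helper are hypothetical names for illustration, not part of any provider's API, and the throughput and latency figures are a snapshot that will drift over time.

```python
from typing import NamedTuple, Optional


class Provider(NamedTuple):
    name: str
    context_k: int         # context window, in thousands of tokens
    input_per_m: float     # USD per 1M input tokens
    output_per_m: float    # USD per 1M output tokens
    throughput_tps: float  # tokens per second (snapshot value)
    latency_ms: int        # latency in milliseconds (snapshot value)


# Illustrative subset of the provider listing; extend with more rows as needed.
PROVIDERS = [
    Provider("Lambda", 1049, 0.08, 0.30, 95.2, 359),
    Provider("DeepInfra", 328, 0.08, 0.30, 30.5, 497),
    Provider("Groq", 131, 0.11, 0.34, 846.7, 279),
    Provider("Cent-ML", 1049, 0.10, 0.10, 85.6, 332),
    Provider("Cerebras", 32, 0.65, 0.85, 2103.0, 278),
]


def blended_cost(p: Provider, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at provider p's per-million rates."""
    return (input_tokens * p.input_per_m + output_tokens * p.output_per_m) / 1_000_000


def cheapest_provider(
    min_context_k: int,
    input_tokens: int,
    output_tokens: int,
    min_throughput_tps: float = 0.0,
) -> Optional[Provider]:
    """Return the cheapest provider meeting the context and throughput floors.

    Ties are resolved by list order; returns None if no provider qualifies.
    """
    candidates = [
        p for p in PROVIDERS
        if p.context_k >= min_context_k and p.throughput_tps >= min_throughput_tps
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda p: blended_cost(p, input_tokens, output_tokens))
```

For an input-heavy request that needs at least a 200K context (100,000 input tokens, 2,000 output tokens), this picks Lambda; for an output-heavy request, Cent-ML's flat $0.10/M rate wins instead. Adding `min_throughput_tps=500` with a 100K context floor narrows the choice to Groq.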
Standard Pricing
Input Tokens: $0.00000008 per token ($0.08 per 1M tokens)
Output Tokens: $0.0000003 per token ($0.30 per 1M tokens)
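The standard pricing figures equal the lowest per-million rates in the provider listing ($0.08/M input, $0.30/M output) divided by one million, which makes them per-token prices. As a minimal sketch of that arithmetic, assuming those two rates, the snippet below converts a per-million rate to a per-token price and estimates the cost of a single request. The function names are illustrative, and `Decimal` is used so the tiny per-token values stay exact.

```python
from decimal import Decimal

# Assumed rates: the lowest per-million prices in the provider listing.
INPUT_PER_M = Decimal("0.08")   # USD per 1M input tokens
OUTPUT_PER_M = Decimal("0.30")  # USD per 1M output tokens
MILLION = Decimal(1_000_000)


def per_token(rate_per_million: Decimal) -> Decimal:
    """Convert a USD-per-1M-tokens rate to USD per single token."""
    return rate_per_million / MILLION


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated USD cost of one request at the assumed standard rates."""
    return (
        input_tokens * per_token(INPUT_PER_M)
        + output_tokens * per_token(OUTPUT_PER_M)
    )
```

At these rates a request with 50,000 input tokens and 1,000 output tokens costs $0.0043, and filling a 1,048,576-token context once costs a little under $0.09 in input tokens.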