Meta: Llama 4 Scout
Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta that activates 17 billion of its 109 billion total parameters per token. It accepts native multimodal input (text and image) and produces multilingual output (text and code) across 12 supported languages. Designed for assistant-style interaction and visual reasoning, Scout has 16 experts, only a subset of which is active for any given token, and features a context length of 10 million tokens. It was trained on a corpus of ~40 trillion tokens.
Built for high efficiency and local or commercial deployment, Llama 4 Scout incorporates early fusion for seamless modality integration. It is instruction-tuned for multilingual chat, captioning, and image understanding tasks. Released under the Llama 4 Community License, it has a training data cutoff of August 2024 and launched publicly on April 5, 2025.
1,048,576 Token Context
Process and analyze large documents and conversations.
Advanced Coding
Improved capabilities in front-end development and full-stack updates.
Agentic Workflows
Autonomously navigate multi-step processes with improved reliability.
Vision Capabilities
Process and understand images alongside text inputs.
Available On
Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
---|---|---|---|---|---|---|---|
Lambda | lambda | 1049K | 1049K | $0.08/M | $0.30/M | 96.5 t/s | 629 ms |
DeepInfra | deepInfra | 328K | 16K | $0.08/M | $0.30/M | 31.0 t/s | 480 ms |
Kluster | klusterAi | 131K | 131K | $0.08/M | $0.45/M | 78.8 t/s | 782 ms |
GMICloud | gmiCloud | 1049K | - | $0.08/M | $0.50/M | 111.3 t/s | 572 ms |
Parasail | parasail | 158K | 158K | $0.09/M | $0.48/M | 106.2 t/s | 444 ms |
Cent-ML | centMl | 1049K | 1049K | $0.10/M | $0.10/M | 80.1 t/s | 373 ms |
Novita | novitaAi | 131K | 131K | $0.10/M | $0.50/M | 70.0 t/s | 879 ms |
Groq | groq | 131K | 8K | $0.11/M | $0.34/M | 801.8 t/s | 353 ms |
BaseTen | baseten | 1000K | 131K | $0.13/M | $0.50/M | 124.3 t/s | 246 ms |
Fireworks | fireworks | 1049K | - | $0.15/M | $0.60/M | 78.9 t/s | 634 ms |
Together | together | 1049K | - | $0.18/M | $0.59/M | 100.2 t/s | 544 ms |
- | vertex | 1311K | - | $0.25/M | $0.70/M | 117.7 t/s | 1850 ms |
SambaNova | sambaNova | 8K | 4K | $0.40/M | $0.70/M | 694.4 t/s | 1897 ms |
Cerebras | cerebras | 32K | 32K | $0.65/M | $0.85/M | 3291.0 t/s | 420 ms |
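
Because the table quotes prices per million tokens, the cost of a single request depends on both its input and output token counts. The following is a minimal sketch of that arithmetic, using three rows copied from the table. The names `PRICES_PER_MILLION` and `request_cost` are illustrative and not part of any provider's API, and the sketch assumes flat per-token pricing with no tiers, caching discounts, or separate image pricing.

```python
# Prices in USD per million tokens as (input, output),
# copied from three rows of the provider table.
PRICES_PER_MILLION = {
    "lambda": (0.08, 0.30),
    "groq": (0.11, 0.34),
    "cerebras": (0.65, 0.85),
}


def request_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from per-million-token prices.

    Assumes flat pricing: every input token costs the same, and every
    output token costs the same (an assumption, not a provider guarantee).
    """
    input_price, output_price = PRICES_PER_MILLION[model_id]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Compare the same hypothetical workload across providers:
    # a 100,000-token prompt with a 2,000-token completion.
    for model_id in PRICES_PER_MILLION:
        cost = request_cost(model_id, input_tokens=100_000, output_tokens=2_000)
        print(f"{model_id}: ${cost:.4f}")
```

For long-prompt workloads the input price dominates the total, so the ordering by input cost in the table is a reasonable first filter. A provider's context and max output limits still need to be checked against the request size.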