Meta: Llama 4 Scout

Llama4
Input: text
Input: image
Output: text
Released: Apr 5, 2025
Updated: May 16, 2025

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta. It has 109 billion total parameters spread across 16 experts, of which 17 billion are active for any given token. It accepts native multimodal input (text and image) and produces multilingual text and code output across 12 supported languages. Designed for assistant-style interaction and visual reasoning, Scout supports a context length of up to 10 million tokens and was trained on a corpus of roughly 40 trillion tokens.

Built for high efficiency and for local or commercial deployment, Llama 4 Scout uses early fusion to integrate text and image inputs. It is instruction-tuned for multilingual chat, captioning, and image understanding tasks. Released under the Llama 4 Community License, it has a training data cutoff of August 2024 and launched publicly on April 5, 2025.
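As a quick check on the parameter figures in the description, the share of the model that is active for a token can be worked out directly. This is a minimal sketch using the rounded counts quoted above; the constant and function names are illustrative and do not come from Meta.

```python
# Rounded figures from the model description, in billions of parameters.
ACTIVE_PARAMS_B = 17
TOTAL_PARAMS_B = 109


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the fraction of total parameters that are active per token."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"{fraction:.1%} of parameters are active per token")
```

Roughly one sixth of the weights take part in each token's computation, which is why an MoE model of this size can be cheaper to run than a dense model with the same total parameter count.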

1,048,576 Token Context

Process and analyze large documents and conversations.

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Vision Capabilities

Process and understand images alongside text inputs.

Available On

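| Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
|---|---|---|---|---|---|---|---|
| Lambda | lambda | 1049K | 1049K | $0.08/M | $0.30/M | 96.2 t/s | 481 ms |
| DeepInfra | deepInfra | 328K | 16K | $0.08/M | $0.30/M | 38.9 t/s | 744 ms |
| kluster.ai | klusterAi | 131K | 131K | $0.08/M | $0.45/M | 72.3 t/s | 755 ms |
| Parasail | parasail | 158K | 158K | $0.09/M | $0.48/M | 101.6 t/s | 604 ms |
| CentML | centMl | 1049K | 1049K | $0.10/M | $0.10/M | 84.0 t/s | 319 ms |
| NovitaAI | novitaAi | 131K | 131K | $0.10/M | $0.50/M | 48.1 t/s | 995 ms |
| Groq | groq | 131K | 8K | $0.11/M | $0.34/M | 685.2 t/s | 316 ms |
| Fireworks | fireworks | 1049K | - | $0.15/M | $0.60/M | 96.3 t/s | 522 ms |
| GMICloud | gmiCloud | 1049K | - | $0.15/M | $0.60/M | 107.6 t/s | 791 ms |
| Together | together | 1049K | - | $0.18/M | $0.59/M | 103.3 t/s | 388 ms |
| Cerebras | cerebras | 32K | 32K | $0.65/M | $0.85/M | 1760.9 t/s | 384 ms |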
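The per-million-token prices in the provider table can be turned into a per-request estimate. The sketch below is illustrative only: the `Provider` class and both helper functions are hypothetical names, only four rows of the table are copied in, "K" is read as 1,000 tokens, and a request is assumed to fit when input plus output tokens stay within the listed context. The Max Output column is ignored here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Provider:
    name: str
    context_tokens: int       # listed context window, reading "K" as 1,000 (assumption)
    input_usd_per_m: float    # USD per million input tokens
    output_usd_per_m: float   # USD per million output tokens


# A subset of rows copied from the provider table.
PROVIDERS = [
    Provider("Lambda", 1_049_000, 0.08, 0.30),
    Provider("CentML", 1_049_000, 0.10, 0.10),
    Provider("Groq", 131_000, 0.11, 0.34),
    Provider("Cerebras", 32_000, 0.65, 0.85),
]


def request_cost(p: Provider, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from per-million-token prices."""
    total = input_tokens * p.input_usd_per_m + output_tokens * p.output_usd_per_m
    return total / 1_000_000


def cheapest(
    providers: Sequence[Provider], input_tokens: int, output_tokens: int
) -> Optional[Provider]:
    """Return the cheapest provider whose context fits the request, or None."""
    eligible = [
        p for p in providers if p.context_tokens >= input_tokens + output_tokens
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda p: request_cost(p, input_tokens, output_tokens))


best = cheapest(PROVIDERS, input_tokens=200_000, output_tokens=2_000)
if best is not None:
    cost = request_cost(best, 200_000, 2_000)
    print(f"{best.name}: ${cost:.4f}")
```

The cheapest provider depends on the mix of input and output tokens: a long-document request is dominated by the input price, while a generation-heavy request favors providers with a low output price.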
Standard Pricing
Input Tokens: $0.00000008 per token ($0.08 per million tokens)

Output Tokens: $0.0000003 per token ($0.30 per million tokens)