Do Services-as-Software

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts. Of its 400 billion total parameters, 17 billion are active per forward pass. It accepts multilingual text and image input and produces multilingual text and code output across 12 supported languages. Optimized for vision-language tasks, Maverick is instruction-tuned for assistant-like behavior, image reasoning, and general-purpose multimodal interaction.

Maverick uses early fusion for native multimodality and offers a 1-million-token context window (1,048,576 tokens). It was trained on a curated mixture of public, licensed, and Meta-platform data totaling roughly 22 trillion tokens, with a knowledge cutoff of August 2024. Released on April 5, 2025 under the Llama 4 Community License, Maverick is suited to research and commercial applications that require advanced multimodal understanding and high model throughput.

Available On

Provider	Model ID	Context (tokens)	Max Output (tokens)	Input Cost (USD per 1M tokens)	Output Cost (USD per 1M tokens)	Throughput (tokens/s)	Latency
DeepInfra	deepInfra	1049K	16K	$0.15/M	$0.60/M	95.7 t/s	278 ms
Parasail	parasail	1049K	1049K	$0.15/M	$0.85/M	160.1 t/s	255 ms
Kluster	klusterAi	1049K	1049K	$0.16/M	$0.80/M	120.2 t/s	767 ms
Novita	novitaAi	1049K	1049K	$0.17/M	$0.85/M	65.4 t/s	424 ms
Lambda	lambda	1049K	1049K	$0.18/M	$0.60/M	153.1 t/s	287 ms
BaseTen	baseten	1000K	131K	$0.19/M	$0.72/M	203.6 t/s	153 ms
Cent-ML	centMl	1049K	1049K	$0.20/M	$0.20/M	72.3 t/s	257 ms
Groq	groq	131K	8K	$0.20/M	$0.60/M	1193.5 t/s	254 ms
NCompass	nCompass	400K	400K	$0.20/M	$0.70/M	142.9 t/s	94 ms
Fireworks	fireworks	1049K	-	$0.22/M	$0.88/M	95.8 t/s	489 ms
GMICloud	gmiCloud	1049K	-	$0.25/M	$0.80/M	155.3 t/s	529 ms
Together	together	1049K	-	$0.27/M	$0.85/M	88.0 t/s	299 ms
Google	vertex	524K	-	$0.35/M	$1.15/M	102.4 t/s	795 ms
DeepInfra	deepInfraTurbo	8K	-	$0.50/M	$0.50/M	411.8 t/s	475 ms
SambaNova	sambaNova	131K	4K	$0.63/M	$1.80/M	640.9 t/s	952 ms
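
To compare providers for a specific workload, the per-token prices and speed figures in the table can be combined into a rough cost and time estimate. The sketch below is illustrative only. It assumes that "/M" means US dollars per million tokens, that throughput is measured in output tokens per second, and that latency is the delay before the first output token. The `Provider` class and both helper functions are hypothetical names introduced here, and only three rows of the table are copied in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    """One row of the provider table (hypothetical helper type)."""
    model_id: str
    input_usd_per_m: float   # assumed: USD per 1M input tokens
    output_usd_per_m: float  # assumed: USD per 1M output tokens
    throughput_tps: float    # assumed: output tokens per second
    latency_ms: float        # assumed: delay before the first output token


# Values copied from three rows of the table above.
PROVIDERS = {
    "deepInfra": Provider("deepInfra", 0.15, 0.60, 95.7, 278),
    "baseten": Provider("baseten", 0.19, 0.72, 203.6, 153),
    "groq": Provider("groq", 0.20, 0.60, 1193.5, 254),
}


def request_cost_usd(p: Provider, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of a single request in USD."""
    total = input_tokens * p.input_usd_per_m + output_tokens * p.output_usd_per_m
    return total / 1_000_000


def estimated_seconds(p: Provider, output_tokens: int) -> float:
    """Rough wall-clock estimate: initial latency plus generation time."""
    return p.latency_ms / 1000 + output_tokens / p.throughput_tps


if __name__ == "__main__":
    # Example workload: 10,000 input tokens and 1,000 output tokens.
    for p in PROVIDERS.values():
        cost = request_cost_usd(p, 10_000, 1_000)
        secs = estimated_seconds(p, 1_000)
        print(f"{p.model_id}: ${cost:.4f}, about {secs:.1f} s")
```

For a request with 10,000 input tokens and 1,000 output tokens, the DeepInfra row works out to 10,000 × $0.15/1M + 1,000 × $0.60/1M = $0.0021. Measured throughput and latency vary with load, so the time estimate is best read as an order-of-magnitude guide rather than a guarantee.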

Meta: Llama 4 Maverick

1,048,576 Token Context

Advanced Coding

Agentic Workflows

Vision Capabilities

