Meta: Llama 3.1 8B Instruct

Llama3

Input: text

Output: text

Released: Jul 23, 2024 • Updated: Mar 28, 2025

Meta's latest class of models (Llama 3.1) launched in a variety of sizes and flavors. This 8B instruction-tuned version is fast and efficient.

It has demonstrated strong performance compared to leading closed-source models in human evaluations.

For more about the model release, see Meta's announcement. Usage of this model is subject to Meta's Acceptable Use Policy.

Process and analyze large documents and conversations.

Improved capabilities in front-end development and full-stack updates.

Autonomously navigate multi-step processes with improved reliability.

Available On

Provider	Model ID	Context	Max Output	Input Cost	Output Cost	Throughput	Latency
Kluster	klusterAi	131K	131K	$0.02/M	$0.03/M	109.6 t/s	453 ms
DeepInfra	deepInfra (turbo)	131K	16K	$0.02/M	$0.03/M	105.4 t/s	348 ms
InferenceNet	inferenceNet	16K	16K	$0.02/M	$0.03/M	44.7 t/s	1409 ms
Novita	novitaAi	16K	131K	$0.02/M	$0.05/M	78.9 t/s	938 ms
Nebius	nebiusAiStudio	131K	-	$0.02/M	$0.06/M	61.1 t/s	365 ms
Lambda	lambda	131K	131K	$0.02/M	$0.04/M	43.0 t/s	391 ms
DeepInfra	deepInfra	131K	16K	$0.03/M	$0.05/M	56.0 t/s	371 ms
Cloudflare	cloudflare	32K	-	$0.04/M	$0.38/M	23.7 t/s	785 ms
Groq	groq	131K	131K	$0.05/M	$0.08/M	1397.0 t/s	272 ms
Hyperbolic	hyperbolic	131K	-	$0.10/M	$0.10/M	269.9 t/s	966 ms
Cerebras	cerebras	32K	32K	$0.10/M	$0.10/M	4664.8 t/s	604 ms
Friendli	friendli	131K	8K	$0.10/M	$0.10/M	297.0 t/s	162 ms
SambaNova	sambaNova	16K	4K	$0.10/M	$0.20/M	894.2 t/s	180 ms
Together	together	131K	-	$0.18/M	$0.18/M	212.1 t/s	232 ms
Fireworks	fireworks	131K	-	$0.20/M	$0.20/M	292.6 t/s	288 ms
Avian	avianIo	131K	-	$0.20/M	$0.20/M	-	-
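
The Throughput and Latency columns pull in different directions depending on how long the response is, so a rough estimate can help when choosing a provider. The sketch below assumes that Latency means time to first token and that Throughput means output tokens per second; the table does not define either column, so treat both as assumptions. The function names `estimated_seconds` and `fastest` are illustrative, and only four rows are copied from the table.

```python
# Rough end-to-end response time per provider.
# Assumption: latency is time to first token, throughput is output tokens/second.

def estimated_seconds(latency_ms: float, throughput_tps: float, output_tokens: int) -> float:
    """Latency plus the time needed to generate `output_tokens` tokens."""
    return latency_ms / 1000.0 + output_tokens / throughput_tps

# (latency in ms, throughput in tokens/s), copied from the table rows above.
providers = {
    "groq": (272, 1397.0),
    "cerebras": (604, 4664.8),
    "friendli": (162, 297.0),
    "klusterAi": (453, 109.6),
}

def fastest(output_tokens: int) -> str:
    """Name of the provider with the lowest estimated time for this output length."""
    return min(
        providers,
        key=lambda name: estimated_seconds(*providers[name], output_tokens),
    )

if __name__ == "__main__":
    for n in (10, 50, 5000):
        print(n, "output tokens ->", fastest(n))
```

Under these assumptions, very short replies favor the lowest-latency provider, while long generations favor the highest-throughput one. Measured figures vary over time, so the estimate is only a starting point.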

Input Tokens: $0.02 per 1M tokens

Output Tokens: $0.03 per 1M tokens
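
Because prices are quoted per one million tokens, the cost of a single request is the token counts scaled by those rates. A minimal sketch follows, using the $0.02 input and $0.03 output prices as defaults; the function name `request_cost_usd` and the token counts in the example are hypothetical.

```python
# Cost of one request from per-million-token prices.
# Defaults are the lowest listed prices: $0.02/M input, $0.03/M output.

def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float = 0.02,
    output_price_per_m: float = 0.03,
) -> float:
    """Return the request cost in US dollars."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000

if __name__ == "__main__":
    # Hypothetical request: a 100,000-token document with a 2,000-token summary.
    cost = request_cost_usd(100_000, 2_000)
    print(f"${cost:.5f}")  # roughly $0.00206
```

Passing another provider's rates, for example `request_cost_usd(100_000, 2_000, 0.05, 0.08)` for the Groq row, gives a like-for-like comparison of about $0.00516.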