Perplexity: Llama 3.1 Sonar 8B Online

Llama3

Input: text

Output: text

Released: Aug 1, 2024 • Updated: Mar 28, 2025

Llama 3.1 Sonar is Perplexity's latest model family. It surpasses their earlier Sonar models in cost-efficiency, speed, and performance.

This is the online version of the offline chat model. It is focused on delivering helpful, up-to-date, and factual responses. #online

Process and analyze large documents and conversations.

Improved capabilities in front-end development and full-stack updates.

Autonomously navigate multi-step processes with improved reliability.

Available On

Provider	Model ID	Context	Max Output	Input Cost	Output Cost	Throughput	Latency
Perplexity	perplexity	127K	-	$0.20/M	$0.20/M	204.2 t/s	1289 ms
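
The throughput and latency figures in the table can be turned into a rough response-time estimate. The sketch below assumes that latency means time to the first token and that throughput stays constant for the rest of the generation; neither assumption is confirmed by the listing, and `estimate_response_seconds` is a hypothetical helper name.

```python
# Figures copied from the provider table; how they were measured is an assumption.
LATENCY_MS = 1289.0          # assumed: time until the first token arrives
THROUGHPUT_TPS = 204.2       # assumed: steady output rate in tokens per second


def estimate_response_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate: fixed latency plus generation time."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return LATENCY_MS / 1000.0 + output_tokens / THROUGHPUT_TPS


if __name__ == "__main__":
    # Example: a 500-token answer.
    print(f"{estimate_response_seconds(500):.2f} s")
```

Measured times will vary with load and prompt size, so treat the result as an order-of-magnitude guide only.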

Input Tokens: $0.20 per 1M tokens

Output Tokens: $0.20 per 1M tokens

Request: $0.005 per request
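
The three rates above can be combined into a per-call cost estimate. This is a minimal sketch that assumes the per-request fee is charged once per API call on top of the token charges; the listing does not spell out how the fees combine, and `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# Rates copied from the listing above (USD).
INPUT_PER_M = Decimal("0.20")     # per 1M input tokens
OUTPUT_PER_M = Decimal("0.20")    # per 1M output tokens
PER_REQUEST = Decimal("0.005")    # flat fee per request (assumed additive)
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, requests: int = 1) -> Decimal:
    """Estimate the USD cost of a number of requests with the given token totals."""
    token_cost = (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / MILLION
    return token_cost + requests * PER_REQUEST


if __name__ == "__main__":
    # One request with 10,000 input tokens and 1,000 output tokens.
    print(estimate_cost(10_000, 1_000))
```

Under these assumptions the flat request fee dominates for short prompts: the example call costs $0.0022 in tokens plus $0.005 for the request.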