Mistral: Ministral 8B

Mistral
Input: text
Output: text
Released: Oct 17, 2024 · Updated: Mar 28, 2025

Ministral 8B is an 8B-parameter model featuring an interleaved sliding-window attention pattern for faster, memory-efficient inference. Designed for edge use cases, it supports up to 128k context length and excels at knowledge and reasoning tasks. It outperforms peers in the sub-10B category, making it well suited for low-latency, privacy-first applications.
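Sliding-window attention lets each token attend only to a fixed-size window of recent tokens instead of the full causal prefix, which is what keeps memory use bounded at long context lengths. Below is a minimal NumPy sketch of the boolean mask such an attention layer applies; the window size is illustrative, not Ministral's actual configuration, and the interleaving of window sizes across layers is not modeled here.

```python
import numpy as np

def sliding_window_causal_mask(seq_len: int, window: int) -> np.ndarray:
    """Return a (seq_len, seq_len) boolean mask.

    Entry [i, j] is True when query position i may attend to key
    position j: causal (j <= i) and within the window (i - j < window).
    """
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (i - j < window)

mask = sliding_window_causal_mask(seq_len=6, window=3)
# Row 5 attends only to positions 3, 4, 5, not the whole prefix.
```

Because each row has at most `window` True entries, attention cost per token is O(window) rather than O(sequence length), which is the memory-efficiency property the description refers to.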

128,000 Token Context

Process and analyze large documents and conversations.

Advanced Coding

Improved capabilities in front-end and full-stack development.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Available On

| Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
|----------|----------|---------|------------|------------|-------------|------------|---------|
| Mistral  | mistral  | 128K    | -          | $0.10/M    | $0.10/M     | 124.5 t/s  | 225 ms  |
Standard Pricing

Input Tokens: $0.10 per 1M tokens ($0.0001 per 1K tokens)

Output Tokens: $0.10 per 1M tokens ($0.0001 per 1K tokens)
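At these rates, the cost of a request is linear in its token counts. A minimal sketch of the arithmetic, with the per-million rates from the table above hard-coded as defaults (the helper name is illustrative):

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_per_m: float = 0.10,
                     output_per_m: float = 0.10) -> float:
    """Cost in USD given per-1M-token rates for input and output."""
    return (input_tokens * input_per_m
            + output_tokens * output_per_m) / 1_000_000

# Example: summarizing a 100k-token document into a 2k-token answer.
cost = request_cost_usd(100_000, 2_000)  # → 0.0102 USD
```

So even a near-full 128k-context request stays in the cent range, which is why the pricing is quoted per million tokens.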