
Mistral: Mistral Small 3

Mistral
Input: text
Output: text
Released: Jan 30, 2025
Updated: Mar 28, 2025

Mistral Small 3 is a 24B-parameter language model optimized for low-latency performance across common AI tasks. Released under the Apache 2.0 license, it features both pre-trained and instruction-tuned versions designed for efficient local deployment.

The model achieves 81% accuracy on the MMLU benchmark and performs competitively with larger models like Llama 3.3 70B and Qwen 32B, while operating at three times the speed on equivalent hardware.

28,000 Token Context

Process and analyze large documents and conversations.

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Available On

| Provider  | Model ID  | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
|-----------|-----------|---------|------------|------------|-------------|------------|---------|
| Enfer     | enfer     | 28K     | 14K        | $0.06/M    | $0.12/M     | 30.2 t/s   | 7192 ms |
| NextBit   | nextBit   | 33K     | -          | $0.07/M    | $0.13/M     | 28.4 t/s   | 2230 ms |
| DeepInfra | deepInfra | 33K     | 16K        | $0.07/M    | $0.14/M     | 78.2 t/s   | 269 ms  |
| Mistral   | mistral   | 33K     | -          | $0.10/M    | $0.30/M     | 150.7 t/s  | 257 ms  |
| Ubicloud  | ubicloud  | 33K     | 33K        | $0.30/M    | $0.30/M     | 32.5 t/s   | 1207 ms |
| Together  | together  | 33K     | 2K         | $0.80/M    | $0.80/M     | 80.3 t/s   | 501 ms  |
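The throughput and latency figures can be combined into a rough end-to-end estimate. As a sketch (assuming latency is time to first token and throughput holds steady for the rest of the response; the function name and values are illustrative, taken from the table):

```python
# Rough end-to-end response time: time to first token (latency)
# plus generation time for the remaining tokens at the quoted throughput.
def estimated_response_ms(latency_ms: float, throughput_tps: float, output_tokens: int) -> float:
    return latency_ms + (output_tokens / throughput_tps) * 1000.0

# Example: a 500-token reply using the DeepInfra row (269 ms, 78.2 t/s)
# versus the Mistral row (257 ms, 150.7 t/s).
deepinfra_ms = estimated_response_ms(269, 78.2, 500)
mistral_ms = estimated_response_ms(257, 150.7, 500)
print(f"DeepInfra: {deepinfra_ms:.0f} ms, Mistral: {mistral_ms:.0f} ms")
```

Under this simple model, a provider with low latency but modest throughput can still lose to a higher-throughput provider on longer completions.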
Standard Pricing

Input Tokens
$0.06 per 1M tokens ($0.00000006 per token)

Output Tokens
$0.12 per 1M tokens ($0.00000012 per token)
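The per-million-token rates translate to request cost with simple arithmetic. A minimal sketch, assuming the standard rates above (the function and example token counts are illustrative):

```python
# Standard per-million-token rates from the pricing section above.
INPUT_COST_PER_M = 0.06   # $ per 1M input tokens
OUTPUT_COST_PER_M = 0.12  # $ per 1M output tokens

def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars of one request at the standard rates."""
    return (input_tokens * INPUT_COST_PER_M + output_tokens * OUTPUT_COST_PER_M) / 1_000_000

# Example: a 10,000-token prompt with a 1,000-token completion.
print(f"${request_cost_usd(10_000, 1_000):.5f}")
```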
