Mistral: Mistral Small 3
Input: text
Output: text
Released: Jan 30, 2025 • Updated: Mar 28, 2025
Mistral Small 3 is a 24B-parameter language model optimized for low-latency performance across common AI tasks. Released under the Apache 2.0 license, it features both pre-trained and instruction-tuned versions designed for efficient local deployment.
The model achieves 81% accuracy on the MMLU benchmark and performs competitively with larger models such as Llama 3.3 70B and Qwen 32B, while running roughly three times faster on the same hardware. See Mistral's blog post about the model for details.
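As a rough illustration, the instruction-tuned variant can be queried through any OpenAI-compatible chat completions endpoint. The base URL, API key variable, and model ID below are placeholders for whatever your chosen provider documents, not values taken from this page.

```python
# Minimal sketch: querying Mistral Small 3 via an OpenAI-compatible endpoint.
# The base_url, env var, and model ID are placeholders; substitute your
# provider's documented values.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://example-provider.com/v1",   # hypothetical endpoint
    api_key=os.environ["PROVIDER_API_KEY"],       # hypothetical env var
)

response = client.chat.completions.create(
    model="mistral-small-3",                      # placeholder model ID
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize the Apache 2.0 license in two sentences."},
    ],
    max_tokens=256,
)

print(response.choices[0].message.content)
```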
28,000 Token Context
Process and analyze large documents and conversations.
Advanced Coding
Improved capabilities in front-end development and full-stack updates.
Agentic Workflows
Autonomously navigate multi-step processes with improved reliability.
Available On
| Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
|---|---|---|---|---|---|---|---|
| Enfer | enfer | 28K | 14K | $0.06/M | $0.12/M | 30.2 t/s | 7,192 ms |
| NextBit | nextBit | 33K | - | $0.07/M | $0.13/M | 28.4 t/s | 2,230 ms |
| DeepInfra | deepInfra | 33K | 16K | $0.07/M | $0.14/M | 78.2 t/s | 269 ms |
| Mistral | mistral | 33K | - | $0.10/M | $0.30/M | 150.7 t/s | 257 ms |
| Ubicloud | ubicloud | 33K | 33K | $0.30/M | $0.30/M | 32.5 t/s | 1,207 ms |
| Together | together | 33K | 2K | $0.80/M | $0.80/M | 80.3 t/s | 501 ms |
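As a rough sketch of how these rates compare in practice, the snippet below estimates the cost of a single request at each provider's listed prices (USD per 1M tokens) and keeps only those under a latency budget. The workload numbers are arbitrary, and the throughput/latency figures above are point-in-time measurements that will vary.

```python
# Rough sketch: estimate per-request cost from the listed rates and pick the
# cheapest provider that meets a latency budget. Rates are USD per 1M tokens
# as shown in the table above.
providers = [
    # (name, input $/M, output $/M, latency ms)
    ("Enfer",     0.06, 0.12, 7192),
    ("NextBit",   0.07, 0.13, 2230),
    ("DeepInfra", 0.07, 0.14, 269),
    ("Mistral",   0.10, 0.30, 257),
    ("Ubicloud",  0.30, 0.30, 1207),
    ("Together",  0.80, 0.80, 501),
]

def request_cost(in_rate: float, out_rate: float,
                 input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request, with rates given per 1M tokens."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example workload: 4,000 prompt tokens, 800 completion tokens, latency < 1 s.
candidates = [
    (name, request_cost(i, o, 4_000, 800))
    for name, i, o, lat in providers
    if lat < 1000
]
for name, cost in sorted(candidates, key=lambda pair: pair[1]):
    print(f"{name}: ${cost:.6f} per request")
```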
Standard Pricing
Input Tokens
$0.06 per 1M tokens ($0.00000006 per token)
Output Tokens
$0.12 per 1M tokens ($0.00000012 per token)
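To keep the units straight, here is a quick check that the per-token figures above are simply the per-million rates divided by 1,000,000, plus a monthly estimate for a purely illustrative request volume.

```python
# Sanity-check the unit conversion ($/1M tokens -> $/token) and estimate
# spend for an arbitrary illustrative workload at the standard rates above.
INPUT_PER_MILLION = 0.06    # USD per 1M input tokens
OUTPUT_PER_MILLION = 0.12   # USD per 1M output tokens

per_input_token = INPUT_PER_MILLION / 1_000_000    # 0.00000006 USD
per_output_token = OUTPUT_PER_MILLION / 1_000_000  # 0.00000012 USD

# Hypothetical month: 10,000 requests, each 2,000 input + 500 output tokens.
requests = 10_000
monthly = requests * (2_000 * per_input_token + 500 * per_output_token)

print(f"{per_input_token:.8f} $/input token, {per_output_token:.8f} $/output token")
print(f"Estimated monthly spend: ${monthly:.2f}")
```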