Mistral: Mistral Nemo
Input: text
Output: text
Released: Jul 19, 2024 • Updated: Mar 28, 2025
A 12B parameter model with a 128k token context length built by Mistral in collaboration with NVIDIA.
The model is multilingual, supporting English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi.
It supports function calling and is released under the Apache 2.0 license.
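Since most of the providers listed below expose the model through an OpenAI-compatible chat-completions endpoint, a tool-calling request looks roughly like the sketch below. The base URL, API key, model ID, and the get_weather tool are placeholders for illustration, not values taken from this page; check your provider's documentation for the exact model identifier.

```python
from openai import OpenAI

# Placeholder endpoint and key; any OpenAI-compatible provider serving
# Mistral Nemo is addressed the same way.
client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

# Hypothetical tool definition used only to demonstrate function calling.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="mistralai/mistral-nemo",  # model ID varies by provider
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# If the model chooses to call the tool, the arguments arrive as a JSON string.
tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    print(tool_calls[0].function.name, tool_calls[0].function.arguments)
```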
- 131,072 Token Context: Process and analyze large documents and conversations.
- Advanced Coding: Improved capabilities in front-end and full-stack development.
- Agentic Workflows: Autonomously navigate multi-step processes with improved reliability.
Available On
| Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
|---|---|---|---|---|---|---|---|
| kluster.ai | klusterAi | 131K | 131K | $0.02/M | $0.07/M | 107.3 t/s | 738 ms |
| Enfer | enfer | 131K | 66K | $0.03/M | $0.07/M | 58.2 t/s | 6404 ms |
| NextBit | nextBit | 128K | - | $0.03/M | $0.07/M | 42.8 t/s | 1556 ms |
| DeepInfra | deepInfra | 131K | 16K | $0.04/M | $0.08/M | 54.9 t/s | 251 ms |
| inference.net | inferenceNet | 16K | 16K | $0.04/M | $0.10/M | 59.6 t/s | 992 ms |
| Nebius AI Studio | nebiusAiStudio | 128K | - | $0.04/M | $0.12/M | 33.1 t/s | 603 ms |
| NovitaAI | novitaAi | 131K | - | $0.04/M | $0.17/M | 42.1 t/s | 1173 ms |
| Atoma | atoma | 128K | 80K | $0.10/M | $0.10/M | 60.8 t/s | 646 ms |
| Parasail | parasail | 131K | 131K | $0.11/M | $0.11/M | 90.9 t/s | 747 ms |
| Mistral | mistral | 131K | - | $0.15/M | $0.15/M | 143.8 t/s | 219 ms |
| Azure | azure | 128K | - | $0.30/M | $0.30/M | 98.6 t/s | 1146 ms |
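To get a rough sense of what the per-million-token prices above mean per request, the sketch below multiplies example token counts by the DeepInfra row's prices ($0.04/M input, $0.08/M output); the token counts are invented for illustration, and any other row can be substituted.

```python
# Rough per-request cost estimate from per-million-token prices.
input_price_per_m = 0.04    # $ per 1M input tokens (DeepInfra row)
output_price_per_m = 0.08   # $ per 1M output tokens (DeepInfra row)

prompt_tokens = 50_000      # e.g. a large document plus instructions
completion_tokens = 2_000   # generated answer

cost = (prompt_tokens * input_price_per_m
        + completion_tokens * output_price_per_m) / 1_000_000
print(f"${cost:.6f}")       # ≈ $0.002160 for this example
```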
Standard Pricing
- Input Tokens: $0.000000025 per token (about $0.025 per 1M tokens)
- Output Tokens: $0.00000007 per token ($0.07 per 1M tokens)