Models.do

Mercury Coder Small is the first diffusion large language model (dLLM). Applying a breakthrough discrete diffusion approach, the model runs 5-10x faster than even speed-optimized models such as Claude 3.5 Haiku and GPT-4o Mini while matching their performance. This speed lets developers stay in the flow while coding, with rapid chat-based iteration and responsive code completion suggestions. On Copilot Arena, Mercury Coder ranks 1st in speed and ties for 2nd in quality.

Provider	Model ID	Context	Max Output	Input Cost	Output Cost	Throughput	Latency
Inception	inception	32K	-	$0.25/M	$1.00/M	1337.3 t/s	474 ms
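As a rough illustration of what the listed figures mean for a single request, the sketch below estimates cost and response time. It assumes that "/M" means per million tokens, that throughput refers to output tokens per second, and that latency is the delay before the first token arrives; the listing does not state these definitions, and the function names are hypothetical.

```python
# Figures copied from the listing; how they are interpreted is an assumption.
INPUT_COST_PER_M = 0.25    # USD per million input tokens
OUTPUT_COST_PER_M = 1.00   # USD per million output tokens
THROUGHPUT_TPS = 1337.3    # tokens per second (assumed: output tokens)
LATENCY_S = 0.474          # seconds (assumed: time to first token)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the price of one request from the per-million-token rates."""
    total = input_tokens * INPUT_COST_PER_M + output_tokens * OUTPUT_COST_PER_M
    return total / 1_000_000


def estimate_seconds(output_tokens: int) -> float:
    """Estimate wall-clock time: initial latency plus generation time."""
    return LATENCY_S + output_tokens / THROUGHPUT_TPS


if __name__ == "__main__":
    # Hypothetical request: 8,000 prompt tokens, 1,000 generated tokens.
    print(f"cost: ${estimate_cost_usd(8_000, 1_000):.4f}")  # cost: $0.0030
    print(f"time: {estimate_seconds(1_000):.2f} s")         # time: 1.22 s
```

Measured throughput and latency vary with load and prompt size, so these numbers are best treated as order-of-magnitude estimates.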

Inception: Mercury Coder Small Beta

32,000 Token Context

Advanced Coding

Agentic Workflows
