Models.do

Maestro Reasoning is Arcee's flagship analysis model: a 32 B‑parameter derivative of Qwen 2.5‑32 B tuned with DPO and chain‑of‑thought RL for step‑by‑step logic. Compared with the earlier 7 B preview, the production 32 B release widens the context window to 128 k tokens (131,072) and doubles the pass rate on MATH and GSM‑8K, while also improving code‑completion accuracy. Its instruction style encourages structured "thought → answer" traces that can be parsed or hidden according to user preference. That transparency suits audit‑focused industries such as finance and healthcare, where seeing the reasoning path matters. In Arcee Conductor, Maestro is selected automatically for complex, multi‑constraint queries that smaller models fail to handle.
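As a rough illustration of parsing or hiding a "thought → answer" trace, the sketch below splits a completion into its reasoning and its final answer. The page does not say how Maestro delimits the two parts, so the `<think>...</think>` tags are an assumption, and `split_trace` and `render` are hypothetical helper names rather than part of any Arcee API.

```python
import re

# ASSUMPTION: reasoning is wrapped in <think>...</think>. The source does not
# specify the delimiter, so adjust the pattern to the format actually returned.
THOUGHT_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_trace(text: str) -> tuple[str, str]:
    """Return (thought, answer) extracted from a raw completion string."""
    thoughts = [m.strip() for m in THOUGHT_PATTERN.findall(text)]
    # Everything outside the reasoning tags is treated as the final answer.
    answer = THOUGHT_PATTERN.sub("", text).strip()
    return "\n".join(thoughts), answer


def render(text: str, show_reasoning: bool = False) -> str:
    """Show only the answer by default; include the reasoning on request."""
    thought, answer = split_trace(text)
    if show_reasoning and thought:
        return f"Reasoning:\n{thought}\n\nAnswer:\n{answer}"
    return answer


sample = "<think>17 * 3 = 51; 51 + 4 = 55</think>\nThe result is 55."
print(render(sample))                       # answer only
print(render(sample, show_reasoning=True))  # reasoning followed by answer
```

Keeping the split separate from the rendering means an audit log can store the full trace while the user interface shows only the answer.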

Provider	Model ID	Context	Max Output	Input Cost	Output Cost	Throughput	Latency
Together	together	131K	32K	$0.90/M	$3.30/M	115.4 t/s	390 ms
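The listed prices and speeds can be turned into a per-request estimate. The sketch below uses the figures from the table and assumes that "Latency" means time to first token and that "Throughput" is the output generation rate. The page does not define either term, so treat the timing as approximate. The function name `estimate_request` is hypothetical.

```python
# Figures taken from the Together row of the table above.
INPUT_USD_PER_M = 0.90    # USD per million input tokens
OUTPUT_USD_PER_M = 3.30   # USD per million output tokens
THROUGHPUT_TPS = 115.4    # tokens per second (ASSUMED: output generation rate)
LATENCY_S = 0.390         # seconds (ASSUMED: time to first token)


def estimate_request(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (cost in USD, approximate wall-clock seconds) for one request."""
    cost = (input_tokens / 1_000_000) * INPUT_USD_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_USD_PER_M
    seconds = LATENCY_S + output_tokens / THROUGHPUT_TPS
    return cost, seconds


cost, seconds = estimate_request(input_tokens=10_000, output_tokens=2_000)
print(f"cost ≈ ${cost:.4f}, time ≈ {seconds:.1f} s")
```

For a 10,000-token prompt and a 2,000-token reply, this works out to about $0.0156 and roughly 17.7 seconds. Generation time dominates, which is typical for long reasoning traces.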


Arcee AI: Maestro Reasoning

131,072 Token Context

Advanced Coding

Agentic Workflows

