Models.do

Virtuoso‑Medium‑v2 is a 32B-parameter model distilled from DeepSeek‑v3 logits and merged back onto a Qwen 2.5 backbone, yielding a sharper, more factual successor to the original Virtuoso Medium. The team harvested ~1.1B logit tokens and applied "fusion‑merging" plus DPO alignment, which pushed its scores past Arcee‑Nova 2024 and many 40B‑plus peers on MMLU‑Pro, MATH and HumanEval. With a 128K-token context window and quantization options ranging from BF16 down to 4‑bit GGUF, it balances capability with deployability on single‑GPU nodes. Typical use cases include enterprise chat assistants, technical writing aids and medium‑complexity code drafting where Virtuoso‑Large would be overkill.
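To make the single-GPU claim concrete, here is a rough back-of-the-envelope sketch of the weight-only memory footprint at different precisions. It assumes exactly 32 billion parameters and a flat bits-per-weight figure; the function name is illustrative, and the estimate ignores KV cache, activations and the per-block scale metadata that real 4-bit GGUF files carry, so actual requirements will be somewhat higher.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight-only footprint in GB (1 GB = 10^9 bytes).

    Assumption: every parameter is stored at the same bit width,
    with no overhead for quantization scales or runtime buffers.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Precisions named in the description: BF16 down to 4-bit.
    for label, bits in [("BF16", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{label:>5}: ~{weight_memory_gb(32, bits):.0f} GB of weights")
```

Under these assumptions the weights alone take roughly 64 GB at BF16 and roughly 16 GB at 4 bits, which is why the lower-precision variants are the ones that fit comfortably on a single GPU.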

Provider	Model ID	Context	Max Output	Input Cost	Output Cost	Throughput	Latency
Together	together	131K	33K	$0.50/M	$0.80/M	59.8 t/s	463 ms
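The listed prices and speeds can be turned into a per-request estimate. The sketch below is illustrative only: it assumes the "Latency" column is time to first token, that "Throughput" is output tokens per second, and that costs scale linearly per million tokens. The function and constant names are hypothetical and not part of any provider API.

```python
# Figures copied from the provider table above.
INPUT_USD_PER_M = 0.50    # USD per 1M input tokens
OUTPUT_USD_PER_M = 0.80   # USD per 1M output tokens
THROUGHPUT_TPS = 59.8     # assumed: output tokens per second
LATENCY_S = 0.463         # assumed: time to first token, in seconds


def estimate_request(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (cost in USD, wall-clock seconds) for one request.

    Assumes linear pricing and a constant generation rate.
    """
    cost = (input_tokens / 1e6) * INPUT_USD_PER_M \
         + (output_tokens / 1e6) * OUTPUT_USD_PER_M
    seconds = LATENCY_S + output_tokens / THROUGHPUT_TPS
    return cost, seconds


if __name__ == "__main__":
    cost, seconds = estimate_request(input_tokens=10_000, output_tokens=1_000)
    print(f"cost ~ ${cost:.4f}, time ~ {seconds:.1f} s")
```

For a 10,000-token prompt with a 1,000-token reply, this works out to a little under one cent and roughly 17 seconds, with the generation time dominated by output length rather than the initial latency.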

Arcee AI: Virtuoso Medium V2

131,072 Token Context

Advanced Coding

Agentic Workflows
