Qwen: QwQ 32B (free)

Qwen

Input: text

Output: text

Released: Mar 5, 2025 • Updated: May 2, 2025

QwQ is the reasoning model of the Qwen series. Unlike conventional instruction-tuned models, QwQ can think and reason before answering, which gives it significantly better performance on downstream tasks, especially hard problems. QwQ-32B is the medium-sized reasoning model in the series and performs competitively against state-of-the-art reasoning models such as DeepSeek-R1 and o1-mini.

40,000 Token Context

Process and analyze large documents and conversations.

Hybrid Reasoning

Choose between rapid responses and extended, step-by-step processing for complex tasks.

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Available On

Provider	Model ID	Context	Max Output	Input Cost	Output Cost	Throughput	Latency
DeepInfra	deepInfra	131K	-	$0.15/M	$0.20/M	48.7 t/s	540 ms
Nebius	nebiusAiStudio	131K	-	$0.15/M	$0.45/M	38.6 t/s	609 ms
InferenceNet	inferenceNet	16K	16K	$0.20/M	$0.20/M	-	-
Groq	groq	131K	131K	$0.29/M	$0.39/M	548.0 t/s	479 ms
Hyperbolic	hyperbolic	131K	-	$0.40/M	$0.40/M	40.1 t/s	1220 ms
SambaNova	sambaNova	16K	4K	$0.50/M	$1.00/M	205.0 t/s	712 ms
Nebius	nebiusAiStudio (fast)	131K	-	$0.50/M	$1.50/M	79.7 t/s	506 ms
Cent-ML	centMl	41K	41K	$0.65/M	$0.65/M	79.7 t/s	482 ms
Fireworks	fireworks	131K	-	$0.90/M	$0.90/M	183.1 t/s	498 ms
Together	together	131K	33K	$1.20/M	$1.20/M	81.2 t/s	561 ms
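
To compare providers for a given workload, the per-million-token prices in the table can be turned into a per-request cost. The following is a minimal Python sketch using three rows copied from the table. The helper names `request_cost` and `estimated_seconds` are hypothetical, and the timing estimate assumes that the Latency column is the delay before the first token and that Throughput stays constant during generation, neither of which the listing confirms.

```python
# Rows copied from the provider table:
# (input $/M tokens, output $/M tokens, throughput in tokens/s, latency in ms)
PROVIDERS = {
    "deepInfra": (0.15, 0.20, 48.7, 540),
    "groq": (0.29, 0.39, 548.0, 479),
    "together": (1.20, 1.20, 81.2, 561),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the listed per-million-token rates."""
    input_price, output_price, _, _ = PROVIDERS[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def estimated_seconds(provider: str, output_tokens: int) -> float:
    """Rough wall-clock estimate: latency plus generation time.

    Assumption: latency is time to first token and throughput is constant.
    """
    _, _, tokens_per_second, latency_ms = PROVIDERS[provider]
    return latency_ms / 1000 + output_tokens / tokens_per_second


if __name__ == "__main__":
    # Example workload: 10,000 input tokens and 2,000 output tokens.
    for name in PROVIDERS:
        cost = request_cost(name, 10_000, 2_000)
        seconds = estimated_seconds(name, 2_000)
        print(f"{name}: ${cost:.4f}, about {seconds:.1f} s")
```

For reasoning models such as QwQ, the thinking tokens are typically billed as output, so the output price and throughput tend to dominate both figures.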

