Back

Qwen: Qwen3 32B

Qwen3
Input: text
Output: text
Released: Apr 28, 2025Updated: May 11, 2025

Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for tasks like math, coding, and logical inference, and a "non-thinking" mode for faster, general-purpose conversation. The model demonstrates strong performance in instruction-following, agent tool use, creative writing, and multilingual tasks across 100+ languages and dialects. It natively handles 32K token contexts and can extend to 131K tokens using YaRN-based scaling.

40,960 Token Context

Process and analyze large documents and conversations.

Hybrid Reasoning

Choose between rapid responses and extended, step-by-step processing for complex tasks.

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Available On

ProviderModel IDContextMax OutputInput CostOutput CostThroughputLatency
DeepInfradeepInfra41K-$0.10/M$0.30/M43.2 t/s1041 ms
NebiusnebiusAiStudio41K-$0.10/M$0.30/M42.8 t/s765 ms
Lambdalambda41K41K$0.10/M$0.30/M39.2 t/s541 ms
NovitanovitaAi41K20K$0.10/M$0.45/M27.1 t/s1063 ms
Parasailparasail41K41K$0.10/M$0.50/M41.2 t/s737 ms
GMICloudgmiCloud33K-$0.10/M$0.60/M50.8 t/s1154 ms
NebiusnebiusAiStudio41K-$0.20/M$0.60/M141.5 t/s573 ms
Cerebrascerebras33K-$0.40/M$0.80/M2214.0 t/s455 ms
SambaNovasambaNova33K4K$0.40/M$0.80/M317.0 t/s682 ms
Standard Pricing
Input Tokens
$0.0000001

per 1K tokens

Output Tokens
$0.0000003

per 1K tokens

Do Work. With AI.