
Qwen: Qwen3 32B

Qwen3
Input: text
Output: text
Released: Apr 28, 2025 · Updated: May 11, 2025

Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for tasks like math, coding, and logical inference, and a "non-thinking" mode for faster, general-purpose conversation. The model demonstrates strong performance in instruction-following, agent tool use, creative writing, and multilingual tasks across 100+ languages and dialects. It natively handles 32K token contexts and can extend to 131K tokens using YaRN-based scaling.
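The 131K extension mentioned above is a configuration change rather than a different checkpoint. A minimal sketch, assuming the `rope_scaling` convention documented for Qwen3 checkpoints in Hugging Face `config.json` format (the scaling factor shown corresponds to 4× the 32,768-token native window):

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768
  }
}
```

Because YaRN scaling can slightly degrade quality on short inputs, it is typically enabled only when requests are expected to exceed the native context window.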

40,960 Token Context

Process and analyze large documents and conversations.

Hybrid Reasoning

Choose between rapid responses and extended, step-by-step processing for complex tasks.
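On most hosted Qwen3 deployments this mode switch is exposed as a request-time flag passed through to the chat template. The payload sketch below follows the vLLM convention of a `chat_template_kwargs` field carrying `enable_thinking`; the exact field name and model ID are assumptions that vary by provider, so this is an illustration rather than a specific provider's API:

```python
import json


def build_request(messages: list[dict], thinking: bool) -> str:
    """Build an OpenAI-style chat-completions payload for a Qwen3 endpoint.

    `chat_template_kwargs.enable_thinking` follows the vLLM convention for
    toggling Qwen3's thinking mode; other providers may expose the switch
    under a different field.
    """
    payload = {
        "model": "qwen/qwen3-32b",
        "messages": messages,
        "chat_template_kwargs": {"enable_thinking": thinking},
    }
    return json.dumps(payload)


# Fast, non-thinking reply for casual conversation:
body = build_request([{"role": "user", "content": "Hi!"}], thinking=False)
```

For math or coding tasks, the same request with `thinking=True` lets the model emit an extended reasoning trace before its final answer, trading latency for accuracy.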

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Available On

| Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
|---|---|---|---|---|---|---|---|
| DeepInfra | deepInfra | 41K | - | $0.10/M | $0.30/M | 45.1 t/s | 869 ms |
| Nebius AI Studio | nebiusAiStudio | 41K | - | $0.10/M | $0.30/M | 40.0 t/s | 752 ms |
| NovitaAI | novitaAi | 41K | 41K | $0.10/M | $0.45/M | 26.3 t/s | 1674 ms |
| Parasail | parasail | 41K | 41K | $0.10/M | $0.50/M | 48.2 t/s | 929 ms |
| GMICloud | gmiCloud | 33K | - | $0.30/M | $0.60/M | 55.1 t/s | 918 ms |
| Cerebras | cerebras | 33K | - | $0.40/M | $0.80/M | 2769.4 t/s | 589 ms |
| SambaNova | sambaNova | 8K | 4K | $0.40/M | $0.80/M | 321.1 t/s | 1112 ms |
Standard Pricing
Input Tokens
$0.0001 per 1K tokens ($0.10 per 1M)

Output Tokens
$0.0003 per 1K tokens ($0.30 per 1M)
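At the standard rate card ($0.10 per 1M input tokens, $0.30 per 1M output tokens), per-request cost is simple arithmetic; a short sketch:

```python
# Standard pricing for Qwen3-32B on this listing:
INPUT_PER_M = 0.10   # USD per 1M input tokens
OUTPUT_PER_M = 0.30  # USD per 1M output tokens


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the standard rate card."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000


# Example: an 8,000-token prompt with a 1,000-token completion
# costs $0.0008 + $0.0003, roughly $0.0011.
cost = request_cost(8_000, 1_000)
```

Note that individual providers in the table above may price above or below this standard rate.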
