Qwen: Qwen3 235B A22B
Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forward pass. It supports seamless switching between a "thinking" mode for complex reasoning, math, and code tasks, and a "non-thinking" mode for general conversational efficiency. The model demonstrates strong reasoning ability, multilingual support (100+ languages and dialects), advanced instruction-following, and agent tool-calling capabilities. It natively handles a 32K token context window and extends up to 131K tokens using YaRN-based scaling.
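The relationship between the native and extended context sizes can be checked with a little arithmetic. The sketch below assumes a YaRN scaling factor of 4.0 and a native window of exactly 32,768 tokens; neither value is stated on this page. The `rope_scaling` dictionary mirrors the key names commonly used in Hugging Face style model configs and is illustrative only, so check it against the serving framework you use.

```python
# Assumption: "32K" on this page means exactly 32,768 tokens.
NATIVE_CONTEXT = 32_768
# Assumption: a YaRN factor of 4.0, chosen because it yields the quoted ~131K limit.
YARN_FACTOR = 4.0


def extended_context(native_tokens: int, factor: float) -> int:
    """Return the context length reachable after scaling the native window."""
    return int(native_tokens * factor)


# Illustrative config fragment; key names are an assumption, not taken from this page.
rope_scaling = {
    "rope_type": "yarn",
    "factor": YARN_FACTOR,
    "original_max_position_embeddings": NATIVE_CONTEXT,
}

extended = extended_context(NATIVE_CONTEXT, YARN_FACTOR)
print(extended)  # 131072, i.e. the "131K" figure
```

Most providers listed on this page expose a shorter window than this maximum, so the effective limit depends on where the model is hosted.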
40,960 Token Context
Process and analyze large documents and conversations.
Hybrid Reasoning
Choose between rapid responses and extended, step-by-step processing for complex tasks.
Advanced Coding
Improved capabilities in front-end development and full-stack updates.
Agentic Workflows
Autonomously navigate multi-step processes with improved reliability.
Available On
Provider | Model ID | Context (tokens) | Max Output (tokens) | Input Cost (USD per 1M tokens) | Output Cost (USD per 1M tokens) | Throughput (tokens/s) | Latency |
---|---|---|---|---|---|---|---|
DeepInfra | deepInfra | 41K | 41K | $0.14/M | $0.60/M | 25.2 t/s | 964 ms |
kluster.ai | klusterAi | 41K | 41K | $0.14/M | $2.00/M | 25.1 t/s | 989 ms |
Parasail | parasail | 41K | 41K | $0.18/M | $0.85/M | 48.0 t/s | 581 ms |
Together | together | 41K | - | $0.20/M | $0.60/M | 31.6 t/s | 932 ms |
Nebius AI Studio | nebiusAiStudio | 41K | - | $0.20/M | $0.60/M | 25.2 t/s | 598 ms |
NovitaAI | novitaAi | 41K | 41K | $0.20/M | $0.80/M | 22.8 t/s | 983 ms |
Fireworks | fireworks | 128K | - | $0.22/M | $0.88/M | 57.5 t/s | 996 ms |
GMICloud | gmiCloud | 33K | - | $0.25/M | $1.09/M | 44.9 t/s | 3103 ms |
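To compare providers for a specific workload, the per-token prices can be turned into a per-request estimate. The following is a minimal sketch that reads the cost columns as USD per one million tokens (the "/M" suffix). The function names and the example request size of 10,000 input and 2,000 output tokens are hypothetical, chosen only for illustration.

```python
# Prices copied from the table: provider -> (input, output) in USD per 1M tokens.
PRICES_PER_MILLION = {
    "DeepInfra": (0.14, 0.60),
    "kluster.ai": (0.14, 2.00),
    "Parasail": (0.18, 0.85),
    "Together": (0.20, 0.60),
    "Nebius AI Studio": (0.20, 0.60),
    "NovitaAI": (0.20, 0.80),
    "Fireworks": (0.22, 0.88),
    "GMICloud": (0.25, 1.09),
}


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request for the given provider."""
    input_price, output_price = PRICES_PER_MILLION[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest_provider(input_tokens: int, output_tokens: int) -> str:
    """Return the provider with the lowest estimated cost for this request shape."""
    return min(
        PRICES_PER_MILLION,
        key=lambda name: estimate_cost(name, input_tokens, output_tokens),
    )


# Hypothetical request: 10,000 prompt tokens and 2,000 generated tokens.
for name in PRICES_PER_MILLION:
    print(f"{name}: ${estimate_cost(name, 10_000, 2_000):.5f}")
print("Cheapest:", cheapest_provider(10_000, 2_000))
```

Price is only one axis. Throughput, latency, and the context limit each provider exposes also differ in the table, so the cheapest option may not be the best fit for long or latency-sensitive requests.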