Qwen: Qwen3 30B A3B
Qwen3, the latest generation in the Qwen large language model series, features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, multilingual support, and advanced agent tasks. Its unique ability to switch seamlessly between a thinking mode for complex reasoning and a non-thinking mode for efficient dialogue ensures versatile, high-quality performance.
Significantly outperforming prior models like QwQ and Qwen2.5, Qwen3 delivers superior mathematics, coding, commonsense reasoning, creative writing, and interactive dialogue capabilities. The Qwen3-30B-A3B variant includes 30.5 billion parameters (3.3 billion activated), 48 layers, 128 experts (8 activated per task), and supports up to 131K token contexts with YaRN, setting a new standard among open-source models.
40,960 Token Context
Process and analyze large documents and conversations.
Hybrid Reasoning
Choose between rapid responses and extended, step-by-step processing for complex tasks.
Advanced Coding
Improved capabilities in front-end development and full-stack updates.
Agentic Workflows
Autonomously navigate multi-step processes with improved reliability.
Available On
Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
---|---|---|---|---|---|---|---|
DeepInfra | deepInfra | 41K | 41K | $0.08/M | $0.29/M | 114.5 t/s | 680 ms |
InferenceNet | inferenceNet | 16K | 16K | $0.08/M | $0.29/M | 16.3 t/s | 844 ms |
Parasail | parasail | 41K | 41K | $0.09/M | $0.50/M | 103.0 t/s | 400 ms |
Nebius | nebiusAiStudio | 41K | - | $0.10/M | $0.30/M | 68.4 t/s | 353 ms |
Novita | novitaAi | 41K | 20K | $0.10/M | $0.45/M | 65.6 t/s | 638 ms |
Fireworks | fireworks | 40K | - | $0.15/M | $0.60/M | 144.6 t/s | 801 ms |
per 1K tokens
per 1K tokens