Do Services-as-Software

Qwen3-14B is a dense 14.8B-parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for tasks like math, programming, and logical inference, and a "non-thinking" mode for general-purpose conversation. The model is fine-tuned for instruction following, agent tool use, creative writing, and multilingual tasks across 100+ languages and dialects. It natively handles 32K-token contexts and can be extended to 131K tokens with YaRN-based scaling.
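The relationship between the native and extended context windows can be sketched as a simple scaling-factor calculation. This is a minimal illustration, not a definitive configuration: it assumes "32K" means 32,768 tokens and "131K" means 131,072 tokens, it rounds the factor up to a whole number as a design choice, and the dictionary keys (`rope_type`, `factor`, `original_max_position_embeddings`) follow the convention commonly used in Hugging Face style model configs, which your serving stack may or may not share. The function name `yarn_rope_scaling` is hypothetical.

```python
import math
from typing import Optional

# Assumed exact values behind the rounded "32K" and "131K" in the description.
NATIVE_CONTEXT = 32_768
EXTENDED_CONTEXT = 131_072


def yarn_rope_scaling(target_tokens: int,
                      native_tokens: int = NATIVE_CONTEXT) -> Optional[dict]:
    """Return a rope-scaling style dict, or None if no scaling is needed.

    The key names are an assumption modeled on common config conventions.
    """
    if target_tokens <= 0 or native_tokens <= 0:
        raise ValueError("token counts must be positive")
    if target_tokens <= native_tokens:
        # The native window already covers the request.
        return None
    # Round up so the scaled window is never smaller than the target.
    factor = float(math.ceil(target_tokens / native_tokens))
    return {
        "rope_type": "yarn",
        "factor": factor,
        "original_max_position_embeddings": native_tokens,
    }


if __name__ == "__main__":
    print(yarn_rope_scaling(EXTENDED_CONTEXT))
    print(yarn_rope_scaling(8_000))
```

Under these assumptions, the full 131K window corresponds to a factor of 4 over the native window, and requests that fit in 32K tokens need no scaling at all.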

Provider	Model ID	Context	Max Output	Input Cost	Output Cost	Throughput	Latency
DeepInfra	deepInfra	41K	41K	$0.06/M	$0.24/M	74.8 t/s	1182 ms
Parasail	parasail	41K	41K	$0.06/M	$0.25/M	67.2 t/s	2101 ms
Nebius	nebiusAiStudio	41K	-	$0.08/M	$0.24/M	96.0 t/s	355 ms
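To compare the providers in the table for a concrete workload, the per-million-token prices can be turned into a per-request cost, and the throughput and latency figures into a rough completion time. The sketch below copies the numbers from the table; the `Provider` class and both helper functions are hypothetical names introduced for illustration, and it assumes the latency column is time to first token and that throughput stays constant over the whole response.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    input_usd_per_m: float   # USD per 1M input tokens
    output_usd_per_m: float  # USD per 1M output tokens
    throughput_tps: float    # output tokens per second
    latency_ms: float        # assumed: time to first token


# Values copied from the provider table above.
PROVIDERS = [
    Provider("DeepInfra", 0.06, 0.24, 74.8, 1182),
    Provider("Parasail", 0.06, 0.25, 67.2, 2101),
    Provider("Nebius", 0.08, 0.24, 96.0, 355),
]


def request_cost_usd(p: Provider, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the listed per-million-token prices."""
    return (input_tokens * p.input_usd_per_m
            + output_tokens * p.output_usd_per_m) / 1_000_000


def estimated_seconds(p: Provider, output_tokens: int) -> float:
    """Rough wall-clock estimate: latency plus generation time."""
    return p.latency_ms / 1000 + output_tokens / p.throughput_tps


if __name__ == "__main__":
    input_tokens, output_tokens = 10_000, 2_000
    for p in PROVIDERS:
        cost = request_cost_usd(p, input_tokens, output_tokens)
        secs = estimated_seconds(p, output_tokens)
        print(f"{p.name}: ${cost:.5f}, about {secs:.1f} s")
```

For a request with 10,000 input tokens and 2,000 output tokens, DeepInfra comes out cheapest and Nebius fastest under these assumptions; the ranking can shift with a different input-to-output ratio.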

Qwen: Qwen3 14B

40,960 Token Context

Hybrid Reasoning

Advanced Coding

Agentic Workflows
