Qwen: Qwen VL Plus
Qwen
Input: text
Input: image
Output: text
Released: Feb 5, 2025 • Updated: Mar 28, 2025
Qwen's enhanced large vision-language model. It is significantly upgraded for fine-detail recognition and text recognition, and it accepts image inputs at ultra-high resolutions (up to millions of pixels) and extreme aspect ratios. It performs strongly across a broad range of visual tasks.
7,500 Token Context
Process and analyze large documents and conversations.
Advanced Coding
Improved capabilities in front-end development and full-stack updates.
Agentic Workflows
Autonomously navigate multi-step processes with improved reliability.
Vision Capabilities
Process and understand images alongside text inputs.
Available On
Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
---|---|---|---|---|---|---|---|
Alibaba | alibaba | 8K | 2K | $0.21/M | $0.63/M | 49.4 t/s | 1146 ms |
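The throughput and latency columns can be combined into a rough response-time estimate. The sketch below is only an illustration: it assumes the listed latency is the delay before the first token arrives, that tokens then stream at the listed throughput, and that "2K" means 2,000 tokens. The function name `estimate_response_seconds` is hypothetical.

```python
# Figures copied from the provider table above.
LATENCY_MS = 1146          # listed latency; assumed to be time to first token
THROUGHPUT_TPS = 49.4      # listed throughput, tokens per second
MAX_OUTPUT_TOKENS = 2_000  # "2K" read as 2,000 (assumption; it may mean 2,048)


def estimate_response_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate: initial latency plus streaming time."""
    if not 0 <= output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError("output_tokens must be between 0 and MAX_OUTPUT_TOKENS")
    return LATENCY_MS / 1000 + output_tokens / THROUGHPUT_TPS


if __name__ == "__main__":
    # A 500-token reply: about 1.1 s of latency plus about 10.1 s of generation.
    print(f"{estimate_response_seconds(500):.1f} s")
```

Measured figures like these vary with load, so the result is best treated as an order-of-magnitude guide rather than a guarantee.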
Standard Pricing
Input Tokens
$0.00000021
per token
Output Tokens
$0.00000063
per token
Image Processing
$0.0002688
per image
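To see what these rates mean for a single request, the cost can be estimated from token counts and the number of images. This is a minimal sketch using the per-million-token rates from the provider table ($0.21 input, $0.63 output) and the per-image fee above. It assumes the image fee is charged in addition to token charges, which the listing does not state. The function name `estimate_cost_usd` is hypothetical.

```python
# Rates copied from the listing.
INPUT_USD_PER_M = 0.21    # USD per 1,000,000 input tokens
OUTPUT_USD_PER_M = 0.63   # USD per 1,000,000 output tokens
IMAGE_USD = 0.0002688     # USD per image (assumed to be added on top of token charges)


def estimate_cost_usd(input_tokens: int, output_tokens: int, images: int = 0) -> float:
    """Estimate the cost of one request in US dollars."""
    if min(input_tokens, output_tokens, images) < 0:
        raise ValueError("counts must be non-negative")
    token_cost = (input_tokens * INPUT_USD_PER_M
                  + output_tokens * OUTPUT_USD_PER_M) / 1_000_000
    return token_cost + images * IMAGE_USD


if __name__ == "__main__":
    # 6,000 input tokens, 1,000 output tokens, one image.
    print(f"${estimate_cost_usd(6_000, 1_000, images=1):.7f}")
```

At these rates, a request of that size costs roughly a fifth of a cent.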