Qwen: Qwen VL Plus

Qwen
Input: text
Input: image
Output: text
Released: Feb 5, 2025
Updated: Mar 28, 2025

Qwen's enhanced large visual language model. It is significantly upgraded for detailed recognition and text recognition, and it supports image inputs with ultra-high resolutions (up to millions of pixels) and extreme aspect ratios. It performs strongly across a broad range of visual tasks.

7,500 Token Context

Process and analyze large documents and conversations.

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Vision Capabilities

Process and understand images alongside text inputs.

Available On

Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency
Alibaba | alibaba | 8K | 2K | $0.21/M | $0.63/M | 49.4 t/s | 1146 ms

Standard Pricing

Input Tokens
$0.00000021

per token ($0.21 per 1M tokens)

Output Tokens
$0.00000063

per token ($0.63 per 1M tokens)

Image Processing
$0.0002688

per image
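
As a rough illustration of how the listed rates combine, the sketch below estimates the cost of a single request. It assumes billing is linear in token count at $0.21 per million input tokens and $0.63 per million output tokens, plus a flat $0.0002688 per image; the function name estimate_cost and the sample request sizes are hypothetical, and actual provider billing may round or meter differently.

```python
from decimal import Decimal

# Rates taken from the listing above (USD).
INPUT_USD_PER_MILLION = Decimal("0.21")
OUTPUT_USD_PER_MILLION = Decimal("0.63")
IMAGE_USD_EACH = Decimal("0.0002688")

MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, images: int = 0) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes linear per-token pricing and a flat per-image fee.
    """
    if input_tokens < 0 or output_tokens < 0 or images < 0:
        raise ValueError("counts must be non-negative")
    token_cost = (
        Decimal(input_tokens) * INPUT_USD_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_USD_PER_MILLION
    ) / MILLION
    image_cost = Decimal(images) * IMAGE_USD_EACH
    return token_cost + image_cost


if __name__ == "__main__":
    # Example: 6,000 input tokens, 1,500 output tokens, 2 images.
    print(estimate_cost(6000, 1500, images=2))
```

Decimal is used instead of float so that very small per-token amounts do not pick up binary rounding noise when summed over many requests.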