Back

xAI: Grok 2 Vision 1212

Grok
Input: text
Input: image
Output: text
Released: Dec 15, 2024Updated: Mar 28, 2025

Grok 2 Vision 1212 advances image-based AI with stronger visual comprehension, refined instruction-following, and multilingual support. From object recognition to style analysis, it empowers developers to build more intuitive, visually aware applications. Its enhanced steerability and reasoning establish a robust foundation for next-generation image solutions.

To read more about this model, check out xAI's announcement.

32,768 Token Context

Process and analyze large documents and conversations.

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Vision Capabilities

Process and understand images alongside text inputs.

Available On

ProviderModel IDContextMax OutputInput CostOutput CostThroughputLatency
xAIxAi33K-$2.00/M$10.00/M51.4 t/s899 ms
Standard Pricing
Input Tokens
$0.000002

per 1K tokens

Output Tokens
$0.00001

per 1K tokens

Image Processing
$0.0036

per image

Do Work. With AI.