Back

xAI: Grok Vision Beta

Grok
Input: text
Input: image
Output: text
Released: Nov 19, 2024Updated: Mar 28, 2025

Grok Vision Beta is xAI's experimental language model with vision capability.

8,192 Token Context

Process and analyze large documents and conversations.

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Vision Capabilities

Process and understand images alongside text inputs.

Available On

ProviderModel IDContextMax OutputInput CostOutput CostThroughputLatency
xAIxAi8K-$5.00/M$15.00/M61.6 t/s403 ms

Standard Pricing

Input Tokens

per 1M tokens

$5.00

Output Tokens

per 1M tokens

$15.00

Image Processing

per image

$0.009

Do Work. With AI.