Google: Gemma 3 12B

Gemini
Input: text
Input: image
Output: text
Released: Mar 13, 2025
Updated: Mar 28, 2025

Gemma 3 introduces multimodality, accepting vision-language input and producing text output. It handles context windows of up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3 12B is the second-largest model in the Gemma 3 family, after Gemma 3 27B.

131,072 Token Context

Process and analyze large documents and conversations.

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Vision Capabilities

Process and understand images alongside text inputs.

Available On

Provider     Model ID     Context   Max Output   Input Cost   Output Cost   Throughput   Latency
DeepInfra    deepInfra    131K      -            $0.05/M      $0.10/M       30.6 t/s     1468 ms
Cloudflare   cloudflare   80K       -            $0.35/M      $0.56/M       65.8 t/s     972 ms

Standard Pricing
Input Tokens
$0.00000005

per token ($0.05 per 1M tokens)

Output Tokens
$0.0000001

per token ($0.10 per 1M tokens)
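
As a rough illustration of how the listed rates translate into a per-request cost, the sketch below multiplies token counts by the per-million prices shown under Available On. The function name `estimate_cost` and the dictionary layout are hypothetical and not part of any provider API; the rates are copied from the listing and may change.

```python
from decimal import Decimal

# Per-million-token rates in USD, copied from the "Available On" listing.
# These are a snapshot and are assumed here purely for illustration.
RATES_PER_MILLION = {
    "DeepInfra": {"input": Decimal("0.05"), "output": Decimal("0.10")},
    "Cloudflare": {"input": Decimal("0.35"), "output": Decimal("0.56")},
}

MILLION = Decimal(1_000_000)


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    rates = RATES_PER_MILLION[provider]
    total = rates["input"] * input_tokens + rates["output"] * output_tokens
    return total / MILLION


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token reply.
    for name in RATES_PER_MILLION:
        print(name, estimate_cost(name, 100_000, 2_000))
```

Under these assumed rates, a 100,000-token prompt with a 2,000-token reply costs about $0.0052 on DeepInfra and about $0.0361 on Cloudflare. `Decimal` is used instead of floats so that very small per-token amounts are not distorted by binary rounding.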
