Google: Gemma 3 12B
Gemini
Input: text
Input: image
Output: text
Released: Mar 13, 2025 • Updated: Mar 28, 2025
Gemma 3 introduces multimodality, supporting vision-language input and text output. It handles context windows up to 128K tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3 12B is the second-largest model in the Gemma 3 family, after Gemma 3 27B.
131,072 Token Context
Process and analyze large documents and conversations.
Advanced Coding
Improved capabilities in front-end development and full-stack updates.
Agentic Workflows
Autonomously navigate multi-step processes with improved reliability.
Vision Capabilities
Process and understand images alongside text inputs.
Available On
Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
---|---|---|---|---|---|---|---|
DeepInfra | deepInfra | 131K | - | $0.05/M | $0.10/M | 30.6 t/s | 1468 ms |
Cloudflare | cloudflare | 80K | - | $0.35/M | $0.56/M | 65.8 t/s | 972 ms |
Standard Pricing
Input Tokens
$0.00000005
per token
Output Tokens
$0.0000001
per token
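As a rough illustration of how the per-million-token rates in the provider table translate into a request cost, here is a minimal sketch. The `RATES_PER_MILLION` dictionary and the `estimate_cost` function are hypothetical names introduced for this example, not part of any provider's API; the rates are copied from the table above and may change.

```python
# Per-million-token rates in USD, copied from the provider table above.
# These names are illustrative only; they do not come from any provider SDK.
RATES_PER_MILLION = {
    "DeepInfra": {"input": 0.05, "output": 0.10},
    "Cloudflare": {"input": 0.35, "output": 0.56},
}


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    rates = RATES_PER_MILLION[provider]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token reply.
    for name in RATES_PER_MILLION:
        print(f"{name}: ${estimate_cost(name, 100_000, 2_000):.5f}")
```

Dividing by 1,000,000 once at the end keeps the stored rates in the same unit the table uses, so they can be compared against the listing at a glance.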