Google: Gemma 3 12B

Gemini

Input: text

Input: image

Output: text

Released: Mar 13, 2025 • Updated: Mar 28, 2025

Gemma 3 introduces multimodality, supporting vision-language input and text output. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3 12B is the second-largest model in the Gemma 3 family, after Gemma 3 27B.

131,072 Token Context

Process and analyze large documents and conversations.

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Vision Capabilities

Process and understand images alongside text inputs.

Available On

Provider	Model ID	Context	Max Output	Input Cost	Output Cost	Throughput	Latency
DeepInfra	deepInfra	131K	-	$0.05/M	$0.10/M	20.3 t/s	1021 ms
Cloudflare	cloudflare	80K	-	$0.35/M	$0.56/M	58.5 t/s	347 ms
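
The throughput and latency columns can be combined into a rough response-time estimate. The sketch below is illustrative only: it assumes that "Latency" means time to first token and that "Throughput" is a steady output rate, neither of which the listing states. The helper name `estimate_seconds` and the `PROVIDERS` dictionary are hypothetical.

```python
# Figures copied from the provider table above.
PROVIDERS = {
    "DeepInfra": {"throughput_tps": 20.3, "latency_ms": 1021},
    "Cloudflare": {"throughput_tps": 58.5, "latency_ms": 347},
}


def estimate_seconds(provider: str, output_tokens: int) -> float:
    """Rough wall-clock estimate for one response.

    Assumption (not stated in the listing): latency is the delay before
    the first token, and tokens then arrive at the listed throughput.
    """
    p = PROVIDERS[provider]
    return p["latency_ms"] / 1000 + output_tokens / p["throughput_tps"]


if __name__ == "__main__":
    for name in PROVIDERS:
        print(f"{name}: ~{estimate_seconds(name, 500):.1f} s for 500 output tokens")
```

Measured numbers like these vary with load and prompt size, so treat the result as an order-of-magnitude guide rather than a guarantee.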

Standard Pricing

Input Tokens: $0.05 per 1M tokens

Output Tokens: $0.10 per 1M tokens
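
As a worked example of the per-token arithmetic, the following minimal sketch estimates the cost of a single request at the standard rates above. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes that billing is strictly linear in token counts; how image inputs are counted is not covered by the listing and is left out.

```python
from decimal import Decimal

# Standard rates from the listing, in USD per 1M tokens.
INPUT_USD_PER_M = Decimal("0.05")
OUTPUT_USD_PER_M = Decimal("0.10")


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate request cost, assuming simple linear per-token billing."""
    million = Decimal(1_000_000)
    return (input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M) / million


if __name__ == "__main__":
    # A 100,000-token prompt with a 2,000-token reply.
    print(estimate_cost_usd(100_000, 2_000))
```

Under these assumptions, a 100,000-token prompt with a 2,000-token reply comes to $0.0052. `Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift.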
