Google: Gemini 1.5 Flash 8B

Gemini

Input: text

Input: image

Output: text

Released: Oct 3, 2024 • Updated: Apr 4, 2025

Gemini 1.5 Flash 8B is optimized for speed and efficiency, with improved performance on short-prompt tasks such as chat, transcription, and translation. Its low latency makes it well suited to real-time and large-scale workloads. The model aims for low cost while maintaining high-quality results.


Usage of Gemini is subject to Google's Gemini Terms of Use.

1,000,000 Token Context

Process and analyze large documents and conversations.

Advanced Coding

Improved capabilities in front-end development and full-stack updates.

Agentic Workflows

Autonomously navigate multi-step processes with improved reliability.

Vision Capabilities

Process and understand images alongside text inputs.

Available On

Provider	Model ID	Context	Max Output	Input Cost	Output Cost	Throughput	Latency
Google AI Studio	google	1000K	8K	$0.04/M	$0.15/M	198.7 t/s	214 ms
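The throughput and latency figures in the table can be combined into a rough response-time estimate. The sketch below assumes that the listed latency is the time to the first token and that throughput is the steady generation rate afterwards. Both are averages reported on this page, not guarantees, and the function name is hypothetical.

```python
def estimate_response_seconds(output_tokens: int,
                              latency_ms: float = 214.0,
                              throughput_tps: float = 198.7) -> float:
    """Rough wall-clock estimate: time to first token plus generation time.

    Defaults are the Google AI Studio figures from the table above.
    Assumes latency is time-to-first-token (an assumption, not stated
    on the page).
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return latency_ms / 1000.0 + output_tokens / throughput_tps


# A 500-token reply: 0.214 s + 500 / 198.7 s
print(f"{estimate_response_seconds(500):.2f} s")  # prints "2.73 s"
```

Real response times vary with load, prompt length, and network conditions, so treat the result as an order-of-magnitude guide.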

Standard Pricing

Input Tokens

$0.0000000375

per token ($0.0375 per 1M tokens)

Output Tokens

$0.00000015

per token ($0.15 per 1M tokens)

Input Cache Read

$0.00000001

per token ($0.01 per 1M tokens)

Input Cache Write

$0.0000000583

per token ($0.0583 per 1M tokens)
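To see what the standard rates mean for a single request, here is a minimal cost sketch. It treats the listed figures as per-token prices, which is the reading that matches the $0.04/M and $0.15/M columns in the provider table. It also assumes that cached prompt tokens are billed at the cache-read rate instead of the input rate. Cache-write charges are left out because the page does not say how they are applied. The function and constant names are hypothetical.

```python
INPUT_PER_TOKEN = 0.0000000375      # standard input rate from this page
OUTPUT_PER_TOKEN = 0.00000015       # standard output rate from this page
CACHE_READ_PER_TOKEN = 0.00000001   # input cache read rate from this page


def request_cost(prompt_tokens: int, completion_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Estimated USD cost of one request at the standard tier.

    Assumption: cached prompt tokens are charged the cache-read rate
    in place of the normal input rate.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached = prompt_tokens - cached_tokens
    return (uncached * INPUT_PER_TOKEN
            + cached_tokens * CACHE_READ_PER_TOKEN
            + completion_tokens * OUTPUT_PER_TOKEN)


# 100,000-token prompt and 1,000-token reply, without and with caching
print(f"${request_cost(100_000, 1_000):.6f}")          # prints "$0.003900"
print(f"${request_cost(100_000, 1_000, 90_000):.6f}")  # prints "$0.001425"
```

Under these assumptions, serving 90% of the prompt from cache cuts the cost of this request by roughly two thirds.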

Variable Pricing Tiers

Prompt threshold

Threshold: 128,000 tokens

Prompt: $0.000000075 / Completion: $0.0000003 (per token)
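The tier selection can be sketched as a small function. This is an illustration under two assumptions the page does not spell out: the higher rates apply only when the prompt is strictly longer than the threshold, and once triggered they apply to every token in the request rather than just the excess. Rates are treated as per-token prices, and all names are hypothetical.

```python
PROMPT_THRESHOLD = 128_000  # tokens, from the pricing tier above

# (prompt rate, completion rate) in USD per token
STANDARD_RATES = (0.0000000375, 0.00000015)
LONG_PROMPT_RATES = (0.000000075, 0.0000003)


def tiered_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated USD cost of one request, choosing the tier by prompt length.

    Assumption: prompts longer than the threshold are billed entirely
    at the higher tier.
    """
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens > PROMPT_THRESHOLD:
        prompt_rate, completion_rate = LONG_PROMPT_RATES
    else:
        prompt_rate, completion_rate = STANDARD_RATES
    return prompt_tokens * prompt_rate + completion_tokens * completion_rate


print(f"${tiered_cost(128_000, 2_000):.4f}")  # prints "$0.0051"
print(f"${tiered_cost(200_000, 2_000):.4f}")  # prints "$0.0156"
```

Because both rates double past the threshold, splitting a long document into prompts at or under 128,000 tokens can be cheaper when the task allows it.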
