Google: Gemini 2.5 Flash Preview 04-17 (thinking)
Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater accuracy and nuanced context handling.
Note: This model is available in two variants: thinking and non-thinking. Output pricing differs significantly depending on whether the thinking capability is active. If you select the standard variant (without the ":thinking" suffix), the model will not generate thinking tokens.
To use the thinking capability and receive thinking tokens, choose the ":thinking" variant, which is billed at the higher thinking-output rate.
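The variant is selected by the model slug in the request body. Here is a minimal sketch using Python's `requests` library; the exact slugs (`google/gemini-2.5-flash-preview` and `google/gemini-2.5-flash-preview:thinking`) are assumptions, so check the model page for the current identifiers.

```python
import os
import requests

# Switch between the thinking and non-thinking variants by changing the
# model slug. Slugs below are assumed; verify against the model page.
MODEL_THINKING = "google/gemini-2.5-flash-preview:thinking"
MODEL_STANDARD = "google/gemini-2.5-flash-preview"

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": MODEL_THINKING,  # use MODEL_STANDARD to skip thinking tokens
        "messages": [{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```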
Additionally, Gemini 2.5 Flash is configurable through the "max tokens for reasoning" parameter, as described in the documentation (https://openrouter.ai/docs/use-cases/reasoning-tokens#max-tokens-for-reasoning).
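A sketch of that configuration, assuming the `reasoning.max_tokens` field described in the linked documentation; the 2048-token budget is an arbitrary illustrative value. The payload plugs into the same request shown above.

```python
# Request body with an explicit reasoning-token budget. The "reasoning"
# object follows the OpenRouter docs linked above; 2048 is illustrative.
payload = {
    "model": "google/gemini-2.5-flash-preview:thinking",
    "messages": [{"role": "user", "content": "Plan a 3-step migration to TypeScript."}],
    "reasoning": {"max_tokens": 2048},
}
```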
1,048,576 Token Context
Process and analyze large documents and conversations.
Hybrid Reasoning
Choose between rapid responses and extended, step-by-step processing for complex tasks.
Advanced Coding
Improved capabilities in front-end development and full-stack updates.
Agentic Workflows
Autonomously navigate multi-step processes with improved reliability.
Vision Capabilities
Process and understand images alongside text inputs.
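As a sketch of the multimodal input format, assuming the OpenAI-compatible content-parts schema that OpenRouter accepts; the image URL is a placeholder.

```python
# A user message combining text and an image. The content-parts layout is
# the OpenAI-compatible schema; the image URL is a placeholder.
messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe the chart in this image."},
        {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
    ],
}]
```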
Available On
| Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
|---|---|---|---|---|---|---|---|
| Google Vertex | vertexThinking | 1049K | 66K | $0.15/M | $3.50/M | 120.8 t/s | 1721 ms |
| Google AI Studio | aiStudioThinking | 1049K | 66K | $0.15/M | $3.50/M | 180.5 t/s | 1404 ms |
Standard Pricing
| Item | Unit | Price |
|---|---|---|
| Input Tokens | per 1M tokens | $0.15 |
| Output Tokens | per 1M tokens | $3.50 |
| Image Processing | per image | $0.0006192 |
| Input Cache Read | per 1M tokens | $0.04 |
| Input Cache Write | per 1M tokens | $0.23 |
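To make the numbers concrete, a back-of-the-envelope cost calculation; the token counts are made up for illustration, and on the thinking variant the thinking tokens are billed as output.

```python
# Rough cost for one hypothetical request at the pricing above.
# Token counts are illustrative, not measurements.
INPUT_PER_M = 0.15    # $ per 1M input tokens
OUTPUT_PER_M = 3.50   # $ per 1M output tokens (incl. thinking tokens)

input_tokens = 10_000
output_tokens = 2_000  # completion + thinking tokens

cost = input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M
print(f"${cost:.4f}")  # $0.0085
```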