Google: Gemini 2.5 Flash Preview 04-17
Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater accuracy and nuanced context handling.
Note: This model is available in two variants: thinking and non-thinking. The output pricing varies significantly depending on whether the thinking capability is active. If you select the standard variant (without the ":thinking" suffix), the model will explicitly avoid generating thinking tokens.
To utilize the thinking capability and receive thinking tokens, you must choose the ":thinking" variant, which will then incur the higher thinking-output pricing.
Additionally, Gemini 2.5 Flash is configurable through the "max tokens for reasoning" parameter, as described in the documentation (https://openrouter.ai/docs/use-cases/reasoning-tokens#max-tokens-for-reasoning).
1,048,576 Token Context
Process and analyze large documents and conversations.
Hybrid Reasoning
Choose between rapid responses and extended, step-by-step processing for complex tasks.
Advanced Coding
Improved capabilities in front-end development and full-stack updates.
Agentic Workflows
Autonomously navigate multi-step processes with improved reliability.
Vision Capabilities
Process and understand images alongside text inputs.
Available On
Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
---|---|---|---|---|---|---|---|
vertexNonThinking | 1049K | 66K | $0.15/M | $0.60/M | 85.7 t/s | 691 ms | |
Google AI Studio | aiStudioNonThinking | 1049K | 66K | $0.15/M | $0.60/M | 142.8 t/s | 2003 ms |
per 1K tokens
per 1K tokens
per image
per 1K tokens
per 1K tokens