Do Services-as-Software

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater accuracy and nuanced context handling.

Note: This model is available in two variants: thinking and non-thinking. The output pricing varies significantly depending on whether the thinking capability is active. If you select the standard variant (without the ":thinking" suffix), the model will explicitly avoid generating thinking tokens.

To utilize the thinking capability and receive thinking tokens, you must choose the ":thinking" variant, which will then incur the higher thinking-output pricing.

Additionally, Gemini 2.5 Flash is configurable through the "max tokens for reasoning" parameter, as described in the documentation (https://openrouter.ai/docs/use-cases/reasoning-tokens#max-tokens-for-reasoning).

Provider	Model ID	Context	Max Output	Input Cost	Output Cost	Throughput	Latency
Google	vertexNonThinking	1049K	66K	$0.15/M	$0.60/M	85.7 t/s	661 ms
Google AI Studio	aiStudioNonThinking	1049K	66K	$0.15/M	$0.60/M	139.3 t/s	538 ms

Provider

Model ID

Context

Max Output

Input Cost

Output Cost

Throughput

Latency

Google

vertexNonThinking

1049K

66K

$0.15/M

$0.60/M

85.7 t/s

661 ms

Google AI Studio

aiStudioNonThinking

1049K

66K

$0.15/M

$0.60/M

139.3 t/s

538 ms

Google: Gemini 2.5 Flash Preview 04-17

1,048,576 Token Context

Hybrid Reasoning

Advanced Coding

Agentic Workflows

Vision Capabilities

Available On

Do Work. With AI.