Google: Gemini 1.5 Flash 8B
Gemini 1.5 Flash 8B is optimized for speed and efficiency, with improved performance on short-prompt tasks such as chat, transcription, and translation. Its reduced latency makes it well suited to real-time and large-scale workloads. The model prioritizes low cost while maintaining high-quality results.
Usage of Gemini is subject to Google's Gemini Terms of Use.
1,000,000 Token Context
Process and analyze large documents and conversations.
Advanced Coding
Improved capabilities in front-end development and full-stack updates.
Agentic Workflows
Autonomously navigate multi-step processes with improved reliability.
Vision Capabilities
Process and understand images alongside text inputs.
Available On
Provider | Model ID | Context | Max Output | Input Cost | Output Cost | Throughput | Latency |
---|---|---|---|---|---|---|---|
Google AI Studio | | 1000K | 8K | $0.04/M | $0.15/M | 198.7 t/s | 214 ms |
Prompt threshold: 128,000 tokens
Above the threshold: Prompt: $0.000000075 / Completion: $0.0000003 per token ($0.075/M and $0.30/M)
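To make the two price tiers concrete, here is a minimal Python sketch of a cost estimate. It rests on several assumptions that the listing does not state outright: the threshold figures are read as per-token prices ($0.075/M prompt, $0.30/M completion), the table's $0.04/M and $0.15/M are taken as the rates at or below the threshold (the displayed values may be rounded), and the higher rates are assumed to apply to the whole request once the prompt exceeds 128,000 tokens. The function name `estimate_cost` and the constants are hypothetical and exist only for illustration.

```python
from decimal import Decimal

THRESHOLD = 128_000            # prompt-token threshold from the listing
MILLION = Decimal(1_000_000)

# (prompt, completion) rates in USD per million tokens.
# STANDARD comes from the provider table (possibly rounded for display).
# LONG_PROMPT assumes the threshold figures are per-token prices.
STANDARD = (Decimal("0.04"), Decimal("0.15"))
LONG_PROMPT = (Decimal("0.075"), Decimal("0.30"))


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the USD cost of one request under the assumed two-tier pricing."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption: the tier is chosen by prompt length alone and applies
    # to both prompt and completion tokens of the request.
    if prompt_tokens > THRESHOLD:
        prompt_rate, completion_rate = LONG_PROMPT
    else:
        prompt_rate, completion_rate = STANDARD
    total = prompt_tokens * prompt_rate + completion_tokens * completion_rate
    return total / MILLION


if __name__ == "__main__":
    print(estimate_cost(100_000, 1_000))  # prompt below the threshold
    print(estimate_cost(200_000, 1_000))  # prompt above the threshold
```

`Decimal` is used instead of `float` so that very small per-request amounts are not distorted by binary rounding. Check the provider's current price sheet before relying on any of these numbers.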