Google

Gemini Pro Vision 1.0

google/gemini-pro-vision

Google's flagship multimodal model. It accepts images and video alongside text or chat prompts and returns a text or code response.

See the benchmarks and prompting guidelines from DeepMind.

Usage of Gemini is subject to Google's Gemini Terms of Use.

#multimodal

Capability: Vision Support

Context Window: 45,875 tokens

Max Output Tokens: 2,048

Provider               Input Token Price       Output Token Price
Google Cloud Vertex    $0.13/Million Tokens    $0.375/Million Tokens
Google AI Studio       $0.13/Million Tokens    $0.375/Million Tokens
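
As a rough illustration of how the per-million-token prices above translate into a per-request cost, here is a minimal Python sketch. The function name `estimate_cost` and the constant names are hypothetical and not part of any Google SDK. The sketch also assumes that the listed context window applies to input tokens only, which the listing does not state.

```python
# Prices copied from the listing above, in USD per million tokens.
INPUT_PRICE_PER_MILLION = 0.13
OUTPUT_PRICE_PER_MILLION = 0.375

# Limits copied from the listing above.
CONTEXT_WINDOW = 45_875
MAX_OUTPUT_TOKENS = 2_048


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption: the context window limits input tokens only.
    if input_tokens > CONTEXT_WINDOW:
        raise ValueError("input exceeds the listed context window")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the listed max output tokens")
    input_cost = input_tokens * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens * OUTPUT_PRICE_PER_MILLION
    return (input_cost + output_cost) / 1_000_000


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 1,000-token response.
    print(f"${estimate_cost(10_000, 1_000):.6f}")
```

Because both providers list the same prices, the estimate is the same for either one. Actual billing may count image and video input differently from text tokens, so treat the result as an approximation.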