Google Unveils Gemini, a Family of Multimodal AI Models for Text, Images, Video and More

Gemini is a family of multimodal AI models trained on text, images, video, and other data, so a single model can work across several modalities. The family comprises the flagship Gemini Ultra, Gemini Pro, and Gemini Nano.
Gemini models are meant to handle a wide range of tasks, such as summarizing text, generating images, translating languages, and suggesting replies in messaging apps. What a given model can actually do depends on the product it is deployed in.
Gemini Pro is currently available in Bard, Google's conversational AI service, as well as the Vertex AI platform and AI Studio for developers. The Pixel 8 Pro features Gemini Nano.
Gemini appears promising but is still in its early stages, with limitations in both accuracy and capabilities. Google has made big claims but has underdelivered so far.
It remains unclear how Gemini stacks up against competitors such as OpenAI's GPT-4, though Google claims benchmark wins. Pricing for some Gemini APIs starts at $0.0025 per 1,000 characters.
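
To put the per-character rate in perspective, here is a minimal sketch of the arithmetic. It assumes the quoted starting price of $0.0025 per 1,000 characters applies uniformly to every character processed; actual billing may distinguish input from output or vary by model, and the function name `estimate_cost_usd` is purely illustrative, not part of any Google API.

```python
# Starting rate quoted in the article; real rates may vary by model and API.
PRICE_PER_1000_CHARS_USD = 0.0025


def estimate_cost_usd(num_chars: int,
                      price_per_1000: float = PRICE_PER_1000_CHARS_USD) -> float:
    """Estimate cost in USD for a character count at a flat per-1,000 rate.

    Assumption: every character is billed at the same rate.
    """
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return num_chars / 1000 * price_per_1000


if __name__ == "__main__":
    # One million characters at the quoted starting rate.
    print(f"${estimate_cost_usd(1_000_000):.2f}")
```

Under this flat-rate assumption, a million characters would cost roughly $2.50, since the estimate scales linearly with the character count.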