In the emerging world of artificial intelligence we’ve been living in for the past year or so, Google DeepMind has always been a big part of the conversation. Though they were caught a bit flat-footed this time last year, there was never any doubt that Google had some big guns in the AI race that would eventually surface. And today, with the introduction of Gemini to the world, we’re witnessing a significant leap for Google in the AI race. I think it’s fair to say that things are really starting to get interesting now.
A new definition of AI models
Gemini’s standout trait is that it is a multimodal AI model, a significant departure from traditional single-mode AI systems. This new model is capable of processing and integrating diverse types of information — from text and code to audio, images, and videos. That ability makes Gemini an extraordinary tool, bridging gaps that previous AI models couldn’t. Understanding the need for AI solutions that cater to different requirements, Google has developed three versions of Gemini:
- Gemini Ultra: The most extensive and capable variant, designed for complex tasks.
- Gemini Pro: A versatile model suitable for a wide array of applications.
- Gemini Nano: Optimized for efficiency, this version is ideal for on-device integration.
Gemini Ultra’s performance exceeds current state-of-the-art results on 30 of the 32 widely-used academic benchmarks used in large language model (LLM) research and development.
With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities. (via The Keyword)
What really separates Gemini from its predecessors is its multimodal capability. Unlike traditional AI models that combine separately-trained components for different modalities, Gemini is pre-trained across various data types. This integrated approach allows it to process and comprehend multiple forms of input seamlessly, greatly increasing its reasoning and understanding abilities.
Gemini will be everywhere
It should come as no surprise that Gemini is being integrated into various Google products and platforms, starting today. First up, Bard begins leveraging Gemini Pro for advanced reasoning and understanding capabilities, in what Google calls the biggest upgrade to Bard since it first launched.
Gemini will also be coming to Pixel 8 Pro (and other Pixel phones later). Google says the Pixel 8 Pro is the first phone built to run Gemini Nano on-device to power new features like the Recorder app’s new Summarize feature and Gboard’s new Smart Replies.
In the next few months, we expect to see Gemini appear in Search, Ads, Chrome, and Duet AI (mostly Workspace apps), and Google says that early experiments with Gemini in Search are already delivering a 40% reduction in latency for the Search Generative Experience.
The Gemini Era
With this launch, Google is entering a new era in AI and in its products as a whole. How Gemini will impact everything from app capabilities to search to content creation is anyone’s guess, but there’s no question about the power Gemini seems to possess. Google has largely taken the slower, more-methodical approach to these more-powerful AI models, and though the company insists on putting safety and responsibility first in this area, all of this is still a bit scary when you realize the scale and scope of what is happening. I’m sure we’ll be talking about Gemini quite a bit in the coming weeks, months, and years, and I have a feeling today’s launch will be a formative one for the future of Google.