Google Unveils Gemini 2.5 Pro Preview With Advanced Coding And Video Recognition Capabilities

Google DeepMind, the AI research division of Google, has announced the early access release of Gemini 2.5 Pro Preview (I/O edition). This latest version of the Gemini model introduces notable enhancements in coding capabilities, particularly in the development of interactive web applications.
These updates build on the positive reception of the original Gemini 2.5 Pro’s performance in areas such as coding and multimodal reasoning. In addition to improvements in front-end development, the model now supports more advanced tasks including code transformation, code editing, and the creation of complex, agent-based workflows.
The updated Gemini 2.5 Pro has achieved a leading position on the WebDev Arena Leaderboard, surpassing the previous version by 147 Elo points. This ranking reflects user preferences in evaluating models’ abilities to generate visually appealing and functional web applications.
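To give a rough sense of what a 147-point gap means, the short Python sketch below converts an Elo difference into an expected head-to-head preference rate. It assumes the classic logistic Elo formula with a 400-point scale. The article does not say exactly how WebDev Arena computes its ratings, so the function name and the result are illustrative rather than an official figure.

```python
def elo_expected_score(rating_gap: float, scale: float = 400.0) -> float:
    """Expected win rate of the higher-rated side under the classic Elo model.

    Assumption: standard logistic Elo curve with a 400-point scale; the
    leaderboard's actual rating method may differ.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


if __name__ == "__main__":
    gap = 147  # Elo gap over the previous version, as reported in the article
    print(f"Expected preference rate: {elo_expected_score(gap):.2f}")
```

Under that assumption, a 147-point lead corresponds to the updated model being preferred in roughly 70% of direct comparisons with its predecessor.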
The model also maintains strong performance in areas such as native multimodal input processing and long-context comprehension. It has demonstrated state-of-the-art results in video understanding, achieving a benchmark score of 84.8% on VideoMME.
Developers can access the updated Gemini 2.5 Pro through the Gemini API on platforms such as Google AI Studio and Vertex AI. It is also integrated into the Gemini app, where it supports features like Canvas and allows users to build interactive web applications with minimal input.
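For developers who want to try the model, the following minimal Python sketch shows what a single text request to the Gemini API's REST `generateContent` endpoint might look like, using only the standard library. The model identifier, the `GEMINI_API_KEY` environment variable name, and the helper function names are assumptions made for illustration and do not come from the announcement. Check Google AI Studio for the current model ID before relying on it.

```python
import json
import os
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta"
# Assumed model ID for the I/O edition preview; verify it in Google AI Studio.
MODEL_ID = "gemini-2.5-pro-preview-05-06"


def build_request(prompt: str, api_key: str, model: str = MODEL_ID) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for one text prompt."""
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{model}:generateContent",
        data=body,
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def generate(prompt: str, api_key: str) -> str:
    """Send the request and return the text of the first candidate."""
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        payload = json.load(resp)
    return payload["candidates"][0]["content"]["parts"][0]["text"]


if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")  # hypothetical variable name
    if key:
        print(generate("Build a single-page HTML snake game.", key))
    else:
        print("Set GEMINI_API_KEY to send a request.")
```

Keeping request construction separate from the network call makes the payload easy to inspect without an API key. In practice, Google's official SDKs are likely the more convenient route.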
We’re releasing an updated Gemini 2.5 Pro (I/O edition) to make it even better at coding. 

You can build richer web apps, games, simulations and more – all with one prompt.

In @GeminiApp, here's how it transformed images of nature into code to represent unique patterns  pic.twitter.com/IHbKw4EInx
— Google DeepMind (@GoogleDeepMind) May 6, 2025
Gemini 2.5 Pro: What Is It? 
Gemini 2.5 Pro is a highly capable artificial intelligence model created by Google DeepMind, intended for use in complex tasks that demand advanced reasoning and programming functionality. It is designed to work with multiple input formats such as text, code, images, audio, and video, and it can manage up to one million tokens within a single context window. This enables the model to handle large-scale data processing and tackle detailed analytical problems.
The model has shown competitive results in a range of performance evaluations, with particularly strong outcomes in disciplines such as mathematics, software development, and multimodal comprehension.
The post Google Unveils Gemini 2.5 Pro Preview With Advanced Coding And Video Recognition Capabilities appeared first on Metaverse Post.