
Gemini 2.5 Pro has just beaten this 29-year-old popular game, and even Sundar Pichai is impressed

Google launched Gemini 2.5 Pro a month ago and calls it its “smartest AI model” to date. During the launch, the technology giant stressed that this model is much better than its competition, including OpenAI’s o3 models, DeepSeek R1, Claude and more. While the benchmarks (provided by Google) offer supporting evidence, a recent victory over a 29-year-old video game, Pokémon Blue, has added another feather to its cap. Since these are only Google’s claims, we wanted to see how good the model really is, and here is what we found. But before reading about our experience, the question is: why is beating a video game an important milestone for an AI model? Read on to find out.

Gemini 2.5 Pro finishes Pokémon Blue

For context, Pokémon Blue (released in 1996) is known for its complex gameplay mechanics, strategic battles and open-world exploration, elements that pose significant challenges for AI systems. To perform well in the game, an AI must demonstrate capabilities such as long-term planning, goal management and visual navigation, skills central to the pursuit of artificial general intelligence. Now that Gemini 2.5 Pro has conquered the complexities of this game, the AI model has lived up to its billing as the “smartest model”.

Reacting to this victory, Google CEO Sundar Pichai took to X (formerly Twitter), saying: “What a finish! Gemini 2.5 Pro has just finished Pokémon Blue!”

To clarify, the Gemini Plays Pokémon livestream was not launched by Google itself, but by “a 30-year-old software engineer not affiliated with Google” who goes by the name Joel Z. Nevertheless, Google executives showed enthusiastic support for the project. Logan Kilpatrick, Product Lead for Google AI Studio, shared an update last month, noting that Gemini was “making great progress towards finishing Pokémon” and had “earned its 5th badge (the next best model only has 3 so far, but with a different agent harness)”.

During the launch, Google stressed that one of the standout improvements in this model lies in its enhanced coding capabilities, described as “a big leap over 2.0” with “more improvements to come”. According to Google, “2.5 Pro excels at creating visually compelling web apps and agentic code applications, along with code transformation and editing.”

On industry-recognized benchmarks for agentic coding, Gemini 2.5 Pro has delivered solid performance, scoring 63.8% on SWE-Bench Verified using a custom agent setup, highlighting its competence in complex software engineering tasks. For comparison, Anthropic’s Claude model has also been in the race to beat another version of the game, Pokémon Red, but it has not succeeded so far.

In February, Anthropic showcased the progress its Claude AI models had made in Pokémon Red, noting that Claude’s “extended thinking and agent training” gave it a “major boost” when tackling “more unexpected” tasks, such as playing a classic video game. Although Claude has made notable progress, it has not yet finished Pokémon Red.

As impressive as this may be, Gemini’s performance does not yet signal true general intelligence. The developer still lends a helping hand from time to time, whether to fix bugs or to restrict certain actions, such as the overuse of escape items. He maintains that no direct step-by-step walkthrough is provided, apart from a one-off case involving a known issue.

It remains an open question whether Gemini could manage the same feat entirely on its own. Nevertheless, its ability to navigate a game as complex as Pokémon Blue, even with some support, underscores the remarkable potential of large language models when deployed in a carefully structured environment.

Posted by:

Unnati Gusain

Posted on:

May 4, 2025
