Gemini just got physical and you should prepare for a robot revolution

Google Gemini Robotics
(Image credit: Google)

  • Gemini Robotics is a new model
  • It focuses on the physical world and will be used by robots
  • It's visual, interactive, and general

Google Gemini is good at many things that happen inside a screen, including generating text and images. The latest model, Gemini Robotics, is a vision-language-action model that moves generative AI into the physical world and could substantially speed up the race toward humanoid robots.

Gemini Robotics, which Google DeepMind unveiled on Wednesday, improves Gemini's abilities in three key areas:

  • Dexterity
  • Interactivity
  • Generalization

Each of these three aspects significantly impacts the success of robotics in the workplace and unknown environments.

Generalization allows a robot to take Gemini's vast knowledge about the world, apply it to new situations, and accomplish tasks on which it has never been trained. In one video, researchers show a tabletop basketball game to a pair of robot arms controlled by Gemini Robotics and ask it to "slam dunk the basketball."

Even though the robot hadn't seen the game before, it picked up the small orange ball and stuffed it through the plastic net.

Google Gemini Robotics also makes robots more interactive and able to respond not only to changing verbal assignments but also to unpredictable conditions.

In another video, researchers asked the robot to put grapes in a bowl with bananas, then moved the bowl around. The robot arm adjusted and still managed to put the grapes in the bowl.

Google also demonstrated the robot's dextrous capabilities, which let it tackle things like playing tic-tac-toe on a wooden board, erasing a whiteboard, and folding paper into origami.

Instead of hours of training on each task, the robots respond to near-constant natural language instructions and perform the tasks without guidance. It's impressive to watch.

Naturally, adding AI to robotics is not new.

Last year, OpenAI partnered up with Figure AI to develop a humanoid robot that can work out tasks based on verbal instructions. As with Gemini Robotics, Figure 01's visual language model works with the OpenAI speech model to engage in back-and-forth conversations about tasks and changing priorities.

In the demo, the humanoid robot stands before dishes and a drainer. It's asked what it sees, and it lists the items. Then the interlocutor changes tasks and asks for something to eat. Without missing a beat, the robot picks up an apple and hands it to him.

While most of what Google showed in the videos was disembodied robot arms and hands working through a wide range of physical tasks, there are grander plans. Google is partnering with Apptronik to add the new model to its Apollo humanoid robot.

Google will connect the dots with additional programming: a new advanced visual language model called Gemini Robotics-ER (embodied reasoning).

Gemini Robotics-ER will enhance robots' spatial reasoning and should help robot developers connect the models to existing controllers.

Again, this should improve on-the-fly reasoning and make it possible for the robots to quickly figure out how to grasp and use unfamiliar objects. Google calls Gemini Robotics-ER an end-to-end solution and claims it "can perform all the steps necessary to control a robot right out of the box, including perception, state estimation, spatial understanding, planning and code generation."

Google is providing the Gemini Robotics-ER model to several business- and research-focused robotics firms, including Boston Dynamics (makers of Atlas), Agile Robots, and Agility Robotics.

All in all, it's a potential boon for humanoid robotics developers. However, since most of these robots are designed for factories or are still in the laboratory, it may be some time before you have a Gemini-enhanced robot in your home.

Lance Ulanoff
Editor At Large

A 38-year industry veteran and award-winning journalist, Lance has covered technology since PCs were the size of suitcases and “on line” meant “waiting.” He’s a former Lifewire Editor-in-Chief, Mashable Editor-in-Chief, and, before that, Editor in Chief of PCMag.com and Senior Vice President of Content for Ziff Davis, Inc. He also wrote a popular, weekly tech column for Medium called The Upgrade.

Lance Ulanoff makes frequent appearances on national, international, and local news programs including Live with Kelly and Mark, the Today Show, Good Morning America, CNBC, CNN, and the BBC. 