Google DeepMind has officially launched Gemini Robotics On-Device, its next-generation robotics AI model, marking a milestone in making robotic AI more efficient and independent. The model runs locally on a robot without an internet connection and shows strong versatility and task adaptability, which could bring significant advances to industrial, warehouse, and home service robots.

Breaking Cloud Limitations: Localized Operation of Robotic AI

Gemini Robotics On-Device is a vision-language-action (VLA) model based on Google's Gemini 2.0. Its key feature is that it runs entirely on the robot's local hardware, without relying on cloud computing resources. This addresses the latency and reliability problems that traditional cloud-based robotic systems face under unstable network conditions. Carolina Parada, Senior Director at DeepMind, said: "This model is compact and efficient, capable of running directly on robot hardware, ensuring stable performance with low latency and offline operation."

By operating locally, Gemini Robotics On-Device makes robots considerably more practical in network-restricted settings such as factories, warehouses, and remote areas. According to DeepMind's tests, its performance is close to that of the cloud-based Gemini Robotics model, and it outperforms other on-device AI models on multiple benchmarks.
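
The latency argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name and the millisecond figures are assumptions chosen for the example, not measurements published by DeepMind. It assumes each control step waits for one model inference, so any network round trip adds directly to the step time.

```python
def max_control_rate_hz(inference_ms: float, network_rtt_ms: float = 0.0) -> float:
    """Upper bound on the closed-loop control rate when every step
    waits for one inference plus an optional network round trip."""
    step_ms = inference_ms + network_rtt_ms
    if step_ms <= 0:
        raise ValueError("total step time must be positive")
    return 1000.0 / step_ms


# Hypothetical numbers for illustration, not DeepMind measurements.
on_device = max_control_rate_hz(inference_ms=50.0)
cloud = max_control_rate_hz(inference_ms=50.0, network_rtt_ms=150.0)
print(f"on-device: {on_device:.1f} Hz, cloud: {cloud:.1f} Hz")
```

Under these assumed figures, removing the round trip raises the achievable control rate from 5 Hz to 20 Hz. A cloud-based loop also stalls entirely when the connection drops, which a fixed round-trip figure does not capture.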

Universality and Flexibility: From 50 Demonstrations to New Tasks

Beyond performance, Gemini Robotics On-Device stands out for its task adaptability. DeepMind claims that the model can adapt to a new task with just 50 to 100 demonstrations, such as opening a zipper, folding clothes, or performing industrial assembly.

The model was initially trained for the ALOHA robot, but it has been successfully adapted to the dual-arm Franka FR3 robot and the Apollo humanoid robot from Apptronik, demonstrating that it generalizes across hardware platforms. The model follows natural language instructions and can be fine-tuned by developers, allowing it to handle complex dual-arm tasks and unfamiliar objects in dynamic environments. Parada emphasized: "Generative AI allows robots to generalize from limited data, significantly accelerating deployment in complex scenarios."

Open Developer Ecosystem: SDK Empowers Innovation

To accelerate industry adoption of Gemini Robotics On-Device, Google DeepMind has also released a software development kit (SDK), now open for applications through the "Trusted Tester" program on GitHub. Developers can use the SDK to test and fine-tune the model in the MuJoCo physics simulator or in real-world environments. This is the first time DeepMind has made fine-tuning of its VLA models available to developers, paving the way for customized robotic AI applications.

The SDK lets developers quickly train robots to perform specific tasks from just a few demonstrations, such as putting a Rubik's cube into a bag or handling delicate industrial operations. DeepMind stated that the model performs well on scenes and objects it has not seen before, for example completing assembly tasks on an industrial conveyor belt, which points to strong generalization.

Safety and Industry Prospects: The Next Step for Robotic AI

On safety, DeepMind emphasized that Gemini Robotics On-Device includes comprehensive safety measures and that the company is collaborating with experts and policymakers to minimize potential risks. At the same time, the release is seen as part of the fierce competition in general-purpose robotic AI between Google and rivals such as Nvidia, with its GR00T models, and OpenAI.

From warehouse robots to home service robots, the local operation and rapid learning of Gemini Robotics On-Device lay the foundation for wide application across many scenarios. AIbase believes that this technology will not only reduce the cost of robot deployment but may also bring AI-driven automation into more everyday settings.

Model Access: https://deepmind.google/discover/blog/gemini-robotics-on-device-brings-ai-to-local-robotic-devices/