Physical AI, according to Nvidia CEO Jensen Huang, will be the next big thing, and the AI-powered robots it enables will come in a wide variety of forms.
Nvidia has been promoting the idea that robots will be ubiquitous in the future: smart machines increasingly taking over repetitive tasks in the factory, the kitchen, the doctor’s office, and on the roads. It goes without saying that Jensen’s company intends to supply the hardware and AI software required to train and run those AIs.
Physical AI: What Is It?
According to Jensen, the current stage of AI is the pioneering one: building the foundation models and the tools needed to tune them for specific purposes. The next stage, enterprise AI, is already underway, with chatbots and AI models making employees, partners, and customers more productive. By the end of this phase, everyone will have a personal AI assistant, or perhaps a collection of AIs, each helping with particular tasks.
In these first two stages, AI tells us or shows us things by generating the most likely next word, or token, in a sequence. Physical AI, which Jensen calls the third and final stage, is when the intelligence takes on a physical form and engages with the outside world. Doing that effectively requires integrating sensor input with the manipulation of objects in three-dimensional space.
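To make the token idea concrete, here is a toy sketch in plain Python (not any Nvidia model or API) that always picks the statistically most likely next word from a tiny corpus:

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the web-scale text a real foundation model is trained on.
corpus = "the robot lifts the box then the robot places the box on the shelf".split()

# Count, for each word, which words tend to follow it (a crude bigram "model").
followers = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current][nxt] += 1

def most_likely_next(word):
    """Return the statistically most likely next token, a stand-in for an LLM's prediction."""
    candidates = followers.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

# Generate a short continuation, one most-likely token at a time.
tokens = ["the"]
for _ in range(5):
    nxt = most_likely_next(tokens[-1])
    if nxt is None:
        break
    tokens.append(nxt)

print(" ".join(tokens))  # greedy decoding on this tiny corpus loops: "the robot lifts the robot lifts"
```

A real model scores every token in a huge vocabulary with a neural network rather than counting bigrams, but the output step is the same: pick (or sample) the next most likely token, append it, and repeat.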
Jensen Huang, founder and CEO of NVIDIA, said that building foundation models for general humanoid robots is one of the most exciting problems in AI today, and that the convergence of enabling technologies now lets leading roboticists worldwide take major steps toward artificial general robotics.
Okay, so you have to design the robot and its brain, which is clearly a job for artificial intelligence. But how do you put the robot through a near-infinite number of scenarios, many of which cannot be anticipated, or safely recreated, in the physical world? And how do you validate its behavior? You guessed it: with more artificial intelligence, used to model the world the robot will inhabit along with the equipment and creatures it will interact with.
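The scenario-randomization idea can be sketched in a few lines. Everything below is a hypothetical stand-in for illustration, not Isaac Sim or Omniverse calls: a simulated world is sampled at random, a trivial "policy" is rolled out in it, and the success rate is tallied over thousands of runs.

```python
import random

def random_scenario():
    """Sample one of effectively unlimited world configurations (hypothetical parameters)."""
    return {
        "floor_friction": random.uniform(0.3, 1.0),
        "obstacle_count": random.randint(0, 12),
        "lighting_lux": random.uniform(50, 2000),
        "payload_kg": random.uniform(0.0, 5.0),
    }

def simulate(policy, scenario):
    """Placeholder for rolling the robot's policy forward in a simulated world.
    Returns True if the episode ends without a collision or a dropped payload."""
    difficulty = scenario["obstacle_count"] / 12 + scenario["payload_kg"] / 5
    return policy(scenario) > difficulty

def cautious_policy(scenario):
    """A trivially simple stand-in for the robot's trained model."""
    return 1.5 - scenario["floor_friction"] * 0.2

trials = [simulate(cautious_policy, random_scenario()) for _ in range(10_000)]
print(f"success rate across randomized scenarios: {sum(trials) / len(trials):.1%}")
```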
“Three computers are required: one to build the AI, one to act as an AI simulator… and another to run the AI,” Jensen stated.
The Three Computer Problem
Naturally, Jensen is referring to the range of hardware and software offered by Nvidia. The process begins with Nvidia H100 and B100 servers, which are used to build the AI. Nvidia Omniverse, running on RTX GPUs in workstations and servers, simulates and tests the AI and its surroundings. Finally, Nvidia Jetson (soon to gain Blackwell GPUs) provides on-board, real-time sensing and control.
Nvidia also unveiled GR00T, or Generalist Robot 00 Technology, a foundation model that uses human action data to understand, simulate, and generate movement. For GR00T-powered robots to explore, adapt, and engage with the real world, they will need to learn coordination, dexterity, and other skills. Huang demonstrated several such robots live during his GTC keynote.
NVIDIA Isaac Sim is a reference robotics simulation application built on the NVIDIA Omniverse platform. Two new AI NIM microservices will let roboticists build simulation workflows for generative physical AI. The MimicGen NIM generates synthetic motion data from tele-operation recordings captured with spatial computing devices such as Apple Vision Pro. The Robocasa NIM generates simulation-ready robot tasks and environments in OpenUSD, the universal framework underlying Omniverse for building and collaborating in 3D worlds.
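OpenUSD itself is scriptable from Python. The snippet below is a minimal, generic sketch of authoring a scene with the open-source pxr bindings; it is not the Robocasa NIM interface, and the "counter" and "mug" prims are purely illustrative.

```python
# Requires the open-source OpenUSD Python bindings (the "pxr" package).
from pxr import Usd, UsdGeom, Gf

# Create a new USD stage: the shared 3D scene description that Omniverse tools collaborate on.
stage = Usd.Stage.CreateNew("kitchen_task.usda")

# A root transform for the world, plus simple placeholder geometry for a counter and a mug.
world = UsdGeom.Xform.Define(stage, "/World")
stage.SetDefaultPrim(world.GetPrim())
counter = UsdGeom.Cube.Define(stage, "/World/Counter")
mug = UsdGeom.Cylinder.Define(stage, "/World/Mug")

# Place the mug above the counter (the default cube spans -1..1, so its top is at z = 1).
mug.AddTranslateOp().Set(Gf.Vec3d(0.0, 0.0, 1.5))

stage.GetRootLayer().Save()  # writes kitchen_task.usda, loadable by any OpenUSD application
```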
Lastly, NVIDIA OSMO, a cloud-native managed service, lets customers orchestrate and scale complex robotics development workflows across distributed computing resources, whether on-premises or in the cloud.
OSMO simplifies robot training and the construction of simulation pipelines, cutting development and deployment cycles from months to under a week. Users can visualize and manage a range of tasks, including generating synthetic data, training models, running reinforcement learning, and testing at scale for humanoids, autonomous mobile robots, and industrial manipulators.
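In spirit, the workflow being orchestrated looks something like the sketch below. The stage functions are hypothetical placeholders for the synthetic-data, training, reinforcement-learning, and scale-testing jobs described above, not the OSMO API; in practice each stage would run as a separate job on distributed on-site or cloud compute.

```python
def generate_synthetic_data(num_episodes):
    """Stand-in for synthetic motion/scene generation (e.g. from teleoperated demonstrations)."""
    return [{"episode": i, "observations": [], "actions": []} for i in range(num_episodes)]

def train_model(dataset):
    """Stand-in for supervised pre-training on the synthetic dataset."""
    return {"policy": "base", "trained_on": len(dataset)}

def reinforcement_learning(model, iterations):
    """Stand-in for closed-loop fine-tuning in simulation."""
    model["rl_iterations"] = iterations
    return model

def test_at_scale(model, num_scenarios):
    """Stand-in for large-scale simulated evaluation before deployment."""
    return {"model": model, "scenarios": num_scenarios, "passed": True}

if __name__ == "__main__":
    data = generate_synthetic_data(num_episodes=10_000)
    model = train_model(data)
    model = reinforcement_learning(model, iterations=500)
    print(test_at_scale(model, num_scenarios=100_000))
```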
That said, how do you build a robot that can grasp objects without striking or dropping them? Nvidia Isaac Manipulator, built on a collection of foundation models, gives robotic arms state-of-the-art dexterity and modular AI capabilities. Early ecosystem partners include Yaskawa, Universal Robots (a Teradyne company), PickNik Robotics, Solomon, READY Robotics, and Franka Robotics.
And how do you teach a robot to “see”? Isaac Perceptor provides multi-camera, 3D surround-vision capabilities that autonomous mobile robots in manufacturing and fulfillment operations are increasingly adopting to improve productivity and worker safety while reducing error rates and costs. Early adopters such as ArcBest, BYD, and KION Group aim to reach new levels of autonomy in material handling and other operations.
The new Jetson Thor SoC, designed to run robots, includes a Blackwell-based GPU with a transformer engine delivering 800 teraflops of 8-bit floating point AI performance, enough to run multimodal generative AI models such as GR00T. Its functional safety processor, high-performance CPU cluster, and 100GB of Ethernet bandwidth significantly simplify design and integration efforts.
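For a rough sense of scale, here is a back-of-envelope estimate of what 800 teraflops of FP8 compute could mean for on-robot inference. It uses the common ~2 FLOPs-per-parameter-per-token rule of thumb and an assumed 8-billion-parameter model (an illustrative figure, not a GR00T spec), and it ignores memory bandwidth, utilization, and multimodal encoder costs.

```python
# Back-of-envelope only: theoretical peak throughput for token generation.
peak_flops = 800e12            # 800 teraflops of FP8 compute (Jetson Thor, per Nvidia)
model_params = 8e9             # assumed on-robot model size, purely illustrative
flops_per_token = 2 * model_params  # ~2 FLOPs per parameter per generated token (rule of thumb)

tokens_per_second = peak_flops / flops_per_token
print(f"theoretical peak: ~{tokens_per_second:,.0f} tokens/s")  # ~50,000 tokens/s
```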
In Conclusion
Just when you thought it was safe to get back into the water, da dum. Da dum. Da dum. Here come the robots. Jensen believes robots will need to take on human form because the factories and other places where they will work were designed for human operators. It is far more cost-effective to build humanoid robots than to remodel the factories and environments in which they will operate.
Even if it’s just your kitchen.