This Professor Has a Path Toward Human-Level AI. It Isn’t LLMs.


Language models powering bots like ChatGPT have no conception of the physical world. Will the next leap forward come from teaching AI sensory common sense?



Alex Kantrowitz

Aug 2


Deepak Pathak / CMU

In Deepak Pathak’s telling, nobody is building human-level artificial intelligence with language alone. You could train a large language model on billions of descriptions of gravity, but it would never conceptualize it, since it has never experienced the real world. Train it with every physics textbook on earth, and it still can’t visualize what happens when you drop a ball from your hand. With this natural limitation, it hallucinates.

Pathak, a Carnegie Mellon professor and former Meta AI researcher, suggests that today’s leading AI research labs are perhaps too focused on building artificial general intelligence — AI with human-level thought and dexterity — via raw data and compute. To get to ‘AGI,’ he says, the technology has to go pre-verbal, and real world. AI going straight to language is like giving answers to a test without teaching the course. The solutions might tell you something, but your learning is limited.

“The recipe of LLMs is data. Nothing else but data,” Pathak tells me in a video call this week. “Physicality, action, is the base framework for building intelligence.”

So Pathak is trying to help artificial intelligence take its next leap forward by teaching it how to understand the physical world. He’s working to build ‘sensory motor common sense,’ as he calls it, into AI models. The idea is for AI to go out in a natural environment, learn about it on its own, find its way around, and adapt to its surroundings. Think of it as the journey animals first took to language: sensing an environment, then finding ways to move within it. Only with that foundation mastered does it make sense to verbalize.


To build this common sense into AI, robotics is the natural path forward, but not via the predetermined movements that are most common today (the term ‘acting robotic’ is what Pathak wants to get away from). Pathak is instead dropping robots into totally new environments and giving them nearly limitless opportunities to figure out how to move around and adapt to their surroundings.

Using a form of AI called adaptive reinforcement learning, Pathak initially trains the robots in simulation. He gives them a goal to work toward, and allows them to learn from each failure until they get there on their own. After the robots learn how to move in simulation, their ‘brain’ is transposed into a physical machine and they engage the physical world, building a deeper understanding of how it works.

On his screen, Pathak pulls up a video of a dog-like robot he’s developed walking up a flight of stairs outside, each step matching the robot’s height. The robot deftly moves up the stairs, adapting in real time to different angles, obstacles, and surface consistencies. “These robots are not just running in the real world, they are continuously adapting,” says Pathak.

Then things get a little crazy. Pathak shows a video of a human opening a drawer, followed by a robot opening the same drawer. By watching how humans use their hands, Pathak’s robots are now able to learn how to do what we do, and do it themselves.

The training method is what Pathak and his colleagues call ‘WHIRL,’ or In-the-Wild Human Imitating Robot Learning. On screen, other robots open refrigerator doors, turn on faucets, close toasters, pick up trash, and even clean a whiteboard. Importantly, this intelligence is ‘general,’ meaning there’s no need to train new models from the ground up for each task. This is similar to humans, who can both open a door and play chess. No AI system is close to being able to operate at that level of generality yet, but perhaps this is a way forward.

Pathak and a Carnegie Mellon colleague, Abhinav Gupta, founded a company called Skild AI last year to build and commercialize a ‘general purpose brain’ for robots. Last month, the company announced a $300 million funding round from SoftBank, Jeff Bezos, and others. Pathak told me he wants to build a robot ‘foundational model’ where you could give a command and it would execute it, no matter the type of robot. It would be “a shared brain that can operate across all kinds of scenarios,” he said.

Imagine that type of ‘common sense’ intelligence combining with today’s large language models, and you could see AI finally being able to reason about the world we inhabit, predict more effectively, and hallucinate less. “The amount of reasoning you are doing physically, in your body, when you do tasks, is way bigger than what you're saying,” Pathak says. “That’s where the core of intelligence is.”


What Else I’m Reading, Etc.

Early preview of Apple Intelligence, now live in Beta [Washington Post]

Wharton professor Ethan Mollick speaks with the new AI bots [One Useful Thing]

Elon Musk says he’d accept the 2024 election results [The Atlantic]

Divisions at the Pod Save America mothership [Bloomberg]

How the U.S. brought Evan Gershkovich home [WSJ]

Italy’s flag bearer loses wedding ring in France’s Seine River [AP]

This Week on Big Technology Podcast: NVIDIA's Auto Play and the Future of Autonomous Driving — With Danny Shapiro

Danny Shapiro is the Vice President of Automotive at NVIDIA. He joins Big Technology to discuss the current state of autonomous driving technology, its future, and NVIDIA's role. Tune in to hear how NVIDIA is pushing the boundaries of AI and simulation to make self-driving cars safer and more reliable. We also cover the challenges of full autonomy, NVIDIA's broader play in the automotive ecosystem, and how generative AI is transforming the industry. Hit play for an insider's look at the cutting-edge technology shaping the future of transportation.

You can listen on Apple, Spotify, or wherever you get your podcasts.

Thanks again for reading. Please share Big Technology if you like it!


And hit that Like Button to train a robot to do it for you

My book Always Day One digs into the tech giants’ inner workings, focusing on automation and culture. I’d be thrilled if you’d give it a read. You can find it here.

Questions? News tips? Email me by responding to this email, or by writing alex@bigtechnology.com. Or find me on Signal at 516-695-8680.

