Top developers are shifting from chatbots to physical AI

5 minutes read

Some of the most ambitious AI researchers and investors are moving beyond chatbots toward “world models” — systems trained to understand space, time, and physical cause-and-effect. The shift is driven by a simple frustration: language models can write and expl

For computer scientist Louis Castricato, the breakthrough moment wasn’t an experiment that worked. It was the feeling, in his eighth year studying large language models, that he had reached a dead end.

“We basically have passed the point of doing real fundamental LLM research,” Castricato said. “Now it’s just applications.”

He quit his doctoral studies at Brown University and started a new company called Overworld. His pitch is blunt and ambitious: AI that can understand and navigate a world, not just words.

The money behind chatbots is still enormous. Investors are counting on AI assistants as they commit trillions of dollars to leading developers like Anthropic and OpenAI. But Castricato’s frustration mirrors a growing faction inside the industry: entrepreneurs and prominent scientists who want AI to move past conversation and toward “world models,” systems that teach AI systems, and sometimes robots, how to react in a physical environment.

Fei-Fei Li, often dubbed the “Godmother of AI,” is one of the best-known advocates for that shift. She described world models as “one of the most important and most overloaded terms in AI today.” Li is also the founder of the San Francisco startup World Labs, and in an essay published this month she laid out why she thinks the idea matters.

At the heart of world model research is the belief that AI can’t be truly intelligent if it can only read a book. It also needs to “read the room.” Li wrote: “Where language models learn the statistical structure of text, world models learn the statistical structure of space and time: how light falls on a surface, how a garden looks from an angle no camera has captured, how objects respond to force and follow the laws of physics.”

Another high-profile push is coming from Yann LeCun. He quit his job as Meta’s chief AI scientist last year to start Paris-based Advanced Machine Intelligence Labs. On a recent “Unsupervised Learning” podcast, LeCun said, “World model is quickly becoming a buzzword,” and framed it as something that enables an AI agent “to predict the consequences of its own actions.”

The problem is that “world model” can mean different things depending on the technology someone hopes to build — robots or more interactive video games, for example — and that disagreement shows up clearly in how proponents describe the field.

Carnegie Mellon dean Martial Hebert puts a fine point on why. Training on all of humanity’s books, news articles, and visual media has produced AI language models that are changing office work and parts of creative fields. But he says those models still hit hard limits when physical action is required.

Hebert argues the real issue is what happens when a task turns physical. “Chatbots can’t pick up a coffee mug,” Hebert said. “There’s all the geometry of the world, the dynamic of how I move my hand, the physical interaction of the contact with the cup.” He added, “This is much more complex than just predicting the next word in a sentence.”

For Hebert, who has spent more than four decades researching robotics, world models are most useful as a faster and cheaper path to “physical AI,” another buzzword in the industry.

“Some people may have different definitions, but physical and embodied AI are kind of the evolution of what we used to call robotics,” Hebert said in an interview. He also pointed to how some of the same advances behind chatbots could be applied to building AI with a broad enough awareness of its environment to work like a robot’s brain.

Hebert compared it to how living bodies adapt. “In your body and spinal cord you have a very general model of how to balance, how to walk around, and you can adapt to your knee hurting in the morning, so you now walk a little differently,” he said. “You don’t need to think about that. You have a general model somewhere in your nervous system and brain that allows your body to adapt very quickly.”

That insistence on adaptation — not just generation — is also showing up in where the money is looking.

Castricato’s Overworld, started last year, is betting on simulated worlds as an early proving ground. The tiny Rhode Island-based startup is building video game worlds where a scene, such as “a spooky forest,” can adapt as a virtual character moves through it and interacts with objects.

“There’s no other world model where you can just walk through doors or where you can interact with a detailed environment like this,” Castricato said in an interview. “We optimize for interaction above anything else.”

In the near term, world model makers may not have the same immediate visibility as AI coding tools. Even so, investors are showing up.

Venture capitalists including Steve Jang, co-founder and managing partner at Kindred Ventures, are backing world model-focused companies. Kindred Ventures is investing in Overworld and other companies, including Causal Labs, which is building AI models for weather prediction, and Extropic, which is building specialized computer chips suited to world models.

Jang said, “I think that the future is many different types of models with many different philosophies and architectures.” He added, “I don’t think that it’ll be one large, dense model to rule them all.”

Li, meanwhile, is trying to reduce confusion by building a taxonomy for “world models” in her recent essay. She wrote that a “video model that produces gorgeous but physically impossible flames, a language model improvising a playable game, and a physics engine that faithfully simulates combustion all go by the same name.”

Li divided world models into three categories: “renderers” that prioritize visual fidelity but can’t be trusted to teach robots much; “simulators” that create virtual training grounds that faithfully represent the physical structure of a world; and “planners” that try to predict what an AI agent or robot should do in an unstructured world.

“A robot that can plan is a robot that can work, and the entire industry is racing to be the one that gets there first,” Li wrote.

Right now, the race isn’t about replacing chatbots overnight. It’s about answering whether today’s systems that excel at language can be extended — or reimagined — to predict consequences, model space and time, and operate when the world pushes back.

Sarah Walker 1 hour ago
