Top AI Researchers Say Language Is Limiting

While Big Tech, Anthropic, and OpenAI spend billions building cutting-edge large language models, a small group of AI researchers is working on what they see as the next big thing.

Yann LeCun, Meta's chief AI scientist, and Fei-Fei Li, the Stanford professor who created ImageNet, are among the computer scientists building what they call “world models.”

Large language models determine their outputs from statistical relationships between the words and phrases in their training data. World models, by contrast, predict events based on the kind of mental constructs that humans form of the world around them.

Li said on a recent episode of Andreessen Horowitz’s a16z podcast that “language doesn’t exist in nature.” “Humans,” she said, “not only do we survive, live, and work, but we build civilization beyond language.”

In his 1971 paper “Counterintuitive Behavior of Social Systems,” Jay Wright Forrester, a computer scientist and MIT professor, explained why mental models matter for understanding human behavior.

Everyone uses models constantly, Forrester argued. In private life and in business, people instinctively use models to make decisions. A model is simply the mental image a person holds of their surroundings. Nobody's head contains actual families, companies, cities, governments, or nations. Instead, a few selected concepts and relationships stand in for the real systems. Every decision is made on the basis of models. Every law is passed on the basis of models. Every executive action is taken on the basis of models. The question is not whether to use or ignore models. The only question is which of several alternative models to choose.

If artificial intelligence (AI) is to match or surpass human intelligence, the researchers behind this approach argue, it must also be able to build mental models.

Li has been pursuing this through World Labs, which she cofounded in 2024 with $230 million in initial funding from venture capital firms Radical Ventures, New Enterprise Associates, and Andreessen Horowitz. According to World Labs’ website, its goal is to lift AI models from the 2D plane of pixels to full 3D worlds, both virtual and real, giving them spatial intelligence as rich as our own.

On the No Priors podcast, Li defined spatial intelligence as “the ability to understand, reason, interact, and generate 3D worlds,” given that the world is fundamentally three-dimensional.

World models, according to Li, have applications in robotics, creative industries, and any other field that calls for limitless virtual worlds. As at Meta, Anduril, and other Silicon Valley heavyweights, that could also mean advances in military applications, helping those on the battlefield better perceive their surroundings and anticipate their adversaries' next move.

The difficulty in building world models is a shortage of data. Humans have refined and documented language over centuries, but spatial intelligence is less developed.

On the No Priors podcast, Li noted that it is not easy to close your eyes and draw or build a 3D model of your surroundings. Without training, she said, people are not very good at generating extremely complicated models. To collect the data these models need, “we require more and more sophisticated data engineering, data acquisition, data processing, and data synthesis,” she said.

That makes the challenge of building a believable world even greater.

At Meta, chief AI scientist Yann LeCun has a small team working on a similar project. The team uses video data to train models and run simulations that abstract the videos at different levels.

The basic idea, LeCun explained at the AI Action Summit in Paris earlier this year, is that you do not predict at the pixel level. Instead, you train a system to run an abstract representation of the video so that you can make predictions in that abstract representation, and ideally this representation eliminates all the details that cannot be predicted. That yields a simpler set of building blocks for mapping out how the world will change from one moment to the next.
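As a rough illustration of that idea, the toy Python sketch below compares a prediction made in an abstract representation with one made at the pixel level. It is not Meta's method: the "video" is a one-dimensional strip with a moving bright blob plus random noise, the encoder is hand-written rather than learned, and every name in it (make_frame, encode, predict_latent, render, run_demo) is a hypothetical chosen for this example.

```python
import random

WIDTH = 32   # pixels per 1-D "frame"
NOISE = 0.2  # upper bound of the unpredictable per-pixel noise


def make_frame(position, rng):
    """Build a frame: a 3-pixel bright blob at `position` plus random noise."""
    frame = [rng.uniform(0.0, NOISE) for _ in range(WIDTH)]
    for offset in (-1, 0, 1):
        frame[position + offset] += 1.0
    return frame


def encode(frame):
    """Hand-written stand-in for a learned encoder.

    Keeps only the blob's position (mean index of the bright pixels)
    and discards the noise, which cannot be predicted anyway.
    """
    bright = [i for i, value in enumerate(frame) if value > 0.5]
    return sum(bright) / len(bright)


def predict_latent(z_prev, z_curr):
    """Predict the next abstract state by assuming constant velocity."""
    return z_curr + (z_curr - z_prev)


def render(position):
    """Best-effort pixel guess: blob at `position`, noise at its average."""
    frame = [NOISE / 2] * WIDTH
    for offset in (-1, 0, 1):
        frame[int(round(position)) + offset] += 1.0
    return frame


def run_demo(seed=0):
    rng = random.Random(seed)
    # The blob moves two pixels per frame.
    f0, f1, f2 = (make_frame(p, rng) for p in (5, 7, 9))

    # Prediction in the abstract representation.
    z_pred = predict_latent(encode(f0), encode(f1))
    latent_error = abs(z_pred - encode(f2))

    # The same prediction scored pixel by pixel against the real frame.
    guess = render(z_pred)
    pixel_error = sum(abs(g - a) for g, a in zip(guess, f2)) / WIDTH
    return latent_error, pixel_error


if __name__ == "__main__":
    latent_error, pixel_error = run_demo()
    print("error in abstract representation:", latent_error)
    print("mean error per pixel:", pixel_error)
```

In this toy setup the abstract prediction lands exactly on the blob's next position, while the pixel-level comparison always carries some error, because no predictor can know the noise in advance. A real system would have to learn its encoder and predictor from video rather than rely on a hand-picked rule.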

Like Li, LeCun believes these models are the only path to highly intelligent AI. Speaking recently at the National University of Singapore, he said we need AI systems that can learn new tasks very quickly. They must understand the physical world, not just text and language, have some level of common sense, be able to reason and plan, have persistent memory, and possess all the other qualities we expect of intelligent beings.
