In Silicon Valley, it’s hard to travel far without running into an “AI for X” startup. AI for business technology. AI for healthcare. AI for dating. And so on.
Some of these startups are, undeniably, just marketing puffery. Most of the rest, though, are merely licensing large AI systems from well-funded startups and tech giants, such as Google’s Bard, Anthropic’s Claude, and OpenAI’s ChatGPT, and applying them to whatever human endeavor their founders believe hasn’t yet received enough AI attention.
The sudden ubiquity of these startups and services suggests that the AIs they rely on are ready for prime time. In many ways, they are not. Not yet, at least. The good news for AI enthusiasts, anyway, is that the systems underlying all of this hype are improving quickly, which means today’s buzz may soon give way to tomorrow’s reality.
Fully understanding all of this will take some intellectual exploration: why AIs aren’t ready for prime time, how they are improving, and what that can tell us about where we’re heading.
It helps to first understand how these AIs work, starting with two terms: “generative AI” and “foundation models.” Generative AI refers to the current generation of AIs that are generating so much excitement because they do things that, until a few years ago, it seemed only humans could do. They are built on foundation models: massive systems trained on massive datasets, often terabytes of data representing a large share of the publicly accessible internet.
Generative AIs are the systems that can create startlingly realistic images, synthetic voices that sound exactly like the humans they mimic, and uncannily humanlike responses to written prompts.
The best way to grasp where these technologies might take us, and why predictions about them are only partly useful, is to compare them with other revolutionary technologies in their early stages. Consider the steam engine. When the early eighteenth-century inventors Thomas Savery and Thomas Newcomen built crude steam-powered pumps to clear water from mines, no one could have predicted that those pumps would one day evolve into the highly efficient steam turbines needed to generate electricity. (For one thing, electricity had yet to be harnessed.)
George Musser, the author of a forthcoming book on how scientists are developing new ways to investigate the nature of human and machine intelligence, notes that the first steam engines were products of trial and error rather than of any deep grasp of the science of thermodynamics.
The history of technology has followed this pattern again and again: first came the thing (in this case, the steam engine), and only afterward did we understand it. That understanding, which we call thermodynamics, grew into one of the most broadly applicable fields of physics.
It is happening again. Today’s AIs are products of experimentation and intuition, and how they work remains largely unknown, says Musser, an almost perfect echo of that history. But like the first steam engines, today’s generative AIs carry within them the seeds of countless future applications. The key to unlocking those applications will be something we are still acquiring: an understanding of the inner workings of generative AI and foundation models.
To get there, engineers, physicists, computer scientists, mathematicians, and neuroscientists are working together to establish a new field of study: a universal science of machine intelligence. And as they build it, we are learning useful things about what AIs may be able to do in the future.
Some researchers believe, for example, that one type of foundation model is already capable of something that is, in effect, reasoning.
This is where a third term comes in: large language model. A large language model is one kind of generative AI, built on foundation models trained only on text. (The new chatbots from Meta, Google’s Bard, and OpenAI’s ChatGPT are a few examples.)
Debate continues over whether large language models have progressed beyond simply memorizing and reciting information to genuinely synthesizing it anew, that is, reasoning about it.
Today’s large language models can handle challenging tasks when given enough context, says Blaise Aguera y Arcas, an AI researcher at Google Research, and that demonstrates an ability to reason. For instance, with appropriate coaxing, a large language model can correctly answer simple mathematical questions, such as the product of two four-digit numbers, even though that exact product appears nowhere in its training data.
“There is no other way to get that right—figuring that out means having to have learned what the multiplication algorithm actually is,” says Aguera y Arcas.
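To make that concrete, here is a minimal sketch of the kind of probe Aguera y Arcas describes, written in Python against OpenAI’s chat API. The tooling and model name are my assumptions for illustration; he didn’t specify either. The product of two random four-digit numbers is unlikely to appear verbatim in any training corpus, so a correct answer is at least suggestive of something beyond recall:

```python
# Rough probe, not a rigorous benchmark: ask a chat model to multiply
# two random four-digit numbers and check the result ourselves.
# Assumes the `openai` Python package (v1+) and an OPENAI_API_KEY
# environment variable; the model name is illustrative.
import random
from openai import OpenAI

client = OpenAI()

a, b = random.randint(1000, 9999), random.randint(1000, 9999)
prompt = (f"Compute {a} * {b}. Think step by step, "
          "then give only the final number on the last line.")

resp = client.chat.completions.create(
    model="gpt-4o",  # illustrative; any capable chat model
    messages=[{"role": "user", "content": prompt}],
)

# Keep only the digits of the model's last line before comparing.
last_line = resp.choices[0].message.content.strip().splitlines()[-1]
digits = "".join(ch for ch in last_line if ch.isdigit())
print(f"{a} x {b}: model said {digits or '?'}, truth is {a * b}, "
      f"{'correct' if digits == str(a * b) else 'wrong'}")
```

One trial proves little, of course; the interesting signal comes from running many random instances and watching how accuracy changes as the numbers get longer.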
Some researchers believe Aguera y Arcas overstates the degree of reasoning today’s large language models are capable of. Some of what people take to be reasoning may simply be recall of things the models have memorized, says Sara Hooker, director of Cohere for AI, the nonprofit research arm of the AI company Cohere. That could help explain why these models grow more capable as they grow larger: they can memorize more, not because learning a language makes them better at reasoning.
A large part of the mystery is that we simply don’t know what’s in the pretraining data, Hooker says. For one thing, many AI firms have stopped disclosing the contents of their pretraining data. For another, the sheer size of these datasets (imagine all the text accessible on the open web) makes it hard to determine whether an answer an AI gives is novel or merely something that already exists somewhere in that vast pile of data.
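Hooker’s contamination worry can be illustrated with a toy check: compare a model’s output against a reference corpus at the n-gram level and flag heavy overlap. This is only the idea in miniature, under my own simplifying assumptions; real audits run over terabytes with specialized indexes, and the window size below is arbitrary:

```python
# Toy data-contamination check: if most n-grams of a model's "reasoned"
# answer already occur in the training corpus, memorization is the
# simpler explanation. Real pretraining sets are terabytes; this is
# only a sketch of the idea.
def ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(max(len(tokens) - n + 1, 0))}

def overlap_ratio(answer: str, corpus: str, n: int = 8) -> float:
    """Fraction of the answer's n-grams also present in the corpus.
    Near 1.0 suggests recall; near 0.0 is consistent with novel text."""
    a = ngrams(answer, n)
    return len(a & ngrams(corpus, n)) / max(len(a), 1)

corpus = "in 1492 columbus sailed the ocean blue and reached the americas"
answer = "columbus sailed the ocean blue and reached the americas in 1492"
print(overlap_ratio(answer, corpus, n=4))  # 0.75: mostly recycled text
```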
In any event, says Sayash Kapoor, a third-year Ph.D. student at Princeton who studies and writes about the limitations of modern AIs, there is now ample evidence that these large language models are capable of some form of reasoning, however basic by human standards. But, he adds, there is also evidence that memorization in these models frequently leads to performance claims that may be exaggerated.
Here’s the payoff if you’ve made it this far: if today’s large language models can reason in any way, however basic, then the capabilities of generative AIs could advance rapidly over the next several years.
That’s partly because language, unlike images or sounds, is more than a means of communication. It is a technology humans developed to describe everything in the world and the relationships between things. Language lets us build models of the world even in the absence of other senses such as vision or hearing, says Aguera y Arcas. That is why, he says, a large language model can write fluently about the relationship between two colors despite never having “seen” either one.
Language is also the interface to a plethora of other online systems, such as search engines, that were built with human users in mind but that these generative AIs can operate as well.
Add up all these observations about large language models and you get, for instance, the possibility of AI-based personal assistants tailored entirely to our personal data. If your emails, calendar entries, and documents already live in Google’s systems, the company can search across them and synthesize what it finds. Google is already attempting this, though the first version is rudimentary and prone to inaccuracy.
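For a sense of what the search half of such an assistant involves, here is a toy retrieval step in Python. TF-IDF stands in for the learned embeddings production systems actually use, and the documents are invented; the sketch assumes scikit-learn is installed:

```python
# Toy retrieval over "personal" documents: rank them against a question,
# then hand the best matches to a language model for synthesis.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Calendar: dentist appointment Tuesday at 3pm",
    "Email from Dana: the quarterly report is due Friday",
    "Note to self: renew passport before the June trip",
]

def top_matches(question: str, docs: list[str], k: int = 2) -> list[str]:
    matrix = TfidfVectorizer().fit_transform(docs + [question])
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    return [docs[i] for i in scores.argsort()[::-1][:k]]

print(top_matches("When is the report due?", documents))
# The matched snippets would then go into the model's prompt.
```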
In the not-too-distant future, Aguera y Arcas says, such systems may get much better at adapting to our personal data, much the way humans constantly form new memories. Within, say, two to five years, that could make AI assistants far better at tailoring their responses to each of us.
Aguera y Arcas says that while he can’t comment on any upcoming Google products, hyper-personalized AI assistants are “a very obvious implication” of the current state of AI technology.
Another implication is that future AIs will be able to acquire new skills in a manner akin to how humans do: by being given access to the cloud-based software, built for human use, that provides those services.
The most basic example is giving chat-based AIs access to search engines like Google. And the internet contains far more search engines than Google’s: there are repositories of academic papers, code, court cases, and much more.
One way generative AIs are being wired into services built primarily for human use is through “plug-ins.” ChatGPT plug-ins, for instance, give it access to the travel-search services Kayak and Expedia as well as the shopping services Instacart and Shop.
Large language models need these plug-ins for several reasons: although they are trained on enormous amounts of data, they lack access to information that can’t be scraped from the internet; they are only as current as the corpus they were last trained on; and even with all of that data inside them, they can still struggle with certain kinds of reasoning, such as mathematics.
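Here is a minimal sketch of the plug-in pattern in Python. Everything in it is invented for illustration; real plug-in protocols, such as ChatGPT’s, define their own request schemas, and the toy tools stand in for services like a travel-search API:

```python
# The host program watches the model's output for a structured tool
# request, runs the tool, and splices the result into the next prompt.
# The CALL: convention and both tools are invented for illustration.
import json
from datetime import date

def search_flights(origin: str, destination: str) -> str:
    # Stand-in for a real travel-search API.
    return json.dumps([{"from": origin, "to": destination, "price": 312}])

def todays_date() -> str:
    # Fresh information a model's frozen training data cannot contain.
    return date.today().isoformat()

TOOLS = {"search_flights": search_flights, "todays_date": todays_date}

def run_tool_request(model_output: str) -> str:
    """Execute e.g. CALL:search_flights{"origin": "SFO", "destination": "JFK"}
    and return the result for the model's next turn."""
    name, _, raw_args = model_output.removeprefix("CALL:").partition("{")
    args = json.loads("{" + raw_args) if raw_args else {}
    return TOOLS[name](**args)

print(run_tool_request('CALL:search_flights{"origin": "SFO", "destination": "JFK"}'))
print(run_tool_request("CALL:todays_date{}"))
```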
The true promise of future “AI for X” services and startups emerges when large language models are given access to the same kinds of resources people already use. Rather than merely offering a rebranded, licensed version of an established foundation model, these startups can begin integrating a wide range of additional data and services. “AI for legal advice” could incorporate databases of court rulings, for example, while “AI for diagnoses” could draw on databases of medical literature. Such systems would use the primitive reasoning ability of large language models to deliver answers far more reliable than the frequently wrong and made-up responses today’s chatbots produce.
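The final step, grounding the model’s answer in whatever the domain database returns, can be as simple as prompt construction. The wording and citation convention below are illustrative, not any particular vendor’s format:

```python
# Build a prompt that confines the model to retrieved source passages,
# the basic move behind grounded "AI for X" services.
def build_grounded_prompt(question: str, passages: list[str]) -> str:
    sources = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using ONLY the numbered sources below, citing them "
        "like [1]. If they don't contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )

print(build_grounded_prompt(
    "How long do I have to sue over a written contract?",
    ["Civil Code sec. 337: actions on written contracts must be "
     "brought within four years."],
))
```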
Predicting what the world will look like when we all have these new kinds of cognitive aids is as hard as it would have been for those early steam-engine builders to foresee cars, trains, jet planes, and rockets.
And plenty of obstacles still stand between us and that idealized future of plain-language AI assistants wielding the internet’s powers on our behalf. One is the cost of running today’s generative AIs, which must fall before hundreds of millions of us can hold ongoing conversations with future AI assistants, rather than just early adopters asking the occasional one-off question.
Another obstacle, says Douwe Kiela, CEO of Contextual AI, is that even near-future systems that combine large language models with specialized subsystems to make them better at particular tasks are akin to “Frankenstein’s monsters.” Solving the cost problems of such cobbled-together systems, and making them genuinely useful, may take years of continual improvement, as engineers optimize each component to work as part of a whole and strip out the parts that don’t benefit the customer.
More than a century passed between the invention of the steam engine and the introduction of the locomotive. Along the way, a new field of science emerged that eventually catalyzed many more of the breakthroughs crucial to the Industrial Revolution. If that pattern is any guide to the development of generative AIs, the near future of this field will involve revolutionary inventions (AIs that are true personal assistants, or experts in a range of subjects), years of refinement, frantic attempts to harness and profit from these new technologies, and perhaps even a second Industrial Revolution, one built on the manipulation of data and insight rather than of energy and matter.
What that might entail is limited only by our imaginations.