Anyone who studied for college exams understands that having a remarkable memory for facts is not the same as having critical thinking skills.
Large language models (LLMs), which captured public attention in 2022, were remarkable but constrained, much like talented children who perform well on multiple-choice tests but falter when asked to defend their reasoning. Today's sophisticated reasoning models behave more like seasoned graduate students: they work through problems deliberately and methodically, navigating ambiguity and retracing their steps when needed.
As AI systems that learn by imitating the workings of the human brain continue to progress, models are evolving from rote memorization toward genuine reasoning. This capacity opens a new chapter in the development of AI and the benefits it offers businesses. To realize that immense potential, however, organizations must ensure they have the computational resources and infrastructure to support the technology as it matures.
The reasoning revolution
According to Microsoft partner AI/HPC architect Prabhat Ram, reasoning models are fundamentally different from previous LLMs: they can investigate multiple hypotheses, determine whether their responses are consistently accurate, and adjust their strategy accordingly. Drawing on the training data they have been exposed to, they effectively construct an internal representation of a decision tree and explore which potential solution is best.
There are trade-offs associated with this flexible approach to problem-solving. In the past, LLMs used probabilistic analysis and statistical pattern matching to produce results in milliseconds. For many applications, this was—and still is—efficient, but it doesn’t give the AI enough time to carefully consider all of the possible solutions.
More recent models enable the AI to use more complex internal reinforcement learning by extending the calculation time during inference to seconds, minutes, or even longer. This makes it possible to solve problems in multiple steps and make more complex decisions.
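The idea of spending extra inference-time compute to explore a decision tree can be illustrated with a toy best-first search. Everything here is a hypothetical stand-in: `expand` plays the role of a model proposing next reasoning steps, and `score` plays the role of a learned verifier. Real reasoning models are trained end to end, not hand-coded like this; the sketch only shows why a larger compute budget lets more branches be explored.

```python
def expand(partial_solution):
    # Hypothetical stand-in for a model proposing next reasoning steps.
    # Toy task: build a sum that hits a target using digits 1-9.
    return [partial_solution + [d] for d in range(1, 10)]

def score(partial_solution, target):
    # Hypothetical verifier: the closer the running sum is to the
    # target, the better the partial solution looks.
    return -abs(target - sum(partial_solution))

def reason(target, compute_budget, beam_width=3):
    """Best-first search over a decision tree: a larger compute_budget
    (more inference time) lets the search explore more branches before
    committing to an answer."""
    frontier = [(0, [])]
    best = []
    for _ in range(compute_budget):
        candidates = []
        for _, partial in frontier:
            for child in expand(partial):
                candidates.append((score(child, target), child))
        # Keep only the most promising branches (the "beam").
        candidates.sort(key=lambda c: c[0], reverse=True)
        frontier = candidates[:beam_width]
        if frontier and score(frontier[0][1], target) > score(best, target):
            best = frontier[0][1]
        if sum(best) == target:
            break  # verifier is satisfied; stop spending compute
    return best

# With enough budget the search finds an exact decomposition of 23.
print(reason(target=23, compute_budget=8))
```

A model given only one step of budget would be stuck with its first greedy guess; the extra iterations are the toy analogue of "seconds, minutes, or even longer" of deliberation.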
Ram points to a NASA rover sent to investigate the surface of Mars as an example of future applications for AI with reasoning capabilities. “Choices must be made constantly regarding which course to follow, what to investigate, and the trade-off between risk and reward. Can the AI determine whether I’m about to jump off a cliff? Or, if I have limited time and money to study this rock, is this the one that is more scientifically valuable?” Making such evaluations autonomously could lead to scientific breakthroughs at a pace and scale never before possible.
The ability to reason is also a significant development for agentic AI systems: autonomous programs that carry out tasks on a user's behalf, such as making appointments or planning trips. Whether you’re asking AI to pick up a rock, fold a towel, make a reservation, or summarize a piece of literature, it must first comprehend its surroundings (what we call perception) and then proceed to the planning and decision-making stage, says Ram.
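The perceive, plan, and decide-then-act sequence Ram describes can be sketched as a minimal agent loop. The appointment-booking environment, the slot times, and the helper names below are all illustrative inventions, not any real agent framework.

```python
def perceive(environment):
    """Perception: extract the facts relevant to the task."""
    return {"free_slots": environment["calendar"],
            "preference": environment["preference"]}

def plan(state):
    """Planning: rank candidate actions against the user's goal,
    here preferring the slot closest to the requested hour."""
    ranked = sorted(state["free_slots"],
                    key=lambda slot: abs(slot - state["preference"]))
    return ranked[0] if ranked else None

def act(environment, slot):
    """Decision-making and action: commit to the chosen slot."""
    environment["booked"] = slot
    return f"Booked appointment at {slot}:00"

# Toy environment: free calendar slots and a preferred hour.
environment = {"calendar": [9, 11, 15], "preference": 10, "booked": None}
state = perceive(environment)
choice = plan(state)
print(act(environment, choice))
```

The same loop structure applies whether the "action" is booking a slot or steering a rover; only the perception and planning components change.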
Reasoning AI systems have a wide range of enterprise applications
In the healthcare industry, for example, reasoning AI systems could analyze patient data, medical literature, and treatment protocols to support diagnostic or treatment decisions. In scientific research, reasoning models could develop hypotheses, design experimental protocols, and interpret complex results, potentially speeding up discoveries in fields ranging from pharmaceuticals to materials science. In financial analysis, reasoning AI could assist in assessing investment opportunities or market expansion strategies, as well as creating risk profiles or economic forecasts.
Armed with these insights, along with their own experience and emotional intelligence, human financial analysts, researchers, and physicians could make better decisions, faster. However, regulations and governance structures must be solidified before such systems are released to the public, especially in high-stakes settings like autonomous cars or healthcare.
According to Ram, a self-driving car must make decisions in real time: whether to turn the steering wheel left or right, whether to brake or accelerate, all while avoiding pedestrians and collisions. Going forward, reasoning models will need to work through such problems and arrive at an “optimal” decision.
The framework supporting artificial intelligence reasoning
To function at their best, reasoning models require far more processing power for inference, which poses unique scalability challenges. Because inference times can vary greatly, from a few seconds to several minutes, balancing load across these many jobs is difficult.
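One standard way to handle jobs with widely varying durations is least-loaded (greedy) scheduling: send each incoming job to the worker expected to free up soonest. This is a generic scheduling sketch, not a description of how Azure actually balances inference load; the job durations are simulated.

```python
import heapq
import random

def dispatch(jobs, n_workers):
    """Least-loaded dispatch: each job goes to the worker that will
    free up soonest. When job durations vary wildly, as with reasoning
    inference, this keeps the finish time (makespan) far closer to
    optimal than blind round-robin assignment."""
    workers = [0.0] * n_workers              # accumulated busy time per worker
    heap = [(0.0, w) for w in range(n_workers)]
    for duration in jobs:
        busy, w = heapq.heappop(heap)        # worker that frees up first
        workers[w] = busy + duration
        heapq.heappush(heap, (workers[w], w))
    return max(workers)                      # when the last worker finishes

random.seed(0)
# Simulated inference times from a few seconds to several minutes.
jobs = [random.uniform(2, 300) for _ in range(40)]
print(f"makespan with 8 workers: {dispatch(jobs, 8):.0f}s")
```

Real schedulers must also cope with the fact that a reasoning job's duration is unknown when it arrives, which is part of why this balancing act is hard in practice.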
Overcoming these challenges requires close collaboration between infrastructure providers and hardware makers, says Ram, citing Microsoft’s partnership with NVIDIA, which brings its accelerated computing platform to Microsoft products such as Azure AI.
“We really have to think about the entire system as a whole when we think about Azure and when we think about deploying systems for AI training and inference,” Ram says. “What will you do differently in the data center? What will you do across multiple data centers? How will you link them together?” These considerations extend into reliability challenges at every scale: memory errors at the silicon level, transmission errors within and between servers, thermal anomalies, and even data-center-level problems like power fluctuations, all of which demand sophisticated monitoring and rapid-response systems.
Through their partnership, Microsoft and NVIDIA have developed a comprehensive system architecture that can adapt to changing AI requirements, enabling businesses to leverage the potential of reasoning models without wrestling with the underlying complexity. Beyond improving performance, such partnerships help businesses keep pace with a rapidly evolving technology landscape. “Velocity is a unique challenge in this space,” Ram says. “There is a new base model every three months.” The hardware is likewise developing quickly: over the past four years, every generation of NVIDIA GPUs has been deployed, up to and including the NVIDIA GB200 NVL72. To lead the field, Microsoft and NVIDIA work closely together on hardware engineering roadmaps, timetables, and designs, as well as qualification and validation suites, production challenges, and more.
Bringing reasoning-capable AI to a wider spectrum of businesses depends on advances in AI infrastructure built specifically for reasoning and agentic models. Without robust, accessible infrastructure, the benefits of reasoning models will continue to accrue only to companies with large computational resources.
Even bigger benefits are anticipated as reasoning-capable AI systems and their supporting infrastructure advance. Ram believes the frontier extends beyond business applications to scientific discoveries and innovations that advance humankind. This evolution, he says, will culminate when agentic systems can drive scientific inquiry and generate novel hypotheses worthy of a Nobel Prize.