Microsoft’s newest AI chip, Maia 200, is three times more powerful

Microsoft has announced its new Maia 200 AI accelerator processor, which is three times more powerful than rival gear from Google and Amazon, according to company spokespeople.

This new chip will be used for AI inference rather than training, powering systems and agents that make predictions, respond to questions, and generate outputs based on new data input into them.

Maia 200 chips are already being deployed in Microsoft’s core data center region in the United States, where they will be used to produce synthetic data and in reinforcement learning to develop next-generation large language models (LLMs). The AI accelerator will also power Microsoft Foundry and 365 Copilot AI, as well as the infrastructure available through the company’s Azure cloud platform.

In a blog post, Scott Guthrie, executive vice president of cloud and AI at Microsoft, stated that the new processor delivers more than 10 petaflops (a petaflop is 10¹⁵ floating point operations per second). This is a standard measure of supercomputing performance; the world’s most powerful supercomputers exceed 1,000 petaflops.

The new processor attained this level of performance in a data format known as “4-bit precision (FP4)”—a highly compressed number format meant to accelerate AI workloads. Maia 200 also provides 5 petaflops of performance at 8-bit precision (FP8). FP4 is more energy efficient, but less precise. “In practical terms, one Maia 200 node can easily run today’s largest models, with headroom for even larger models in the future,” Guthrie wrote in his blog post. “This means Maia 200 delivers 3 times the FP4 performance of the third-generation Amazon Trainium, and FP8 performance above Google’s seventh-generation TPU.”
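To see why 4-bit precision is compact but coarse, consider that a 4-bit float can represent only 16 distinct values, so every model weight must be rounded to the nearest one. The sketch below uses the commonly cited E2M1 value grid (2 exponent bits, 1 mantissa bit) purely as an illustration; it is an assumption for demonstration, not Maia 200’s actual FP4 implementation.

```python
# Illustrative only: round weights to a 4-bit floating-point (FP4) grid.
# The E2M1 grid used here is a common FP4 layout, assumed for this sketch;
# it is not Microsoft's documented Maia 200 format.

FP4_POSITIVE = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_VALUES = sorted(FP4_POSITIVE + [-v for v in FP4_POSITIVE if v > 0])

def quantize_fp4(x: float) -> float:
    """Snap x to the nearest FP4-representable value."""
    return min(FP4_VALUES, key=lambda v: abs(v - x))

weights = [0.37, -1.8, 2.6, 5.1]
quantized = [quantize_fp4(w) for w in weights]
print(quantized)  # every weight collapses onto one of only 15 grid points
```

Each original weight lands on a nearby grid value (0.37 becomes 0.5, for instance), which is why FP4 halves memory and bandwidth versus FP8 at the cost of precision.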

Chips ahoy

Maia 200 could eventually be used for specialized AI workloads, such as running larger LLMs. So far, Microsoft’s Maia processors have only been deployed in the Azure cloud infrastructure to power large-scale workloads for Microsoft’s own AI services, most notably Copilot. However, Guthrie stated that there would be “wider customer availability in the future,” implying that other businesses might access Maia 200 via the Azure cloud, or the chips could one day be placed in standalone data centers or server stacks.

Guthrie stated that Maia 200 delivers 30% greater performance per dollar than existing systems, thanks to the 3-nanometer process developed by Taiwan Semiconductor Manufacturing Company (TSMC), the world’s largest chip fabricator, which allows for more than 100 billion transistors per chip. This means that Maia 200 may be more cost-effective and efficient for the most demanding AI tasks than current chips.

In addition to improved performance and economy, the Maia 200 offers a few other benefits. Its memory system, for example, can keep an AI model’s weights and data local to the chip, reducing the power needed to run the model. It’s also designed to integrate easily into existing data centers.

Maia 200 should allow AI models to run more quickly and efficiently. This means that Azure OpenAI customers, such as scientists, developers, and organizations, may see higher throughput and faster speeds when developing AI applications and using tools like GPT-4 in their operations.

Because Maia 200 is built for data centers rather than consumer-grade hardware, it is unlikely to disrupt most people’s daily use of AI and chatbots in the near future. However, end users may notice its impact in the form of faster response times and possibly more advanced features in Copilot and other AI technologies incorporated into Windows and Microsoft products.

Maia 200 may also deliver a performance boost to developers and scientists who run AI inference on Microsoft platforms. This could lead to better AI deployment in large-scale research initiatives, such as advanced weather modeling and simulations of biological or chemical systems.