Microsoft unveils a custom AI chip that could compete with Nvidia

At its Ignite conference on Wednesday in Seattle, Microsoft revealed two chips.

The first, the Maia 100 artificial intelligence chip, could compete with Nvidia’s highly sought-after AI graphics processing units. The second, the Cobalt 100 Arm chip, is aimed at general computing workloads and could compete with Intel processors.

Cash-rich technology companies have begun giving their customers more options for the cloud infrastructure they use to run applications. Alibaba, Amazon and Google have done so for years. Microsoft, which had over $144 billion in cash at the end of October, held a 21.5% share of the cloud market in 2022, second only to Amazon, according to estimates.

In 2024, Cobalt-powered virtual machine instances will be made available for purchase through Microsoft’s Azure cloud, according to corporate vice president Rani Borkar, who spoke with CNBC. She didn’t say when the Maia 100 would be released.

In 2016, Google revealed its first AI tensor processing unit. In 2018, Amazon Web Services unveiled the Inferentia AI processor and the Graviton Arm-based CPU. In 2020, AWS announced Trainium, a chip for training models.

When GPUs are in short supply, special AI chips from cloud providers may help meet demand. However, unlike Nvidia or AMD, Microsoft and its cloud computing peers do not intend to let businesses buy servers containing their chips.

According to Borkar, the company developed its AI chip based on input from customers.

Borkar said Microsoft is testing how well Maia 100 handles the needs of the AI chatbot for its Bing search engine (now called Copilot instead of Bing Chat), the GitHub Copilot coding assistant, and GPT-3.5-Turbo, a large language model from Microsoft-backed OpenAI. OpenAI has fed its language models large amounts of internet data, enabling them to write email messages, summarise documents, and answer questions with just a few words of human instruction.

The GPT-3.5-Turbo model works in OpenAI’s ChatGPT assistant, which became available last year. Companies then moved quickly to add similar chat features to their software, which increased demand for GPUs.

“We’ve been working across the board and [with] all of our different suppliers to help improve our supply position and support many of our customers and the demand that they’ve put in front of us,” Nvidia’s finance chief, Colette Kress, said in September at an Evercore conference in New York.

In the past, OpenAI has trained models on Nvidia GPUs in Azure.

In addition to designing the Maia chip, Microsoft has created custom liquid-cooled hardware called Sidekicks, which fit in racks directly next to racks holding Maia servers. The company can install the server racks and the Sidekick racks without retrofitting, a spokesperson said.

Making the most of limited data centre space can be difficult with GPUs. Rather than filling a rack from top to bottom, companies sometimes place just a few GPU-equipped servers, like “orphans,” at the bottom of a rack to prevent overheating, said Steve Tuck, co-founder and CEO of server startup Oxide Computer. Companies sometimes add cooling systems to lower temperatures, Tuck said.

If Amazon’s experience is any guide, Microsoft may see faster adoption of Cobalt processors than of Maia AI chips. Microsoft is testing its Teams app and Azure SQL Database service on Cobalt. So far, they have performed 40% better than on Azure’s existing Arm-based chips, which are made by startup Ampere, Microsoft said.

Over the past 18 months, as costs and interest rates have risen, many businesses have looked for ways to make their cloud spending more efficient, and for AWS customers, Graviton has been one of them. All of AWS’s top 100 customers now use the Arm-based chips, which can yield a 40% price-performance improvement, AWS Vice President Dave Brown said.

Moving from GPUs to AWS Trainium AI chips can be more complicated than migrating from Intel Xeons to Gravitons, however. Every AI model has its own quirks. Because Arm is widely used in mobile devices, a lot of work has gone into making a range of tools run on it, and that is less true of silicon for AI, Brown said. Over time, though, he said he would expect organisations to see similar price-performance gains with Trainium compared with GPUs.

“We have shared these specs with the ecosystem and with a lot of our partners in the ecosystem, which benefits all of our Azure customers,” Borkar said.

Borkar said she did not know how Maia’s performance compares with alternatives such as Nvidia’s H100. Nvidia said on Monday that its H200 will go on sale in the second quarter of 2024.