The trend toward custom chips driven by large AI models is gradually emerging
2023-12-08
Recently, technology giants such as Amazon, Microsoft, Meta, and Google have stepped up investment in self-developed chips in hopes of reducing their dependence on Nvidia. Notably, driven by applications such as artificial intelligence and autonomous driving, most of these giants are choosing to customize chips to their own needs, and the importance of custom chips is becoming increasingly prominent.

The trend toward custom AI chips is on the rise

Driven by the craze for large AI models, Nvidia's dominance is pushing more and more tech giants to build AI chips themselves. On November 28th, Amazon Web Services (AWS) announced at its 2023 re:Invent conference the launch of Trainium2, a second-generation AI chip designed specifically for training artificial intelligence systems, alongside the general-purpose Graviton4 processor. According to AWS CEO Adam Selipsky, Trainium2 delivers four times the performance of the first-generation Trainium and twice its energy efficiency, equivalent to roughly 650 teraflops of computing power per chip (one teraflop is one trillion floating-point operations per second). A cluster of 100,000 Trainium2 chips can train a large language model with 300 billion parameters in just a few weeks.
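
As a rough sanity check of these figures, the sketch below estimates training time from the widely used rule of thumb that training costs about 6 × N × D floating-point operations for N parameters and D tokens. The per-chip throughput and cluster size come from the announcement; the token count and cluster utilization are illustrative assumptions, not from the source.

```python
# Rough sanity check of the "few weeks to train a 300-billion-parameter
# model" claim. Chip throughput and cluster size are from the article;
# the token count and utilization are illustrative assumptions.

CHIP_TFLOPS = 650          # per-chip throughput, from the article
NUM_CHIPS = 100_000        # cluster size, from the article
PARAMS = 300e9             # model parameters, from the article
TOKENS = 6e12              # assumed training tokens (~20 per parameter)
UTILIZATION = 0.3          # assumed fraction of peak throughput achieved

total_flops = 6 * PARAMS * TOKENS                  # ~6*N*D rule of thumb
effective_flops = NUM_CHIPS * CHIP_TFLOPS * 1e12 * UTILIZATION
days = total_flops / effective_flops / 86_400
print(f"Estimated training time: {days:.1f} days")  # ~6.4 days here
# A larger token budget or lower utilization stretches this into the
# "few weeks" the announcement describes.
```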

At its Ignite developer conference on November 16th, Microsoft likewise announced two self-developed chips, Maia 100 and Cobalt 100. Maia 100 accelerates AI computing tasks, helping artificial intelligence systems process work such as speech and image recognition more quickly, while Cobalt 100 integrates 128 computing cores. Both chips are produced on TSMC's 5nm process and are expected to enter Microsoft's data centers early next year.

Beyond Amazon and Microsoft, major Nvidia customers such as Meta, Google, and Tesla have all poured more resources into AI chip development this year, and even OpenAI has begun preparing a chip project. With demand for high-end GPUs such as the A100 and H100 surging as more enterprises enter the large-model field, the trend of technology giants investing in custom AI chips has only intensified.

Pursuing chip performance and cost advantages

The shortage of high-end GPUs is one reason technology giants are stepping up development of chips for large AI models. As more enterprises enter the large-model field and more large models are released, market demand for high-end GPUs such as the A100 and H100 has surged. OpenAI CEO Sam Altman has repeatedly complained about the shortage of computing power, and according to earlier reports from Barron's, orders for Nvidia's high-end GPUs are booked into 2024. To reduce their dependence on Nvidia GPUs, companies with the capability have stepped up chip development to create, train, and iterate their large-model products.

So why are Amazon, Microsoft, and the rest all taking the path of independently developed custom chips? One primary reason is that the major players want to optimize chip performance and pursue differentiated solutions. With Moore's Law slowing, the old approach of relying on process scaling to drive performance gains has become increasingly difficult to sustain; achieving optimal computing performance now depends on architectures tailored to specific applications and datasets. This is especially true for large AI models, where vendors' needs differ, and more and more companies are finding that one-size-fits-all solutions no longer meet their computing requirements.

Mohamed Awad, Senior Vice President and General Manager of Arm's Infrastructure business, noted that hyperscale cloud providers such as Alibaba, AWS, and Microsoft have all begun developing their own chips, with the main goal of squeezing the most performance and efficiency out of each chip. They customize around servers, racks, and even entire data centers based on their own use cases and workloads. As technologies such as GPT-style large models develop, data and compute volumes will only keep growing, and custom chips let manufacturers optimize for those ever-increasing demands.

Reducing costs may also be a practical consideration for the giants. According to Bernstein analyst Stacy Rasgon, if ChatGPT's query volume grew to one tenth of Google Search's, it would initially require approximately $48 billion worth of GPUs, plus about $16 billion worth of chips annually to keep running. Faced with such high operating costs, self-developed custom chips have become the common choice of the technology giants. Some analysts say that, compared with buying Nvidia's products, Microsoft's chip codenamed Athena for large-model processing is expected to cut per-chip cost by one third.
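
For a sense of scale, the sketch below simply tallies the figures cited above over an assumed multi-year horizon and applies the reported one-third saving; the three-year window and the assumption that the saving applies to all chip spending are illustrative, not from the source.

```python
# Tallying the cost figures cited above: ~$48B of GPUs up front plus
# ~$16B of chips per year, and the reported one-third per-chip saving.
# The 3-year horizon and applying the saving to all chip spending are
# illustrative assumptions, not from the source.

INITIAL_GPU_COST_B = 48      # from the Bernstein estimate
ANNUAL_CHIP_COST_B = 16      # from the Bernstein estimate
YEARS = 3                    # assumed horizon
SAVING = 1 / 3               # per-chip cost reduction reported for Athena

baseline = INITIAL_GPU_COST_B + ANNUAL_CHIP_COST_B * YEARS
with_custom = baseline * (1 - SAVING)
print(f"Baseline spend over {YEARS} years: ${baseline:.0f}B")    # $96B
print(f"With one-third lower chip cost:   ${with_custom:.0f}B")  # $64B
```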

The future extends from the cloud to the edge

Mohamed Awad believes that more and more manufacturers will adopt custom chip solutions in the infrastructure field. Traditional server systems mostly use an architecture in which one CPU connects to multiple accelerators over a standard bus. In the AI era, however, that architecture can no longer keep up with growing data and compute demands because it cannot deliver sufficient memory bandwidth. As a result, more and more model makers are choosing custom chips so they can flexibly adjust chip architectures and rebuild their systems.
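
A back-of-envelope comparison illustrates the bandwidth gap in question. The figures below are ballpark public specifications rather than numbers from the article: a PCIe 4.0 x16 link moves roughly 32 GB/s, while the on-package HBM of a modern accelerator moves on the order of 2 TB/s.

```python
# Why a standard bus starves accelerators of bandwidth: time to move an
# 80 GB working set over PCIe vs. reading it from on-package HBM.
# Bandwidth numbers are ballpark public specs, used only for illustration.

DATA_GB = 80                 # e.g., the weights of a large model shard
PCIE4_X16_GBPS = 32          # ~32 GB/s for a PCIe 4.0 x16 link
HBM_GBPS = 2_000             # ~2 TB/s for on-package HBM

bus_seconds = DATA_GB / PCIE4_X16_GBPS
hbm_seconds = DATA_GB / HBM_GBPS
print(f"Over the shared bus: {bus_seconds:.2f} s")         # ~2.5 s
print(f"From local HBM:      {hbm_seconds * 1e3:.0f} ms")  # ~40 ms
# The ~60x gap is why custom designs co-package memory and rework the
# interconnect instead of shuttling data across a standard bus.
```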

In fact, custom chips are nothing new to the major technology companies. Amazon Web Services began designing custom AI chips in 2018, launching its self-developed AI inference chip Inferentia, and in 2023 released an iterated version, Inferentia2, which tripled computing performance. The recently announced training chip Trainium2 likewise follows the first-generation Trainium, launched at the end of 2020. Google's history with custom chips is even longer: as early as 2020, it had already deployed TPU v4, the fourth generation of its AI chip line, in its data centers. Google has since moved the engineering team responsible for AI chips into Google Cloud to strengthen Google Cloud's ability to develop them.

As for the future of the custom chip market, experts point out that, propelled by popular applications such as large AI models and smart vehicles, the market will expand further. Automakers such as Tesla have already invested in developing and commercializing custom chips, and going forward, custom chips will extend from cloud computing and HPC to edge computing. Although these applications can be handled by general-purpose chips, silicon tailored to a specific workload can deliver performance or functional optimization at better cost and power efficiency. Experts also note that this trend is not good news for general-purpose chipmakers, but for other players in the IC industry chain, such as EDA vendors, IP vendors, and wafer foundries, it is a boon.