Company news--Karmin Electronic (HK) Limited

Intel's new Gaudi2 processor launches in the Chinese market, accelerating large-scale deep learning training and inference
2023-07-12

Today, Intel held its AI Product Strategy and Gaudi2 New Product Launch Conference in Beijing, where it officially launched its second-generation Gaudi deep learning accelerator, Habana® Gaudi®2, in the Chinese market. As an important part of Intel's cloud-to-edge product portfolio, Gaudi2 aims to accelerate AI training and inference with leading cost-effectiveness, giving Chinese users higher deep learning performance and efficiency and making it a better option for deploying AI at scale.

Intel Launches Gaudi2 Deep Learning Accelerator in the Chinese Market

Sandra Rivera, Executive Vice President and General Manager of the Data Center and Artificial Intelligence Division at Intel, said that Intel is committed to accelerating the development of artificial intelligence by giving customers a wide range of hardware choices and supporting open software environments. With a product portfolio that includes Xeon Scalable processors and Gaudi2 deep learning accelerators, Intel is lowering the barrier to entry for artificial intelligence and strengthening customers' ability to deploy this critical business technology from the cloud, through the network, to the intelligent edge, thereby helping to build the future of artificial intelligence in China.

New Gaudi2 Training Accelerator for Deep Learning

The Gaudi2 deep learning accelerator and the Gaudi2 mezzanine card HL-225B build on the first-generation Gaudi high-performance architecture, with performance and energy-efficiency improvements across multiple dimensions to accelerate high-performance large language models. The accelerator has:

• 24 programmable Tensor processor cores (TPCs)

• 21 100 Gbps (RoCEv2) Ethernet interfaces

• 96GB HBM2E memory capacity

• Total memory bandwidth of 2.4 TB/s

• 48MB on-chip SRAM

• Integrated multimedia processing engine

Habana® Gaudi®2 deep learning accelerator

The Gaudi2 accelerator's outstanding performance was fully validated in the MLCommons® MLPerf® benchmark results announced in June. It achieved excellent training results on the GPT-3 model, the computer vision models ResNet-50 (using 8 accelerators) and Unet3D (using 8 accelerators), and the natural language processing model BERT (using 8 and 64 accelerators). Compared with other products on the market that target large-scale generative AI and large language models, Gaudi2 offers excellent performance and leading cost-effectiveness, helping users improve operational efficiency while reducing operating costs.

In addition, Gaudi2 provides excellent inference performance for large-scale multimodal and language models. In a recent Hugging Face evaluation, its performance on large-scale inference, including running Stable Diffusion (a state-of-the-art generative AI model that generates images from text) and the 7-billion and 176-billion-parameter BLOOMz models, remained at an industry-leading level.

Meeting the needs of large language and multimodal models

The architecture of the Gaudi2 deep learning accelerator is designed to scale efficiently to meet the needs of large language models and generative AI models. Each chip integrates 21 dedicated 100 Gbps (RoCEv2 RDMA) Ethernet interfaces for internal interconnection, enabling low-latency scaling within the server.
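As a rough back-of-the-envelope illustration of the interconnect figures above, the Python sketch below computes the raw aggregate Ethernet bandwidth per chip from 21 interfaces at 100 Gbps each. The even split of links across the other accelerators in an eight-card server is an assumption made purely for illustration, not a topology stated in this article, and the function and constant names are hypothetical.

```python
# Back-of-the-envelope interconnect arithmetic (illustrative only).
LINKS_PER_CHIP = 21   # from the article: 21 dedicated RoCEv2 Ethernet interfaces
GBPS_PER_LINK = 100   # from the article: 100 Gbps per interface


def aggregate_gbps(links: int = LINKS_PER_CHIP, gbps: int = GBPS_PER_LINK) -> int:
    """Raw aggregate line rate per chip in Gbps (protocol overhead ignored)."""
    return links * gbps


def links_per_peer(links: int, cards_in_server: int) -> float:
    """ASSUMPTION: links are split evenly across the other cards in the server."""
    return links / (cards_in_server - 1)


if __name__ == "__main__":
    total = aggregate_gbps()
    # Convert Gbps to Tbps (divide by 1000) and to GB/s (divide by 8 bits per byte).
    print(f"Aggregate: {total} Gbps = {total / 1000} Tbps = {total / 8} GB/s")
    print(f"Links per peer in an 8-card server (assumed): {links_per_peer(21, 8)}")
```

These are raw line rates; achievable throughput in practice depends on protocol overhead and the actual server topology.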

In Stable Diffusion training, Gaudi2 demonstrated near-linear scaling of 99% from 1 card to 64 cards. In addition, the MLPerf Training 3.0 results recently released by MLCommons confirmed that Gaudi2 achieves near-linear scaling of 95% from 256 to 384 accelerators on a GPT-3 model with 175 billion parameters.
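To make the scaling percentages concrete, here is a minimal Python sketch that assumes the common definition of scaling efficiency: measured speedup divided by ideal linear speedup. The article does not state the exact formula used, so this definition is an assumption, and the throughput values in the example are hypothetical placeholders rather than measured results.

```python
def scaling_efficiency(base_n, base_throughput, n, throughput):
    """Measured speedup divided by ideal (linear) speedup.

    ASSUMPTION: this is the usual definition; the article does not spell it out.
    """
    ideal_speedup = n / base_n
    measured_speedup = throughput / base_throughput
    return measured_speedup / ideal_speedup


def implied_speedup(base_n, n, efficiency):
    """Speedup implied by a given efficiency when growing from base_n to n devices."""
    return efficiency * (n / base_n)


if __name__ == "__main__":
    # 95% efficiency from 256 to 384 accelerators: ideal speedup is 1.5x.
    print(f"256 -> 384 at 95%: {implied_speedup(256, 384, 0.95):.3f}x")
    # 99% efficiency from 1 to 64 cards: ideal speedup is 64x.
    print(f"1 -> 64 at 99%: {implied_speedup(1, 64, 0.99):.2f}x")
    # Hypothetical throughputs (arbitrary units), chosen only to show the formula.
    print(f"Efficiency: {scaling_efficiency(256, 100.0, 384, 142.5):.2%}")
```

Under this definition, 95% efficiency when growing from 256 to 384 accelerators corresponds to roughly a 1.43x speedup against an ideal 1.5x.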

With mature software support, Gaudi2 officially launches in the Chinese market

As demand for generative AI and large language models grows, Intel is also committed to building leading, mature software support to fully unleash the performance of the Gaudi2 deep learning accelerator.

To help customers easily build models or migrate their existing GPU-based models and systems to new Gaudi2 servers, and to help protect their software development investment, the SynapseAI® software suite has been optimized for deep learning workloads on the Gaudi platform and is designed to work with a broad software ecosystem to simplify model development and migration. SynapseAI integrates support for the TensorFlow and PyTorch frameworks and provides numerous popular computer vision and natural language reference models to meet the diverse needs of deep learning developers.

Intel is currently working with Inspur Information to build and release the Inspur Information AI server NF5698G7, based on the Gaudi2 deep learning accelerator. The server integrates eight HL-225B Gaudi2 accelerator cards and two 4th Gen Intel Xeon Scalable processors.

Inspur NF5698G7 Server Based on Gaudi2 Accelerator

Joining hands with China's industry ecosystem to open a new chapter in artificial intelligence

For many years, building on a strong AI software and hardware foundation, Intel has been committed to delivering industry-leading performance for a wide range of AI workloads. Through an open ecosystem and a diverse product selection, Intel has continuously lowered the barrier to AI deployment and provided firm support for the development of AI in China.

At the launch event, ecosystem partners including Meituan, Baidu, and Inspur Information shared their progress in diversified intelligent services built on Intel's software and hardware portfolio. He Yongzhan, Senior Manager of Baidu AI Cloud Server, said that 4th Gen Intel® Xeon® Scalable processors, which integrate the Intel® AMX acceleration engine, have brought multiple performance optimizations to the ERNIE Tiny model. Baidu will continue to build leading full-stack AI capabilities and a comprehensive open ecosystem, and looks forward to broader and deeper cooperation with Intel in AI. Wang Lei, Senior Product Manager at Inspur Information, emphasized that the NF5698G7 is a new-generation AI server developed specifically for the generative AI market. It supports 8 Gaudi2 accelerators with high-speed OAM interconnect and offers high performance, high scalability, high energy efficiency, and an open ecosystem, giving AI customers powerful large-model training and inference capabilities. Inspur Information will continue to work with Intel to create innovative, leading product solutions for the industry.

In addition, several local ecosystem partners expressed their firm commitment to, and long-term outlook on, current and future product cooperation with Intel. Liu Hongcheng, Vice President of the Computing and Storage Product Line at H3C, said that H3C's intelligent computing business adheres to the concept of endogenous intelligence and draws on comprehensive capabilities such as hardware enablement, forward-looking technology, and green, low-carbon design to support the rapid development of the AI industry. Based on the Intel Gaudi2 AI accelerator, H3C is working closely with Intel to develop high-performance AI servers suited to large-model training and inference, promoting inclusive innovation in intelligent computing power. Tang Qiming, President of Computing Infrastructure at xFusion Digital Technologies Co., Ltd., said he was honored to witness the release of Intel Gaudi2. As a long-term strategic partner of Intel, xFusion will continue to work with Intel to launch new products and solutions based on Gaudi2, helping enterprises bring AI scenarios to mature, large-scale commercial use.

Going forward, Intel will continue to lead product and technology development, further accelerate large-scale deep learning deployment, and support the growth of China's local AI market.
