Neural processing unit

A neural processing unit (NPU), also known as an AI accelerator or deep learning processor, is a class of specialized hardware accelerator or computer system designed to accelerate artificial intelligence (AI) and machine learning applications, including artificial neural networks and computer vision. NPUs are used either to efficiently execute already trained AI models (inference) or to train AI models. Their applications include algorithms for robotics, the Internet of things, and data-intensive or sensor-driven tasks. They are often manycore designs and focus on low-precision arithmetic, novel dataflow architectures, or in-memory computing capability. As of 2024, a typical AI integrated circuit chip contains tens of billions of MOSFETs.
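The low-precision arithmetic mentioned above can be illustrated with a minimal sketch of symmetric 8-bit quantization applied to a dot product, the basic operation of a neural network layer. This is an illustration only and does not describe any particular NPU; the function names `quantize` and `quantized_dot` and the sample values are hypothetical.

```python
def quantize(values, num_bits=8):
    """Symmetric linear quantization: map floats to signed integers.

    Hypothetical helper for illustration; real accelerators use a
    variety of quantization schemes.
    """
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs else 1.0
    # Round to the nearest integer step and clamp to [-qmax, qmax].
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def quantized_dot(a, b):
    """Dot product computed on integers, rescaled to a float at the end."""
    qa, scale_a = quantize(a)
    qb, scale_b = quantize(b)
    # Multiply-accumulate entirely in integer arithmetic.
    acc = sum(x * y for x, y in zip(qa, qb))
    return acc * scale_a * scale_b


weights = [0.5, -1.0, 0.25, 0.75]
inputs = [1.0, 0.5, -2.0, 0.1]

exact = sum(w * x for w, x in zip(weights, inputs))
approx = quantized_dot(weights, inputs)
print(f"float result: {exact:.4f}, int8 result: {approx:.4f}")
```

The integer result differs slightly from the floating-point one. Trading that small error for narrower multipliers and less memory traffic is the general motivation for low-precision designs.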

AI accelerators are used in mobile devices such as Apple iPhones and Huawei cellphones, and personal computers such as Intel laptops, AMD laptops and Apple silicon Macs. Accelerators are also used in cloud computing servers, including tensor processing units (TPU) in Google Cloud Platform and Trainium and Inferentia chips in Amazon Web Services. Many vendor-specific terms exist for devices in this category, and it is an emerging technology without a dominant design.

Graphics processing units designed by companies such as Nvidia and AMD often include AI-specific hardware, and are commonly used as AI accelerators, both for training and inference. All models of Intel Meteor Lake processors have a built-in versatile processor unit (VPU) for accelerating inference for computer vision and deep learning.

External links

Nvidia Puts The Accelerator To The Metal With Pascal, The Next Platform
Eyeriss Project, MIT