IBM Unveils World's First Quad-Core AI Accelerator Chip
08-03-2021 | By Robin Mitchell
Recently, IBM announced the development of the world's first quad-core AI accelerator chip. What advantages do AI accelerators provide, what details surround IBM's new device, and how will such devices shape the AI industry?
What advantages do AI accelerator devices provide?
An AI accelerator is a device whose sole purpose is to speed up AI workloads. To understand why dedicated hardware provides an advantage, we first need to understand the computational requirements of AI.
Most AI algorithms are built around a programming structure called a neural net, which operates similarly to networks of biological neurons. A neural net consists of nodes that perform basic arithmetic: each node takes the outputs of the nodes connected to it, processes them, and passes its result on to the next nodes. The connections between nodes are weighted, meaning that each connection scales the value it carries from one node to another.
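To make this concrete, below is a minimal sketch of a single node in Python. The function name, example values, and the choice of ReLU as the activation are illustrative assumptions, not any particular framework's API.

```python
def node(inputs, weights, bias):
    """One neural-net node: weight each input, sum them, then activate."""
    # Each weighted connection scales the value it carries into this node
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # A simple non-linear activation (ReLU): pass positives, clamp negatives to 0
    return max(0.0, total)

# Example: a node fed by three weighted connections
print(node([0.5, -1.2, 0.8], [0.9, -0.3, 0.5], bias=0.1))  # ~1.31
```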
Individual nodes use only rudimentary mathematics, so any one of them is trivial for a CPU to compute. However, AI systems are only practical when using many thousands, if not millions, of nodes. With so many parallel operations taking place simultaneously, and the extremely large amounts of fast memory needed, conventional CPUs are typically inadequate for running AI algorithms.
Graphics Processing Units (GPUs), unlike CPUs, consist of many thousands of basic arithmetic processors working in parallel, and as such have become the go-to hardware for running AI algorithms. Even so, GPUs are not optimised specifically for AI workloads, and they are far too power-hungry for most embedded devices.
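It is easy to see why this parallelism suits neural nets: an entire layer of nodes collapses into a single matrix multiplication. The sketch below (the layer sizes are arbitrary assumptions) evaluates thousands of nodes in one vectorised operation, which is exactly the arithmetic pattern GPU hardware spreads across its many processors.

```python
import numpy as np

rng = np.random.default_rng(0)
inputs = rng.random((1, 1024))      # one input vector of 1,024 values
weights = rng.random((1024, 4096))  # weighted connections into 4,096 nodes

# All 4,096 nodes are evaluated in a single matrix multiply plus activation;
# on a GPU that multiply fans out across thousands of arithmetic units at once.
outputs = np.maximum(0.0, inputs @ weights)
print(outputs.shape)  # (1, 4096)
```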
An AI accelerator, however, has its hardware configured for the sole purpose of executing AI algorithms (a task known as inference) and training them. Such accelerators incorporate simple processing units for basic operations, replicate them in massive parallel arrays, and give them all fast access to memory.
IBM Unveils World’s First Quad-Core AI Accelerator
Recently, IBM revealed its development of the world's first quad-core AI accelerator, built on a 7nm process. The new chip uses ultra-low-precision hybrid 8-bit floating-point arithmetic units during training and 4-bit arithmetic during inference (i.e. when executing the trained AI).
In typical computing systems, increasing the bit-width allows for greater computational precision and access to more memory. However, AI is unusually tolerant of low precision, and the greater the bit-width, the more memory and bandwidth a system consumes. As such, developers of AI systems often try to reduce the bit-width as far as accuracy allows, and IBM's new quad-core AI accelerator has this down to 4 bits of precision for inference.
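As a rough illustration of why fewer bits matter, the sketch below quantises floating-point weights down to 4-bit signed integers using a simple max-scaling scheme. This scheme is an assumption purely for illustration; IBM's hybrid FP8/4-bit formats are considerably more sophisticated.

```python
import numpy as np

def quantise_4bit(weights):
    """Map float weights onto the 16 levels (-8..7) a signed 4-bit value allows."""
    scale = np.abs(weights).max() / 7.0  # fit the largest weight into range
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantise(q, scale):
    """Recover approximate float weights from the 4-bit representation."""
    return q.astype(np.float32) * scale

w = np.array([0.82, -0.31, 0.05, -1.40], dtype=np.float32)
q, s = quantise_4bit(w)
print(q)                 # [ 4 -2  0 -7] -- only 4 bits of storage per weight
print(dequantise(q, s))  # close to the originals, at an eighth of FP32's size
```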
The resulting quad-core AI accelerator achieves more than 80% utilisation during training (i.e. the fraction of the processor's compute actually being put to work) and more than 60% during inference, significantly better than GPUs, whose utilisation typically falls below 30%. Furthermore, IBM has integrated a power-management system that reduces the accelerator's power consumption by slowing the clock frequency during computation-heavy phases.
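Utilisation here is simply sustained throughput as a fraction of peak throughput; the figures in the snippet below are hypothetical placeholders to show the calculation, not IBM's measurements.

```python
# Utilisation = sustained throughput / peak throughput.
# Both figures are hypothetical placeholders, not IBM's measurements.
peak_tops = 100.0      # assumed peak throughput (tera-operations per second)
sustained_tops = 82.0  # assumed throughput sustained during training
print(f"Utilisation: {sustained_tops / peak_tops:.0%}")  # Utilisation: 82%
```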
How will AI accelerators change the industry?
Running AI tasks on traditional processors has proven to be power-hungry, memory-demanding, and overall extremely inefficient. Yet AI is proving incredibly advantageous in everyday applications, with every area of the industry being affected.
While fixed computing systems can afford to run AI algorithms on high-performance CPUs and GPUs, the embedded world typically has to rely on cloud-based computing to execute them. Such remote computing brings growing privacy concerns, additional pressure on internet infrastructure, and long delays between request and response.
AI accelerators such as the one developed by IBM would not only allow embedded applications to run AI locally but to do so efficiently. Running AI locally also improves privacy by keeping potentially sensitive information on the device, and it reduces latency. The use of AI accelerators could therefore bring real-time AI responses to low-power devices, something that is not currently possible.