NVIDIA Hopper H100 GPU Revealed at Hot Chips, Up to 30x Faster Than the A100 in AI Workloads
A few days ago, NVIDIA teased its upcoming Grace/Hopper CPU-and-GPU-powered superchip. More details were saved for the Hot Chips event, which is currently underway.
What is NVIDIA’s Grace Hopper?
Grace Hopper can be thought of as a superchip featuring two chips on one board: NVIDIA's Hopper GPU and NVIDIA's Grace CPU. The two are joined by NVIDIA's signature NVLink-C2C interconnect to deliver exceptional levels of AI-accelerated performance.
NVIDIA's Hopper-based H100 reportedly uses a monolithic design, meaning you won't see multiple chiplets; the MCM (Multi-Chip Module) approach is what AMD is utilizing for its HPC GPUs. The H100 is built on TSMC's 4N process node, a refined version of its 5nm process.
A Slight Overview
The H100 ships with 132 SMs, and the new Hopper SM architecture promises a 2x increase in FP32 and FP64 performance per clock, along with 4th-gen Tensor Cores for enhanced AI capabilities. These GPUs also make use of 4th-gen NVLink technology, allowing for a total bandwidth of 900 GB/s.
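The 900 GB/s figure can be sanity-checked with some back-of-envelope arithmetic, assuming the publicly quoted H100 configuration of 18 fourth-gen NVLink links, each providing 50 GB/s of bidirectional bandwidth (25 GB/s per direction):

```python
# Sanity check of the quoted aggregate NVLink bandwidth.
# Link count and per-link rate are taken from NVIDIA's public H100 specs.
LINKS = 18          # fourth-gen NVLink links on H100
GB_PER_LINK = 50    # bidirectional GB/s per link (25 GB/s each way)

total_bandwidth = LINKS * GB_PER_LINK
print(total_bandwidth)  # 900
```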
HBM for High Bandwidth Memory
The A100 from Ampere used HBM2-based memory. For Hopper, NVIDIA pushed further: HBM3 memory makes its debut with the launch of the H100, and this major leap allows for a 2x increase in DRAM bandwidth.
Divide your GPU’s Power Across Various Users
NVIDIA's MIG (Multi-Instance GPU) technology was introduced back with Ampere. It partitions a single GPU's compute resources among multiple CUDA applications, enabling maximum parallel utilization. In practice, this allows multiple users and applications to share the same GPU efficiently.
Hopper enhances this technology, promising 3x more compute capacity and twice the memory bandwidth per GPU instance. Moreover, an additional layer of security is now provided at the hardware level: memory allocation is isolated per tenant (or instance), disallowing access from other instances.
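The partitioning idea can be sketched as a toy resource check. This is purely illustrative, not NVIDIA's API; the 7-slice limit comes from public MIG documentation, and the 80 GB capacity is an assumption based on the 80 GB H100 part:

```python
# Toy model of MIG-style partitioning: a GPU is carved into up to
# 7 compute slices with hardware-isolated memory, and a set of
# instances is only valid if it fits within the GPU's totals.
TOTAL_SLICES = 7    # MIG exposes at most 7 GPU instances per GPU
TOTAL_MEM_GB = 80   # assumed 80 GB H100 configuration

def fits(instances):
    """instances: list of (compute_slices, mem_gb) requests."""
    slices = sum(s for s, _ in instances)
    mem = sum(m for _, m in instances)
    return slices <= TOTAL_SLICES and mem <= TOTAL_MEM_GB

# Too much memory requested, even though the slice count is fine:
print(fits([(3, 40), (3, 40), (1, 10)]))            # False
# A valid split across four tenants:
print(fits([(3, 40), (2, 20), (1, 10), (1, 10)]))   # True
```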
Massive Performance Improvements
As applications grow ever more intense, requiring heaps upon heaps of computational power, interconnect bandwidth often becomes the bottleneck. To alleviate this, NVIDIA introduced NVLink, which drastically increases GPU-to-GPU bandwidth.
The Hopper-based H100 surpasses the last-gen A100 in nearly every task thrown at it. With the use of NVLink, a performance increase of over 3x can be seen, and the additional microarchitectural improvements targeting AI give Hopper a boost of up to 30x in some workloads, per NVIDIA's own figures.
4th Gen Tensor Cores
AI is the talk of the town these days. Hopper uses the 4th generation of NVIDIA's Tensor Cores. The H100 brings forth the new FP8 format while doubling throughput in all other formats.
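FP8 trades precision for speed and memory savings. A minimal sketch of what an 8-bit float can represent, assuming the E4M3 variant (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7, no infinities, as in the OCP FP8 spec):

```python
# Enumerate every finite value representable in FP8 E4M3, then
# quantize by snapping to the nearest one. Illustrative only.

def e4m3_values():
    vals = set()
    for s in (1, -1):
        # Subnormals: exponent field 0, value = mantissa/8 * 2^-6
        for m in range(8):
            vals.add(s * m / 8 * 2 ** -6)
        # Normals: exponent field 1..15 (all-ones exponent with
        # all-ones mantissa encodes NaN in E4M3, so skip it)
        for e in range(1, 16):
            for m in range(8):
                if e == 15 and m == 7:
                    continue
                vals.add(s * (1 + m / 8) * 2 ** (e - 7))
    return sorted(vals)

def quantize_e4m3(x, table=tuple(e4m3_values())):
    """Round x to the nearest representable E4M3 value."""
    return min(table, key=lambda v: abs(v - x))

print(quantize_e4m3(0.3))   # 0.3125 — nearest representable value
print(quantize_e4m3(1000))  # 448.0 — the largest finite E4M3 value
```

Only 256 codes exist, which is why FP8 training relies on per-tensor scaling to keep values inside that narrow range.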
Improvements Over A Decade
Team green made a rather interesting comparison. Back in 2012, the Kepler GK110 was a powerhouse that was miles ahead of the competition. Fast forward to 2022, and the performance of the GK110 fits inside a single one of the many GPCs featured on the H100. That's impressive!
NVIDIA's Grace CPUs and Hopper GPUs are slated to launch sometime in Q1/Q2 2023. The Grace CPUs are geared more towards high-performance computing, whereas the Hopper GPUs target AI training and HPC workloads.