NVIDIA Discloses New Details Regarding its Hopper GPUs and Grace CPUs

NVIDIA plans to reveal more about its Grace CPUs at the Hot Chips event scheduled for next week. Ahead of that, NVIDIA has shared a few teasers. Expect a much more detailed overview next week, but the information available so far already covers the essentials.

What is NVIDIA’s Grace Hopper?

The Grace Hopper can be thought of as a Superchip that pairs two chips on one board: NVIDIA’s Hopper GPU and NVIDIA’s Grace CPU. The two are linked by NVIDIA’s NVLink-C2C interconnect, which is meant to deliver high levels of AI-accelerated performance.

Note that both the CPU and the GPU are made by NVIDIA. In a way, NVIDIA has finally entered the CPU market. In its dual-chip Superchip configuration, NVIDIA’s Grace CPU features 144 Arm v9 cores and 1 TB/s of memory bandwidth.

The GPU is built on NVIDIA’s upcoming Hopper architecture (the data center counterpart to Lovelace for consumers). Hopper GPUs pack 80 billion transistors and are manufactured on the cutting-edge TSMC 4N process. Notably, 4N belongs to TSMC’s 5nm family, so it is best understood as a refined version of that node rather than an entirely new one.

Grace CPU Architecture

NVIDIA’s new Scalable Coherency Fabric (SCF) mesh interconnect provides 3.2 TB/s of bandwidth across the units of a Grace chip. The mesh scales to 72+ cores, and each CPU has 117 MB of L3 cache.

Diagram Showcasing NVIDIA’s Grace CPU | NVIDIA

Another diagram gives us much more information. Every CPU supports up to 68 PCIe Gen 5 lanes (12+56) and four PCIe 5.0 x16 connections. In addition, it has 16 LPDDR5X memory controllers (MC).

Another Diagram Featuring NVIDIA’s Grace CPU

Information Regarding NVLink-C2C

GPUs now process data faster than ever, but bandwidth and data transfer time remain a bottleneck. To counter this, NVIDIA built a custom CPU and GPU Superchip, connecting the two through NVLink-C2C to minimize that bottleneck and maximize bandwidth.

This interface provides a bandwidth of around 900 GB/s, which is 7x more than a PCIe 5.0 x16 interface. Efficiency-wise, NVLink-C2C uses just 1.3 pJ/bit, making it up to 5x more efficient than a PCIe 5.0 interface.
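The 7x figure can be sanity-checked with some quick arithmetic. The sketch below assumes a PCIe 5.0 lane runs at 32 GT/s with 128b/130b encoding and that both directions of the x16 link are counted; those PCIe figures and the function names are assumptions for illustration, not numbers taken from NVIDIA's material. It also estimates the link power implied by 1.3 pJ/bit if the full 900 GB/s were sustained.

```python
# Rough check of the NVLink-C2C vs. PCIe 5.0 x16 comparison.
# The PCIe parameters are assumptions (32 GT/s per lane, 128b/130b encoding).

NVLINK_C2C_GBPS = 900.0        # GB/s, as quoted in the article
NVLINK_C2C_PJ_PER_BIT = 1.3    # pJ/bit, as quoted in the article


def pcie5_bandwidth_gbps(lanes: int = 16, bidirectional: bool = True) -> float:
    """Approximate usable PCIe 5.0 bandwidth in GB/s (hypothetical helper)."""
    gt_per_s = 32.0                 # giga-transfers per second per lane
    encoding = 128.0 / 130.0        # 128b/130b line-code efficiency
    per_lane = gt_per_s * encoding / 8.0   # GB/s per lane, one direction
    total = per_lane * lanes
    return total * 2 if bidirectional else total


def link_power_watts(bandwidth_gbps: float, pj_per_bit: float) -> float:
    """Power needed to sustain a bandwidth at a given energy per bit."""
    bits_per_second = bandwidth_gbps * 1e9 * 8
    return bits_per_second * pj_per_bit * 1e-12


pcie = pcie5_bandwidth_gbps()
ratio = NVLINK_C2C_GBPS / pcie
power = link_power_watts(NVLINK_C2C_GBPS, NVLINK_C2C_PJ_PER_BIT)

print(f"PCIe 5.0 x16 (both directions): {pcie:.1f} GB/s")
print(f"NVLink-C2C advantage: {ratio:.1f}x")
print(f"NVLink-C2C power at full bandwidth: {power:.2f} W")
```

Under these assumptions the ratio lands close to the quoted 7x, and the link would draw on the order of 10 W at full bandwidth.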

Interconnect | Picojoules per Bit (pJ/b)
NVLink-C2C | 1.3 pJ/b
UCIe | 0.5 – 0.25 pJ/b
Infinity Fabric | ~1.5 pJ/b
TSMC CoWoS | 0.56 pJ/b
Foveros | 0.2 pJ/b
EMIB | 0.3 pJ/b
Bunch of Wires (BoW) | 0.7 to 0.5 pJ/b
On-die | 0.1 pJ/b
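To put the table's figures in perspective, here is a minimal sketch that converts each pJ/b value into the energy needed to move 1 GB of data. The numbers are copied from the table above (ranges are stored as low/high pairs, and the approximate Infinity Fabric value is treated as exact); the function names are hypothetical and chosen only for this example.

```python
# Energy per bit in pJ/b, copied from the table above as (low, high) pairs.
PJ_PER_BIT = {
    "NVLink-C2C": (1.3, 1.3),
    "UCIe": (0.25, 0.5),
    "Infinity Fabric": (1.5, 1.5),   # listed as ~1.5 in the table
    "TSMC CoWoS": (0.56, 0.56),
    "Foveros": (0.2, 0.2),
    "EMIB": (0.3, 0.3),
    "Bunch of Wires (BoW)": (0.5, 0.7),
    "On-die": (0.1, 0.1),
}


def joules_to_move(gigabytes: float, pj_per_bit: float) -> float:
    """Energy in joules to transfer `gigabytes` at `pj_per_bit`."""
    bits = gigabytes * 1e9 * 8
    return bits * pj_per_bit * 1e-12


def energy_table(gigabytes: float = 1.0) -> dict:
    """Map each interconnect to its (low, high) transfer energy in joules."""
    return {
        name: (joules_to_move(gigabytes, lo), joules_to_move(gigabytes, hi))
        for name, (lo, hi) in PJ_PER_BIT.items()
    }


# Print interconnects from least to most energy (by the high end of the range).
for name, (lo, hi) in sorted(energy_table().items(), key=lambda kv: kv[1][1]):
    if lo == hi:
        print(f"{name:22s} {lo * 1e3:6.2f} mJ per GB")
    else:
        print(f"{name:22s} {lo * 1e3:6.2f} to {hi * 1e3:.2f} mJ per GB")
```

Keep in mind that these interconnects operate at very different reaches (on-die, on-package, and chip-to-chip), so the comparison is indicative rather than like-for-like.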

Release Date

NVIDIA’s Grace CPUs and Hopper GPUs are slated to launch sometime in Q1/Q2 2023. The Grace CPUs are geared toward high-performance computing, whereas the Hopper GPU targets AI training and HPC.

Abdullah Faisal
With a love for computers since the age of five, Abdullah has always sought to delve into the depths of information, and uses it as his guiding light. He believes success is of utmost importance as history is written by the victor.