AMD Zen 3 Architectural Improvements: Explained
On October 8th, 2020, AMD announced its brand-new Ryzen 5000 series desktop processors based on the Zen 3 architecture. This was one of the most anticipated PC hardware announcements of the year. Ever since the launch of the original Zen architecture in 2017, AMD has been on a steep upward trajectory in terms of annual architectural improvements. This year was no different, with AMD claiming the biggest generational leap in the history of Ryzen processors. What makes this new architecture so special? Let’s take a deep dive into the architectural improvements that Zen 3 brings.
The Basics of the Zen Architecture
AMD’s Ryzen processors use a design that differs markedly from the one their main competitor, Intel, uses in its desktop processors. Ryzen processors are built from multiple small chiplets placed on a shared substrate, rather than one large monolithic chip. These chiplets communicate with each other over a fast link known as the “Infinity Fabric”. AMD describes the Infinity Fabric as a superset of HyperTransport that allows fast connectivity between the different chiplets in its processors.
This design comes with its pros and cons. The biggest advantage is scalability. A chiplet design means that AMD can pack more cores into a smaller package, allowing for high core counts even in the budget segment of the CPU market. The main disadvantage is latency. The cores are physically separated from each other, which adds latency because data takes time to travel across the Infinity Fabric. As a result, performance in latency-sensitive applications like gaming is usually lower than on Intel’s single-chip design.
Zen 2 Implementation
The Ryzen 3000 series processors were a massive success in the mainstream desktop market. These CPUs were based on the Zen 2 architecture, built on TSMC’s 7nm process, which brought some notable changes to the Zen design. Zen 2 grouped the CPU cores into Core Complexes (CCX) of four cores each and split the 32MB pool of L3 cache into two smaller pools of 16MB each. These core complexes were the basis of the Zen 2 lineup of processors. Each 4-core complex had immediate access to its own 16MB of L3 cache, which helped keep latency down. As a result, Zen 2 was very competitive with Intel in latency-sensitive applications like gaming, while heavily outperforming Intel in multithreaded workloads.
The different CCX units still had to be interconnected via the Infinity Fabric, so some latency was still to be expected. Nevertheless, Zen 2 offered a 15% IPC (Instructions Per Clock) improvement over Zen+ and also boasted higher core clocks. This generation was important for AMD because it clawed the company back into competition with Intel and left huge potential for further improvement, thanks to AMD’s rapid innovation and Intel’s complacency.
Targets for Zen 3
AMD set out to develop Zen 3 with a very clear goal in mind. Since AMD already dominated the multithreaded side of the competition, the only area where it still lagged a little behind Intel was gaming. As good as Zen 2 was, it could not take the gaming crown from Intel, whose design offers extremely high clock speeds and low latency. For pure gamers who wanted the highest possible framerate, the answer was still Intel. Therefore, AMD’s targets for this generation were clear:
- Improve Core-to-Core Latency
- Increase Core Clock Speeds
- Increase Instructions-per-clock (IPC)
- Increase Efficiency (Higher Performance per Watt)
- Increase Single-threaded Performance
Considering that Zen 2 was already a very solid performer in multi-core applications, it was easy for AMD to focus almost exclusively on the single-threaded performance for this generation of CPUs.
Zen 3 Improvements
AMD talked about their new CPUs and the Zen 3 architecture in their “Where Gaming Begins” Live stream on October 8th. AMD claims that Zen 3 is the biggest generational leap in the history of the Zen architecture. The new Ryzen 5000 CPUs are still based on TSMC’s 7nm process, but boast a good number of architectural improvements under the hood.
8-Core Complex Design
Arguably the biggest improvement in the new architecture is the all-new layout. AMD has done away with the multiple-CCX design of Zen 2 and has instead gone with a single 8-core complex in which all eight cores have access to the entire 32MB of L3 cache. This redesign has huge implications for latency-sensitive applications like games.
With every core in direct contact with the cache and the other cores, latency improves significantly because data does not have to cross the entire die to get from one side to the other. This redesign also improves the effective memory latency of the chip, resulting in increased performance for single-threaded tasks.
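The difference between the two layouts can be made concrete with a small model. The following is a minimal Python sketch, not AMD’s own description of the hardware: the `CoreComplex` class, the helper functions, and the core numbering 0 to 7 are illustrative assumptions, while the cache sizes are the ones quoted in this article. It reports how much L3 cache a core can reach directly and whether two cores share a cache pool in each generation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CoreComplex:
    """A group of cores that share one pool of L3 cache (hypothetical model)."""
    cores: tuple[int, ...]
    l3_mb: int


# One 8-core chiplet per generation; core IDs 0-7 are an assumption.
ZEN2_CHIPLET = (
    CoreComplex(cores=(0, 1, 2, 3), l3_mb=16),
    CoreComplex(cores=(4, 5, 6, 7), l3_mb=16),
)
ZEN3_CHIPLET = (
    CoreComplex(cores=tuple(range(8)), l3_mb=32),
)


def complex_of(chiplet, core):
    """Return the core complex that contains the given core."""
    for ccx in chiplet:
        if core in ccx.cores:
            return ccx
    raise ValueError(f"core {core} is not on this chiplet")


def local_l3_mb(chiplet, core):
    """L3 cache (in MB) the core can reach without leaving its complex."""
    return complex_of(chiplet, core).l3_mb


def shares_l3(chiplet, core_a, core_b):
    """True if both cores sit in the same complex and share its L3 pool."""
    return complex_of(chiplet, core_a) is complex_of(chiplet, core_b)


for name, chiplet in (("Zen 2", ZEN2_CHIPLET), ("Zen 3", ZEN3_CHIPLET)):
    print(name, "core 0 local L3 (MB):", local_l3_mb(chiplet, 0))
    print(name, "cores 0 and 5 share L3:", shares_l3(chiplet, 0, 5))
```

In this model, cores 0 and 5 sit in different complexes on Zen 2, so data passing between them has to leave the CCX, whereas on Zen 3 they share a single 32MB pool.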
The improved layout of the core complex is not the only improvement that Zen 3 brings. AMD claims a 19% IPC Improvement over Zen 2 which is a huge figure. IPC or Instructions Per Clock is indicative of how much work the CPU can do per clock cycle. The 19% improvement is the biggest jump we have seen in IPC ever since Ryzen first launched in 2017. The previous generation of Zen 2 processors also brought a pretty massive 15% IPC improvement over the Zen+ architecture.
This IPC improvement means that AMD can compete with Intel’s sky-high core clocks even while keeping its own boost clocks below 5 GHz. AMD has also outlined the contributors to this IPC increase. According to the promotional material, the main contributing factors are:
- Cache Prefetching
- Execution Engine
- Branch Predictor
- Micro-op Cache
- Front End
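To put the IPC figures in perspective, single-threaded throughput can be roughly modeled as IPC multiplied by clock speed. The sketch below is a simplification under that assumption: it ignores memory behavior and sustained boost, and the function names are hypothetical. It uses only AMD’s claimed 15% and 19% gains and an example 4.8 GHz clock, and it compares against a core with Zen 2 IPC, not against an Intel core, since no Intel IPC figure is given here.

```python
def compound_gain(*gains):
    """Combine successive fractional gains, e.g. 0.15 followed by 0.19."""
    total = 1.0
    for gain in gains:
        total *= 1.0 + gain
    return total - 1.0


def equivalent_clock_ghz(ipc_gain, clock_ghz):
    """Clock a baseline-IPC core would need to match a core that has
    `ipc_gain` higher IPC and runs at `clock_ghz` (IPC x clock model)."""
    return (1.0 + ipc_gain) * clock_ghz


ZEN2_OVER_ZEN_PLUS = 0.15  # AMD's claimed IPC gain, Zen+ to Zen 2
ZEN3_OVER_ZEN2 = 0.19      # AMD's claimed IPC gain, Zen 2 to Zen 3

cumulative = compound_gain(ZEN2_OVER_ZEN_PLUS, ZEN3_OVER_ZEN2)
matched = equivalent_clock_ghz(ZEN3_OVER_ZEN2, 4.8)  # 4.8 GHz: a sub-5 GHz example

print(f"Cumulative IPC gain over Zen+: {cumulative:.2%}")
print(f"Zen 2-IPC clock needed to match Zen 3 at 4.8 GHz: {matched:.2f} GHz")
```

Under this model the two generational gains compound to roughly 37% over Zen+, and a Zen 3 core at 4.8 GHz does about as much work as a Zen 2-IPC core would at roughly 5.7 GHz, which illustrates how higher IPC can stand in for raw clock speed.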
Thanks to the density of TSMC’s 7nm process, AMD was able to cram even more performance into the Ryzen chips while maintaining the same average power draw. AMD says that the Ryzen 5000 series chips are built on the same 7nm process as the 3000 series; however, the process has been refined, and the resulting chips are more efficient.
AMD has also made a bold claim that the Ryzen 9 5900X and 5950X will consume the same amount of power as the last-gen 3900X and 3950X respectively, despite having higher boost clocks and the improved IPC. AMD’s promotional material quoted a “2.4X Performance per Watt” improvement over the original Zen architecture. This number lines up with AMD’s claims of the power draw of 5900X and 5950X since they now have higher clocks but still have the same TDP numbers as their predecessors.
Refined Silicon, Higher Clocks
At the tail end of the Ryzen 3000 series’ lifetime, AMD released a refresh that added three CPUs with “XT” branding. The Ryzen 5 3600XT, Ryzen 7 3800XT, and Ryzen 9 3900XT were the exact same CPUs as the base models but with higher clock speeds. Toward the end of a product’s lifespan, the manufacturing process matures and silicon quality improves. The resulting CPUs can boost higher and hold their clocks for longer. This is exactly how the XT lineup became possible.
With Zen 3 CPUs, AMD used that same mature manufacturing process and the higher quality silicon to build the 5000 series CPUs on the same 7nm node. This allowed AMD to push the boost clocks much higher than even the XT series of the last generation. Higher boost clocks, coupled with higher IPC and a redesign of the core layout meant that AMD was ready to tackle the challenge of single-threaded performance. The advertised clock speeds of the 4 Ryzen 5000 series processors are as follows:
- AMD Ryzen 5 5600X: 3.7 GHz Base, 4.6 GHz Boost
- AMD Ryzen 7 5800X: 3.8 GHz Base, 4.7 GHz Boost
- AMD Ryzen 9 5900X: 3.7 GHz Base, 4.8 GHz Boost
- AMD Ryzen 9 5950X: 3.4 GHz Base, 4.9 GHz Boost
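As a quick way to compare these figures, the sketch below computes how far each chip’s advertised boost clock sits above its base clock. The clock values are taken from the list above; the dictionary and function names are illustrative, and advertised boost is a peak figure, not a guarantee of sustained all-core speed.

```python
# (base GHz, boost GHz) as advertised in the list above.
RYZEN_5000_CLOCKS = {
    "Ryzen 5 5600X": (3.7, 4.6),
    "Ryzen 7 5800X": (3.8, 4.7),
    "Ryzen 9 5900X": (3.7, 4.8),
    "Ryzen 9 5950X": (3.4, 4.9),
}


def boost_headroom(base_ghz, boost_ghz):
    """Fraction by which the boost clock exceeds the base clock."""
    return boost_ghz / base_ghz - 1.0


def rank_by_headroom(clocks):
    """Model names sorted from largest to smallest boost headroom."""
    return sorted(
        clocks,
        key=lambda name: boost_headroom(*clocks[name]),
        reverse=True,
    )


for model in rank_by_headroom(RYZEN_5000_CLOCKS):
    base, boost = RYZEN_5000_CLOCKS[model]
    print(f"{model}: {base} -> {boost} GHz (+{boost_headroom(base, boost):.1%})")
```

The 16-core 5950X shows the widest gap between base and boost, which fits the pattern of higher core counts running at a lower base clock while still reaching the highest single-core boost.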
Chiplet Design Advantages
There were many factors that made it possible for AMD to make such a substantial inter-generational leap. One of the biggest ones is the design of the chips itself, namely the “Chiplet Style” layout of the CPU dies. This design offers many key advantages when it comes to generational improvements:
- Scalability: Because the cores are arranged in separate chiplets on the substrate, AMD can fit more cores into a similar package without the risk of overheating. Intel’s competing design places all the cores very close to each other, which can cause serious thermal issues if not managed properly. AMD, on the other hand, has used the chiplet design to build 6-core, 8-core, 12-core, and even 16-core processors on the mainstream desktop platform, giving it a clear lead in core counts.
- Ease of Development: Another big advantage of this design is apparently its ease of development. During the development process of the Zen 3 architecture, AMD used the exact same base design as Zen 2 and then modified it. This meant that the design was already perfected to a certain degree, and it was easy for AMD to improve in the key areas that they were targeting.
- Concurrent 5nm Development: AMD also pointed out that its plans for future Ryzen CPUs built on a 5nm process were on track. This is because the chiplet design allows AMD to run multiple development streams concurrently. AMD was confident that its 5nm products would arrive as planned, just as the Zen 2 and Zen 3 architectures on the 7nm process did.
Zen 3 based Ryzen 5000 series processors promise to be the industry leaders not only in multithreaded workloads but also in gaming. For the first time since 2006, AMD has officially dethroned Intel in the race for the absolute best gaming performance (according to AMD’s claims). AMD has also claimed to have the highest single-threaded performance of any desktop chip with the Ryzen 9 5950X, followed closely by the Ryzen 9 5900X. Let’s have a look at the expected results from the architectural improvements brought by Zen 3.
Leadership in Gaming
With a whopping 19% IPC improvement, increased core clocks, and a redesigned core complex, AMD has made a gigantic leap in gaming performance this generation. While Zen 2 was reasonably competitive with Intel’s offerings, Zen 3 aims to beat Intel outright in gaming workloads. AMD claims that the Ryzen 9 5900X is on average about 26% faster than the Ryzen 9 3900X in games. That is a very large gain for a single generation.
Moreover, AMD has also claimed that the Ryzen 9 5900X is faster than the Core i9-10900K in gaming. This is pretty huge news for AMD fans and for PC enthusiasts in general. It means that, by AMD’s numbers, the top AMD CPUs now beat the top Intel CPUs in both gaming and multi-core applications. It doesn’t help Intel’s case that it is still stuck on its aging 14nm process and that its next-gen Rocket Lake processors are also rumored to be on 14nm. Meanwhile, AMD is firing on all cylinders with its 7nm offerings in Zen 2 and Zen 3, while concurrently working on its 5nm plans, which are apparently on track as well. This could have serious implications for Intel’s desktop CPU market share.
Improved Single-Threaded Performance
AMD has had better multicore performance for a while now, but that does not necessarily translate into better gaming performance, because modern games do not make effective use of all the cores. Many games have a dominant thread, often called the “world thread”, which is the most heavily utilized. The world thread is highly sensitive to latency and single-core performance. Thanks to AMD’s architectural redesign, latency has been substantially reduced, which greatly improves the performance of this dominant thread. This has enabled AMD to take the lead in gaming scenarios.
This also means that, by AMD’s figures, its single-threaded performance is now ahead of Intel’s. AMD showed off a single-core Cinebench score of 640 for the Ryzen 9 5950X, closely followed by a score of 631 for the Ryzen 9 5900X. These improvements are made possible by the core complex redesign, reduced latency, and higher boost clocks of the Zen 3 architecture.
Even Higher Multi-threaded Performance
Continuing its dominance in multi-threaded performance, AMD again showed off impressive numbers for its Zen 3 based Ryzen 5000 series processors. In particular, the 12-core Ryzen 9 5900X and 16-core Ryzen 9 5950X have unrivaled performance in core-heavy workloads. AMD also made some tweaks under the hood that allowed the 5950X to be the fastest desktop processor for CAD work as well, for the first time. AMD deemed it the best gaming processor and the best processor for content creation, and it is hard to argue with that statement. AMD claimed 12% more performance in rendering workloads over the 3950X. This makes the processor an absolute beast for those who want the very best that desktop computing has to offer.
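AMD’s rendering figure is easy to misread as a 12% shorter render. The short sketch below works through the arithmetic, assuming that “12% more performance” means 12% higher throughput on a fixed amount of work. The function names and the 60-minute job are hypothetical examples, not figures from AMD.

```python
def time_saved_fraction(throughput_gain):
    """Fraction of wall-clock time saved on a fixed job when throughput
    rises by `throughput_gain` (e.g. 0.12 for 12%)."""
    if throughput_gain <= -1.0:
        raise ValueError("throughput_gain must be greater than -1.0")
    return 1.0 - 1.0 / (1.0 + throughput_gain)


def new_duration(old_duration, throughput_gain):
    """Duration of the same job after the throughput increase."""
    return old_duration / (1.0 + throughput_gain)


RENDER_GAIN = 0.12       # AMD's claimed uplift of the 5950X over the 3950X
EXAMPLE_MINUTES = 60.0   # hypothetical render time on the older chip

print(f"Time saved: {time_saved_fraction(RENDER_GAIN):.1%}")
print(f"New render time: {new_duration(EXAMPLE_MINUTES, RENDER_GAIN):.1f} minutes")
```

Under that assumption, a 12% throughput gain shortens a fixed render by a little under 11%, so a one-hour job would finish in about 53.6 minutes.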
Alarm Bells for Intel?
There is no doubt that AMD has been improving its Ryzen lineup of processors at an almost blinding rate. It has delivered huge performance improvements from generation to generation, and Zen 3 promises to be its biggest jump yet. While the Ryzen 3000 series processors offered excellent value in terms of core counts and pricing, they were still behind Intel in one main workload: gaming. AMD had established a strong lead in almost all other aspects of the desktop market, be it rendering, encoding, video production, or streaming, but it needed to overtake Intel in gaming to offer the truly undisputed best-in-class processors.
Thanks to the amazing architectural design of the Ryzen processors, TSMC’s 7nm process, and the brilliant planning and execution by the AMD development team, they have finally done it with Zen 3. This launch must be ringing alarm bells at the Intel headquarters. Intel is a huge company and there is no way that they would not respond to this, but they have certainly lagged behind AMD when it comes to the speed of development. The main hurdle that Intel has to clear is its aged 14nm process which it has been using ever since Skylake.
Intel has had well-documented problems with its 10nm process and has therefore not yet been able to roll out desktop chips built on it. However, the tide may be turning, as Intel has successfully released its recent laptop CPUs codenamed “Tiger Lake”, which are built on the 10nm process. These laptop chips offer big improvements in both performance and efficiency over the last generation, and it is plausible that Intel is working to bring this process over to its desktop CPUs. Should Intel manage to get its 10nm process working on the desktop, the coming years are going to be very interesting for CPU performance enthusiasts.