What is a Teraflop? Understanding The New Graphics Buzzword

Trillion floating-point operations per second (TFLOPS) is a way to measure how powerful a computer system is. It essentially tells us how many calculations the system can do in one second using special math called floating-point arithmetic.

In a nutshell, it tells us how fast and capable a computer is at solving tough problems. Our guide covers everything about Teraflops to help you understand its intricacies and how this measure is being used in contemporary tech.

NVIDIA AD102 GPU can reach 100 teraflops | NVIDIA

Understanding FLOP

Before we delve deeper into the TFLOP, it’s important to understand what a “FLOP” is first.

Floating Point Operations Per Second (FLOPS) is a key measure of a computer’s performance. It reflects how quickly a CPU or GPU can perform calculations that involve floating-point numbers, that is, numbers that include a decimal point.

What’s the formula to calculate FLOPS?

To calculate FLOPS, you can use the following formula:

FLOPS = Number of Cores × Clock Speed (in GHz) × Instructions Per Cycle (IPC)

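In this formula, ‘number of cores’ refers to the count of shader cores in a GPU (CUDA Cores for NVIDIA, Stream Processors for AMD) or cores in a CPU. ‘Clock speed’ indicates the maximum speed at which these cores operate, and ‘IPC’ here stands for the number of floating-point operations each core can perform per cycle. Because the clock speed is entered in GHz, the result comes out in GFLOPS (billions of operations per second); divide by 1,000 to get TFLOPS.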
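To make the formula concrete, here is a minimal Python sketch. The shader counts and clock speeds below are commonly cited specifications for the two current-gen consoles, not figures taken from this guide. The function name and the default of 2 operations per cycle (a fused multiply-add counted as two floating-point operations) are assumptions made for illustration.

```python
def theoretical_tflops(cores: int, clock_ghz: float, flops_per_cycle: int = 2) -> float:
    """Theoretical peak throughput in TFLOPS.

    cores * clock_ghz * flops_per_cycle gives GFLOPS because the clock is
    expressed in GHz, so the product is divided by 1000 to get TFLOPS.
    """
    gflops = cores * clock_ghz * flops_per_cycle
    return gflops / 1000.0


# Assumed specs (not from the article): 2304 shaders at 2.23 GHz for the PS5,
# 3328 shaders at 1.825 GHz for the Xbox Series X.
ps5 = theoretical_tflops(cores=2304, clock_ghz=2.23)
series_x = theoretical_tflops(cores=3328, clock_ghz=1.825)

print(f"PS5: {ps5:.2f} TFLOPS")                 # PS5: 10.28 TFLOPS
print(f"Xbox Series X: {series_x:.2f} TFLOPS")  # Xbox Series X: 12.15 TFLOPS
```

Note that this is a peak figure: it assumes every core issues its maximum number of floating-point operations on every single cycle, which real workloads rarely manage.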

Higher FLOP count means Higher Performance?

In theory, yes. A higher FLOP count suggests that the hardware can perform more operations per second, making it more powerful on paper. In terms of gaming, a GPU with higher FLOPS will technically have better capabilities to deliver detailed and realistic graphics, enhancing the gaming experience. More on this later.

The fastest supercomputer in the world has over 1 million TFLOPS | Our World in Data

This brings us to the question: What is a Teraflop? One teraflop is one trillion floating-point operations per second.
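The prefixes scale the same way they do for bytes or watts. As a rough illustration, the hypothetical helper below (the name `format_flops` and the prefix table are ours, not part of any standard library) converts a raw operations-per-second count into the largest fitting unit.

```python
# SI prefixes, largest first: each step is a factor of 1000.
PREFIXES = [
    ("EFLOPS", 1e18),  # exa  = quintillion
    ("PFLOPS", 1e15),  # peta = quadrillion
    ("TFLOPS", 1e12),  # tera = trillion
    ("GFLOPS", 1e9),   # giga = billion
    ("MFLOPS", 1e6),   # mega = million
]


def format_flops(flops: float) -> str:
    """Render a raw FLOPS value using the largest prefix that fits."""
    for unit, scale in PREFIXES:
        if flops >= scale:
            return f"{flops / scale:.2f} {unit}"
    return f"{flops:.0f} FLOPS"


print(format_flops(1e12))     # 1.00 TFLOPS
print(format_flops(10.28e12)) # 10.28 TFLOPS
print(format_flops(5e9))      # 5.00 GFLOPS
```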

Modern computers and equipment have gotten so fast that their performance is measured in Teraflops (TFLOPS) instead of FLOPS. After breaching the one TFLOP barrier in 2008, when AMD debuted the first TFLOP-capable graphics card, the industry has never taken a step back.

Modern graphics cards and gaming consoles are significantly more powerful, providing several times as many TFLOPS as those antiquated GPUs. For instance, the fastest mainstream GPU in the world currently, the RTX 4090, offers roughly 83 TFLOPS of graphical prowess, a number that can be pushed to 100 TFLOPS with some serious overclocking.

ASUS ROG STRIX RTX 4090 comes with 84 TFLOPS out the box | ASUS

Compare this to the latest MacBook Pro from Apple equipped with the M2 Max, which has a GPU capable of pushing up to 13.6 TFLOPS. On the console side of things, the PS5 has a GPU offering 10.3 TFLOPS, while the Xbox Series X boasts a 12-teraflop GPU. We’ll explore later how these figures affect real-world performance.

Famously, it was Microsoft that first pushed this word into the mainstream in 2017 by saying its new console at the time, the Xbox One X, had 6 teraflops of graphical prowess, which left everyone wondering what that even meant. In comparison, the PS4 Pro only had 4.2 TFLOPS.

Impact of TFLOPS on Performance

When it comes to performance in modern computing systems, the role of TFLOPS cannot be overstated. A system with a higher TFLOPS value theoretically possesses superior computational speed and graphical prowess, all other factors being equal. However, it is essential to remember that raw performance isn’t solely determined by TFLOPS.

In gaming, where graphics are a key concern, a higher TFLOPS value typically corresponds to an ability to handle more detailed and realistic graphics, thereby enhancing the gaming experience. This comes down to the simple fact that a higher TFLOPS count allows more floating-point calculations per second, leading to quicker rendering of complex images and graphics.

In scientific computing and artificial intelligence, high TFLOPS values can result in faster data processing and computations. Tasks that involve large data sets, such as machine learning training or complex simulations, benefit greatly from a high TFLOPS value, which allows them to process more data in less time.

The new AMD Z1 processors have a ridiculously high TFLOPS count for being mobile chips | AMD

However, the relationship between TFLOPS and performance isn’t always straightforward. Real-world performance is also influenced by other factors such as the core speed, the number of processors, frame buffers, and the software optimization (drivers). Thus, while a device with a higher TFLOPS value is theoretically faster and more capable, its real-world performance may not always align with its TFLOPS count.

A device’s performance can be constrained by other hardware components that aren’t operating at the same efficiency or level as the GPU or CPU, resulting in a bottleneck. For instance, a system with a high TFLOPS GPU might still underperform if it’s paired with an inadequate amount of RAM, a slow CPU, or if the software isn’t fully optimized to utilize the available computational power.

Therefore, while TFLOPS is indeed a valuable metric in gauging a system’s theoretical computational power, it should be viewed in conjunction with other performance indicators to provide a more complete picture of a system’s potential real-world performance.

Don’t Get Confused: Terabyte vs. Teraflop

Most tech-savvy folk know enough about storage to understand what a terabyte is, so much so that they might confuse the term with teraflop since they both sound so similar. While both terms are used in computer technology, they refer to distinct aspects.

‘Terabyte’ measures storage capacity, representing how much data a device can store. On the other hand, ‘TFLOPS’ measures computational power, indicating the number of trillion floating-point operations a computer system can perform per second. They are not the same and not related to each other.

A 1 terabyte MicroSD Card | Mashable

TFLOPS in Gaming

In gaming, TFLOPS can be important because they relate to the graphical fidelity and performance that the system can achieve. Modern games require significant computational power to render detailed and realistic graphics, simulate physics, handle artificial intelligence, and perform other complex calculations.

But once again, it’s important to keep in mind that teraflops don’t exclusively define the performance of a system. Let’s look at a comparison between the two major current-gen consoles, the PlayStation 5 and Xbox Series X to see how this is evident in real life.

PS5 vs. Xbox Series X Teraflops Debate

As discussed earlier, the PS5 is rated at 10.28 teraflops while the Xbox Series X can reach up to 12 teraflops of GPU compute performance. On paper, this is a pretty noticeable difference and one would expect it to translate into real-world discrepancies between games, but quite the opposite is actually the case.
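For reference, the size of that on-paper gap can be worked out in a couple of lines. This is just arithmetic on the two ratings quoted above; the function name is our own.

```python
def relative_advantage(higher: float, lower: float) -> float:
    """Fractional lead of `higher` over `lower` (0.10 means 10% more)."""
    return (higher - lower) / lower


ps5_tflops = 10.28
series_x_tflops = 12.0

gap = relative_advantage(series_x_tflops, ps5_tflops)
print(f"Series X lead on paper: {gap:.1%}")  # Series X lead on paper: 16.7%
```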

The images attached below are from two different benchmarks conducted by Digital Foundry on Devil May Cry 5: SE, a cross-platform game. This was actually the first true next-gen title that took proper advantage of the high-end specs of both consoles.

The first image shows the game running in its “Normal” mode where the Series X has a slight lead in FPS. But in the second image showcasing the game running in “High Frame-Rate” mode, the PS5 takes the cake with a decent lead, despite having a weaker GPU with fewer TFLOPS.

Now, see the image below showing a more recent and much more demanding title, “Elden Ring”, running on both consoles. Once again, the PS5 runs the game ever so slightly better, with a more consistently higher FPS, showing that a GPU’s TFLOPS rating (whether 10.3 TFLOPS or any other figure) is not by itself indicative of real-world performance.

Elden Ring showing a difference of 2 FPS between the two consoles in Ray-Tracing Mode | Digital Foundry

It’s important to keep in mind that the different modes on consoles, and the fact that consoles upscale content almost all the time to give the illusion of a higher resolution, obviously play into the whole debate. Results may differ if the settings were changed around a bit, which further highlights how one compute metric on its own is not enough.

Limitations of TFLOPS

This brings us to how useful the teraflop term really is. Throughout the article we’ve clarified how TFLOPS do not represent actual performance, so now let’s delve deeper into the inherent issues that tell us why you shouldn’t look at teraflops as a definitive figure.

1. Limited Scope:

Teraflops focus solely on floating-point operations, neglecting other important computational aspects such as integer operations or vector processing. This limited scope may not accurately represent a system’s full computational capabilities.

2. Memory Performance:

Teraflops overlook memory performance factors like bandwidth and latency, which are crucial for efficient data exchange. A high teraflop rating does not guarantee optimal memory performance, potentially impacting overall system efficiency.

3. Component Considerations:

A GPU’s teraflop rating does not account for the speed and capabilities of other system components, such as the CPU, RAM, or storage. Unbalanced or bottlenecked components can hinder the system’s ability to fully leverage teraflop potential.

4. Thermals and Cooling:

Teraflops do not consider the impact of thermals and cooling on sustained performance. Inadequate cooling can lead to thermal throttling, limiting a system’s ability to reach its maximum performance potential.

5. Application-Specific Variations:

Different applications have distinct computational requirements. Teraflops provide a generalized metric, which may not align with specific application needs. Some applications may benefit significantly from higher teraflop counts, while others prioritize different computational aspects.

6. Real-World Performance:

Teraflops measure theoretical peak performance but may not reflect actual performance in real-world scenarios. Factors like software optimization, memory access patterns, system configuration, and user preferences can impact the achieved performance.
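The memory point in particular can be illustrated with a simplified "roofline" estimate, a well-known performance model that this guide does not otherwise cover. The sketch below assumes a hypothetical device with a 10 TFLOPS peak and 500 GB/s of memory bandwidth; both numbers and the function name are made up for illustration.

```python
def attainable_tflops(peak_tflops: float, bandwidth_gbs: float, flops_per_byte: float) -> float:
    """Simplified roofline estimate of achievable throughput.

    flops_per_byte is the workload's arithmetic intensity: how many
    floating-point operations it performs per byte moved from memory.
    GB/s * FLOP/byte gives GFLOPS, hence the division by 1000.
    """
    memory_bound_tflops = bandwidth_gbs * flops_per_byte / 1000.0
    return min(peak_tflops, memory_bound_tflops)


# Hypothetical device: 10 TFLOPS peak compute, 500 GB/s memory bandwidth.
PEAK_TFLOPS = 10.0
BANDWIDTH_GBS = 500.0

for intensity in (1, 10, 20, 50):
    achieved = attainable_tflops(PEAK_TFLOPS, BANDWIDTH_GBS, intensity)
    print(f"{intensity:>2} FLOP/byte -> {achieved:.1f} TFLOPS")
```

Under these assumptions, a workload doing only 1 operation per byte fetched tops out at 0.5 TFLOPS, a twentieth of the advertised peak, and the full 10 TFLOPS only becomes reachable once the workload performs at least 20 operations per byte.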

The Xbox Series X SoC powering the console | Microsoft

Other Applications of TFLOPS

TFLOPS is not just related to gaming; there are several other areas where this metric is used.

  • Most importantly, with the arrival of Artificial Intelligence (AI), there has been a sharp rise in the use of the TFLOPS metric. Deep learning, neural network training, and inference all require massive amounts of floating-point operations, so TFLOPS are used to measure the performance and efficiency of hardware accelerators like GPUs and specialized AI processors.
  • Computational Fluid Dynamics (CFD) simulations also depend heavily on floating-point throughput. These simulations involve dividing the fluid domain into small cells or elements and solving a set of mathematical equations to simulate the fluid behavior. The calculations can involve millions or even billions of equations that need to be solved iteratively, which is why TFLOPS are used to size the hardware for them.
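To get a feel for the scale of such workloads, consider a single dense matrix multiplication, the core operation in neural networks. Multiplying an m×k matrix by a k×n matrix takes roughly 2·m·k·n floating-point operations (one multiply and one add per term). The sketch below uses that count to estimate a best-case runtime at a given TFLOPS rating; the function names and the 8192-sized matrices are illustrative assumptions, and real runtimes will be longer.

```python
def matmul_flops(m: int, k: int, n: int) -> int:
    """Approximate FLOP count for an (m x k) @ (k x n) dense matrix multiply."""
    return 2 * m * k * n


def min_seconds(flops: float, tflops: float) -> float:
    """Best-case runtime if the hardware sustained its full peak rating."""
    return flops / (tflops * 1e12)


# Hypothetical workload: one multiplication of two 8192 x 8192 matrices.
work = matmul_flops(8192, 8192, 8192)
print(f"Operations needed: {work:.3e}")

for name, rating in (("10.28 TFLOPS GPU", 10.28), ("12 TFLOPS GPU", 12.0)):
    print(f"{name}: at least {min_seconds(work, rating) * 1000:.1f} ms")
```

A single multiplication of this size already needs about 1.1 trillion operations, which is why training runs that repeat such steps millions of times are budgeted in TFLOPS.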

Conclusion

Teraflops are a unit used to quantify computational power, relevant across various fields from gaming to Artificial Intelligence. Because they are such easily digestible figures, they have become more of a marketing term in today’s tech world.

While they represent a theoretical maximum, real-world performance relies on several other factors, such as software optimization and system architecture. Therefore, teraflops should not be viewed as the ultimate measure of a computing system’s capability.


ABOUT THE AUTHOR

Huzaifa Haroon


Born and raised around computers, Huzaifa is an avid gamer and a keyboard enthusiast. When he's not solving the mysteries of technology, you can find him scrutinizing writers, striving to inform the curious.