> Because there are 4 cores, the throughput increases linearly up to 4 threads. Then it increases slightly more up to 8 threads, thanks to hyper-threading that squeezes a bit more performance out of each core. Obviously there is no performance improvement beyond 8 threads, because all CPU resources are saturated at this point. [emphasis mine]

Can someone more knowledgeable than I please explain the obvious part of this? Why is the performance identical for 8-12 threads? What is it that is saturated at 8 threads even though there are 4 more threads hanging around?



Note that it is a 4-core CPU which employs simultaneous multithreading[1] (hyper-threading in this case) to achieve a higher throughput while executing independent instruction streams. This technique exposes multiple hardware threads to the OS to schedule its processes on (typically 2 per physical core), and the CPU takes care of executing them as concurrently as possible on a single core (typically by employing a few hardware enhancements within the core so that it can work with multiple instruction streams in a single clock cycle).

At first, the throughput increases linearly up to 4 threads because each of the 4 OS threads can have a physical core to itself, making the greatest possible use of the available hardware resources. Beyond 4 OS threads, simultaneous multithreading comes into play: up to 8 OS threads get scheduled on the 8 "simultaneously multithreaded" hardware threads, which still increases throughput, but not as drastically, because two hardware threads now share each core's execution resources (see how the author uses the word "slightly"). Beyond 8 OS threads, the throughput will not increase because you are out of hardware threads that can execute your instruction streams independently. The OS can spawn an arbitrary number of threads beyond 8, but they will take turns executing on your 8 available hardware threads, so there is no gain in net throughput. You are limited by the hardware.
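The three regimes described above can be put into a toy model. This is only a sketch: the function name and the `smt_gain` value of 0.25 (the extra throughput from a second hardware thread on an already busy core) are assumptions made up for illustration, not measurements from the article.

```python
def relative_throughput(os_threads, cores=4, smt_ways=2, smt_gain=0.25):
    """Idealized throughput, in units of 'one fully used core'.

    smt_gain is a hypothetical figure for how much an SMT sibling adds
    on top of a core that already runs one thread.
    """
    hw_threads = cores * smt_ways
    # OS threads beyond the hardware thread count only take turns.
    running = min(os_threads, hw_threads)
    # The first thread on each core gets the core to itself.
    primary = min(running, cores)
    # Any further running threads are SMT siblings sharing a core.
    siblings = running - primary
    return primary + smt_gain * siblings


for n in (1, 2, 4, 6, 8, 12):
    print(n, relative_throughput(n))
```

With these assumed numbers the curve rises linearly to 4 threads, rises more slowly to 8, and is flat afterwards, which is the shape the article describes.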

[1] https://en.wikipedia.org/wiki/Simultaneous_multithreading


The simplest explanation I've heard is that hyperthreading is like using two hands to keep the mouth always full (whereas with only one hand it sometimes remains idle). Once you can keep it always full, there's no use adding more hands.


POWER has 8-way SMT. More than 2-way SMT is definitely useful in some cases, but it does not exist on mainstream x86 chips (Xeon Phi, with 4-way SMT, is the exception), so on a CPU like this one it doesn't matter if you try to use more threads in software. The hardware can only handle 2 per core.


The CPUs can only run two threads per core at a time. Therefore, it doesn't matter if you make more software threads; only 8 will actually run at a time.

Not all CPU resources are actually saturated, as there will still be idle execution units, but since the CPU can't dispatch another thread to make use of those units, there's nothing you can do about that.
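If you want to check the limit on your own machine, here is a minimal sketch in Python. `os.cpu_count()` reports logical CPUs (hardware threads), so it would give 8 on the 4-core hyper-threaded CPU discussed here; the two helper names are hypothetical, chosen just for this example.

```python
import os


def hardware_thread_count():
    # os.cpu_count() returns the number of logical CPUs, or None if it
    # cannot be determined; fall back to 1 in that case.
    return os.cpu_count() or 1


def threads_running_at_once(requested):
    # However many software threads are created, at most this many
    # can be executing at the same instant.
    return min(requested, hardware_thread_count())


print(hardware_thread_count())
print(threads_running_at_once(12))
```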



