Single-threaded programme apparently using multiple cores

Question summary: all four cores used when running a single threaded programme. Why?


Details: I have written a non-parallelised programme in Xcode (C++). I was in the process of parallelising it, and wanted to see whether what I was doing was actually resulting in more cores being used. To that end I used Instruments to look at the core usage. To my surprise, while my application is single threaded, all four cores were being utilised.


To test whether it changed the performance, I dialled down the number of cores available to 1 (you can do it in Instruments, preferences) and the speed wasn't reduced at all. So (as I knew) the programme isn't parallelised in any way.


I can't find any information on what it means to use multiple cores to perform single threaded tasks. Am I reading the Instruments output wrong? Or is the single-threaded process being shunted between different cores for some reason (like changing lanes on a road instead of driving in two lanes at once - i.e. actual parallelisation)?


Thanks for any insight anyone can give on this.


EDIT with MWE (apologies for not doing this initially). The following is C++ code that finds primes under 500,000, compiled in Xcode.


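#include <ctime>     // clock(), clock_t, CLOCKS_PER_SEC
#include <iostream>

int main(int argc, const char * argv[]) {
    clock_t start, end;
    double runTime;
    start = clock();
    int i, num = 1, primes = 0;
    int num_max = 500000;

    // Trial division: num is prime if its smallest divisor >= 2 is num itself.
    while (num <= num_max) {
        i = 2;
        while (i <= num) {
            if(num % i == 0)
                break;
            i++;
        }
        if (i == num){
            primes++;
            std::cout << "Prime: " << num << std::endl;
        }
        num++;
    }

    // Note: clock() reports processor time used by the process, not elapsed wall-clock time.
    end = clock();
    runTime = (end - start) / (double) CLOCKS_PER_SEC;
    std::cout << "This machine calculated all " << primes << " under " << num_max << " in " << runTime << " seconds." << std::endl;

    return 0;
}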

This runs in 36s or thereabouts on my machine, as shown by the final output and my phone's stopwatch. When I profile it (using Instruments launched from within Xcode) it gives a run-time of around 28s. The following image shows the core usage.


[Image: Instruments showing core usage with all 4 cores (with hyper-threading)]


Now I reduce the number of available cores to 1. Re-running from within the profiler (pressing the record button), it reports a run-time of 29s; a picture is shown below.


[Image: Instruments output with only 1 core available]


That would accord with my theory that more cores doesn't improve performance for a single thread programme! Unfortunately, when I actually time the programme with my phone, the above took about 1 minute 30s, so there is a meaningful performance gain from having all cores switched on.


One thing that is really puzzling me, is that, if you leave the number of cores at 1, go back to Xcode and run the program, it again says it takes about 33s, but my phone says it takes 1 minute 50s. So changing the cores is doing something to the internal clock (perhaps).


Hopefully that describes the problem fully. I'm running on a 2015 15 inch MBP, with 2.2GHz i7 quad core processor. Xcode 7.3.1


2 answers



First, I should say that your question lacks a lot of the information needed for an accurate diagnosis. Anyway, I'll try to explain what is, IMHO, the most common reason, supposing your application doesn't use a third-party component that runs multi-threaded.


I think it could be a scheduler effect. Let me explain what I mean.


Each core of the processor takes a process in the system and executes it for a "short" amount of time. This is the most common approach in desktop operating systems.


Your process is executed on a single core for this amount of time and then stopped in order to allow other processes to continue. When your process is resumed, it could be executed on another core (always one core at a time, but possibly a different one). So a monitoring tool with a coarse time resolution could show all cores as being in use, even though your process only ever runs on one core at any given moment.


To verify whether that is the cause, I suggest you look at the CPU % used while your application is running. For a single-threaded application, the CPU usage should be about 1/(number of cores), in your case 25%.




If it's a release build, your compiler may be vectorising or parallelising your code. Also, libraries you link against, the standard library for example, may be threaded or vectorised.



