Why is my AsOrdered PLINQ query faster than my unordered one?


I wrote some basic sample code to familiarise myself with PLINQ.

I came across something weird. I don't know if it's an error in my code or an error in my understanding of PLINQ.

The MSDN documentation states that adding AsOrdered() preserves the order of the source sequence in the results, at a possible cost in performance.
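To make the ordering claim concrete, here is a minimal sketch (not the asker's code). The class `OrderingDemo` and both method names are made up for illustration. It filters even numbers with and without AsOrdered(): the ordered query should return them in source order, while the unordered one returns the same elements in no guaranteed order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderingDemo
{
    // Hypothetical helper: unordered parallel filter.
    // The result contains every even number, but in no guaranteed order.
    public static List<int> UnorderedEvens(int count)
    {
        return Enumerable.Range(1, count)
            .AsParallel()
            .Where(n => n % 2 == 0)
            .ToList();
    }

    // Hypothetical helper: same filter with AsOrdered().
    // The result keeps the order of the source range.
    public static List<int> OrderedEvens(int count)
    {
        return Enumerable.Range(1, count)
            .AsParallel()
            .AsOrdered()
            .Where(n => n % 2 == 0)
            .ToList();
    }
}
```

Comparing `OrderedEvens(n)` against a plain sequential LINQ filter with `SequenceEqual` is a quick way to check the ordering guarantee. The unordered result only matches after sorting.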

I wrote some unit tests and confirmed the effect on the order of the result set, as described in the documentation. On performance, however, I have seen the opposite effect.

Here are both of my methods:

public IEnumerable<int> ParallelCalculatePrimesUpTo(int maxValue)
{
    return from number in Enumerable.Range(1, maxValue).AsParallel()
            where IsPrime(number)
            select number;
}

public IEnumerable<int> OrderedParallelCalculatePrimesUpTo(int maxValue)
{
    return from number in Enumerable.Range(1, maxValue).AsParallel().AsOrdered()
            where IsPrime(number)
            select number;
}

And my very simple benchmarks:

    [TestMethod]
    public void SimplisticBenchmark6()
    {
        var primeNumberCalculator = new PrimeNumberCalculator();

        var startTime = DateTime.Now;

        primeNumberCalculator.ParallelCalculatePrimesUpTo(10000000).ToList();

        var totalTime = DateTime.Now - startTime;

        Console.WriteLine(totalTime);
    }

    [TestMethod]
    public void SimplisticBenchmark7()
    {
        var primeNumberCalculator = new PrimeNumberCalculator();

        var startTime = DateTime.Now;

        primeNumberCalculator.OrderedParallelCalculatePrimesUpTo(10000000).ToList();

        var totalTime = DateTime.Now - startTime;

        Console.WriteLine(totalTime);
    }

No matter how often I run this test, the ordered version beats the unordered one. On my quad-core computer the ordered one is about 4 seconds quicker: about 18 seconds for the ordered one versus 22 seconds for the unordered one. I have run the tests dozens of times over the course of two days (with reboots between those days).

If I lower the number from 10 000 000 to 6 000 000, the difference is still there but less noticeable, and if I lower it to 3 000 000, both run at about the same speed.

I tried running the tests in both orders of execution, and the results are the same.

Here is the IsPrime method that gets called in the PLINQ query:

// uses inefficient trial division algorithm
private bool IsPrime(int number)
{
    if (number == 1)
        return false;

    for (int divisor = 2; divisor <= Math.Sqrt(number); divisor++)
    {
        if (number % divisor == 0)
            return false;
    }

    return true;
}

What explains this?

2 Solutions

#1


3  

Do You Always Run The Tests In The Same Order?

I've recreated your results on my machine and found that, at first, the 'Ordered' results were always faster. I used slightly modified code for benchmarking:

static void Main(string[] args)
{
    const int size = 9000000;
    BenchIt("Parallel", ParallelCalculatePrimesUpTo, size);
    BenchIt("Ordered ", OrderedParallelCalculatePrimesUpTo, size);
    Console.ReadKey();
}

public static void BenchIt(string desc, Func<int, IEnumerable<int>> myFunc, int size)
{
    var sw = new Stopwatch();            
    sw.Restart();
    myFunc.Invoke(size).ToList();
    sw.Stop();
    Console.WriteLine("{0} {1}",desc, sw.Elapsed);
}

My results initially showed that you were correct: the ordered method was faster. However, if I SWAPPED the order of the calls, I found that the non-ordered method was faster. In other words, whichever one went second was faster, presumably because of the thread-pool management that the Task Parallel Library is doing.
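One way to take call order out of the measurement is to run each query once untimed, then alternate which one goes first and keep the best time for each. The following is a rough sketch under those assumptions. `FairBench`, `TimeOnce` and `Compare` are names I made up for illustration, and this is not a substitute for a proper benchmarking harness.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class FairBench
{
    // Times a single full enumeration of the query.
    public static TimeSpan TimeOnce(Func<int, IEnumerable<int>> query, int size)
    {
        var sw = Stopwatch.StartNew();
        query(size).ToList();
        sw.Stop();
        return sw.Elapsed;
    }

    // Returns { best time of a, best time of b } over the given number of rounds.
    public static TimeSpan[] Compare(
        Func<int, IEnumerable<int>> a,
        Func<int, IEnumerable<int>> b,
        int size,
        int rounds)
    {
        // Untimed warm-up so thread creation and JIT cost hit neither measurement.
        TimeOnce(a, size);
        TimeOnce(b, size);

        var best = new[] { TimeSpan.MaxValue, TimeSpan.MaxValue };
        for (int round = 0; round < rounds; round++)
        {
            if (round % 2 == 0)
            {
                // Even rounds: a runs first.
                best[0] = Min(best[0], TimeOnce(a, size));
                best[1] = Min(best[1], TimeOnce(b, size));
            }
            else
            {
                // Odd rounds: b runs first.
                best[1] = Min(best[1], TimeOnce(b, size));
                best[0] = Min(best[0], TimeOnce(a, size));
            }
        }
        return best;
    }

    private static TimeSpan Min(TimeSpan x, TimeSpan y)
    {
        return x < y ? x : y;
    }
}
```

With the methods from the question, a call might look like `FairBench.Compare(ParallelCalculatePrimesUpTo, OrderedParallelCalculatePrimesUpTo, 9000000, 5)`. If the gap between the two shrinks or flips here, call order was probably the main factor.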

But the differences between the two on my machine were very small, nowhere near the difference you saw.

What Does Your Hardware Look Like?

PLINQ does some guessing about how to execute the query fastest. I don't know if this will directly help you in this case, but you might want to set a breakpoint in the middle of IsPrime, stop on it after a few hundred iterations, and examine the Threads window.

How many threads do you have when executing ParallelCalculatePrimesUpTo versus OrderedParallelCalculatePrimesUpTo? I'm reaching here, but it's possible that PLINQ is deciding on different values on your machine, which creates the unexpected times you are seeing. On my machine I get eight threads each time, and my times are NEARLY identical; whichever one is called first is slower because of the creation of those threads. But you aren't guaranteed a particular number of threads (you can set the maximum, but you can't force them to be used).
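If stepping through with the debugger is awkward, the thread count can also be estimated in code. This is only a sketch: `ThreadProbe` and `CountThreads` are hypothetical names, and the number it reports is the count of distinct managed thread IDs that ran the predicate, which may not match exactly what the Threads window shows.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

public static class ThreadProbe
{
    // Runs the filter over the given parallel query and returns how many
    // distinct managed threads executed the predicate at least once.
    public static int CountThreads(ParallelQuery<int> source, Func<int, bool> predicate)
    {
        var seen = new ConcurrentDictionary<int, bool>();

        source.Where(n =>
        {
            // Record the thread that is evaluating this element.
            seen.TryAdd(Environment.CurrentManagedThreadId, true);
            return predicate(n);
        }).ToList(); // force full execution of the query

        return seen.Count;
    }
}
```

For example, `ThreadProbe.CountThreads(Enumerable.Range(1, 9000000).AsParallel().AsOrdered(), IsPrime)` could be compared with the same call without AsOrdered(). The maximum can be capped with `WithDegreeOfParallelism(n)` on the query, but as noted above that is only an upper bound.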

#2


1  

Can you tell us what the CPU utilization is across the 4 different cores? It's possible that AsOrdered() is forcing more sequential calls to happen on the same core. With improved locality, silicon-level caching and branch prediction may be working in your favor.

Another possibility is that there's some optimization in the .NET Framework for the case of monotonically increasing integers (Enumerable.Range) when AsOrdered() is used. I'm not sure how that would work, but it's possible.

An interesting test for comparison would be to generate a third set of numbers, in random order (obviously, you'd have to randomize them ahead of time and then work off three arrays). Then see if that has anything to do with it?
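A sketch of how the randomized input could be prepared ahead of time, so that shuffling is not part of what gets timed. `ShuffledInput.Create` is a hypothetical helper. It uses a Fisher-Yates shuffle with a fixed seed so that every run sees the same sequence.

```csharp
using System;
using System.Linq;

public static class ShuffledInput
{
    // Returns the numbers 1..count in a pseudo-random order.
    // The same seed always produces the same order.
    public static int[] Create(int count, int seed)
    {
        int[] values = Enumerable.Range(1, count).ToArray();
        var random = new Random(seed);

        // Fisher-Yates: walk backwards, swapping each slot with a random earlier one.
        for (int i = values.Length - 1; i > 0; i--)
        {
            int j = random.Next(i + 1); // 0 <= j <= i
            int tmp = values[i];
            values[i] = values[j];
            values[j] = tmp;
        }
        return values;
    }
}
```

The queries would then start from `numbers.AsParallel()` and `numbers.AsParallel().AsOrdered()` instead of Enumerable.Range. One caveat: as far as I know, PLINQ may partition an indexable source such as an array differently from a plain enumerable, so it is worth also timing a sorted array to keep the comparison fair.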
