Time Complexity in Python - Big O notation

The output of my program looks like this:


[[1000, 1500, 2000, 2500, 3000, 3500, 4000],
[437, 680, 917, 1115, 1476, 1668, 1912]]

It's a two-dimensional array created with the NumPy library. The first row holds the values of N that I pass to the function, and the second row holds the measured running time of that function in microseconds (e.g. N=1000 gives time=437, N=1500 gives time=680).

Is there any simple way to determine the complexity of this function? I know I can draw a plot and read it off, but my app needs to give me just the answer ("Your function is (probably, of course) O(n)", or O(n log n), or O(n^2)).

O(n) seems pretty obvious: I just need to compute N/t for every element of the array and check whether it is constant. But I have no idea how to check the other two.

1 solution



One can do all sorts of fancy curve fitting and model evaluation with sklearn. But for a simple approach, measuring the variance of the logarithm of the thing that's expected to be constant will do. That is,


  1. Take a ratio such as T/N, T/(N*log(N)), or T/N**2. For the right model, this ratio should be roughly constant.

  2. Take the logarithm of that ratio. A constant factor (such as the unit of time) then becomes an additive offset, which does not affect the variance.

  3. Compute the variance across the data points with np.var. The model with the smallest variance wins.

For your example:


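import numpy as np

# Input sizes and measured running times (microseconds) from the question
n = np.array([1000, 1500, 2000, 2500, 3000, 3500, 4000])
t = np.array([437, 680, 917, 1115, 1476, 1668, 1912])

# Variance of log(t / f(n)) for each candidate f; smaller means a flatter ratio
print(np.var(np.log(t/n)))              # linear:    0.001545...
print(np.var(np.log(t/(n*np.log(n)))))  # n log n:   0.001348...
print(np.var(np.log(t/(n**2))))         # quadratic: 0.18049...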

So it's definitely not quadratic. N*log(N) is a slightly better fit than linear, though the two variances are close. (Trying a few other things, it seems N*sqrt(log(N)) is best.)
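Since the app has to print a single answer, the comparison above can be wrapped in a small helper. This is only a minimal sketch: the function name guess_complexity and the labels in the candidates dict are made up for illustration, and the result is a best guess among the listed candidates, not a proof of the true complexity.

```python
import numpy as np

def guess_complexity(n, t):
    """Return (best_label, scores) for a few candidate growth models.

    Hypothetical helper: scores each candidate f(n) by the variance of
    log(t / f(n)) and picks the smallest, as described in the steps above.
    """
    n = np.asarray(n, dtype=float)
    t = np.asarray(t, dtype=float)
    # Candidate growth functions; extend this dict with other models if needed.
    candidates = {
        "O(n)": n,
        "O(n log n)": n * np.log(n),
        "O(n^2)": n ** 2,
    }
    # A flatter ratio t / f(n) gives a smaller variance of its logarithm.
    scores = {label: float(np.var(np.log(t / f))) for label, f in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

n = [1000, 1500, 2000, 2500, 3000, 3500, 4000]
t = [437, 680, 917, 1115, 1476, 1668, 1912]
best, scores = guess_complexity(n, t)
print(f"Your function is (probably) {best}")
```

With only seven measurements, the gap between the linear and n log n scores is small, so it may be worth timing more and larger values of N before trusting the label.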

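As a lighter form of the curve fitting mentioned at the start of the answer, one can also fit a straight line to log(t) against log(n). This is not part of the original answer, just a sketch under the assumption that t grows roughly like n**p: the slope of the line then estimates the exponent p.

```python
import numpy as np

n = np.array([1000, 1500, 2000, 2500, 3000, 3500, 4000], dtype=float)
t = np.array([437, 680, 917, 1115, 1476, 1668, 1912], dtype=float)

# Least-squares fit of log(t) = p * log(n) + c.
# np.polyfit returns coefficients from highest degree down: [slope, intercept].
p, c = np.polyfit(np.log(n), np.log(t), 1)
print(f"estimated exponent: {p:.2f}")
```

An exponent close to 1 is consistent with linear or n log n growth and rules out quadratic (which would give a slope near 2), but the slope alone cannot separate O(n) from O(n log n); the variance comparison is the better tool for that.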


