### How to calculate moving average using NumPy?

There seems to be no function in NumPy/SciPy that simply calculates the moving average, which leads to convoluted solutions.

My question is two-fold:

• What's the easiest way to (correctly) implement a moving average with numpy?
• Since this seems non-trivial and error prone, is there a good reason not to have the batteries included in this case?

## 2 solutions

### #1

105

If you just want a straightforward non-weighted moving average, you can easily implement it with `np.cumsum`, which may be faster than FFT-based methods:

EDIT: Corrected an off-by-one indexing error in the code, spotted by Bean.

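``````
import numpy as np

def moving_average(a, n=3):
    # Running total of the input, accumulated as float
    ret = np.cumsum(a, dtype=float)
    # Subtract the running total from n positions earlier to get window sums
    ret[n:] = ret[n:] - ret[:-n]
    # The first n - 1 entries cover incomplete windows, so drop them
    return ret[n - 1:] / n

>>> a = np.arange(20)
>>> moving_average(a)
array([  1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.,  11.,
        12.,  13.,  14.,  15.,  16.,  17.,  18.])
>>> moving_average(a, n=4)
array([  1.5,   2.5,   3.5,   4.5,   5.5,   6.5,   7.5,   8.5,   9.5,
        10.5,  11.5,  12.5,  13.5,  14.5,  15.5,  16.5,  17.5])
``````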

So I guess the answer is: it is really easy to implement, and maybe numpy is already a little bloated with specialized functionality.
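As a sanity check on the `np.cumsum` approach, here is a minimal sketch that compares it against a convolution with a flat kernel, which is another common way to write an unweighted moving average. The function names `moving_average_cumsum` and `moving_average_convolve` are my own labels for illustration, not anything NumPy provides.

```python
import numpy as np


def moving_average_cumsum(a, n=3):
    """Unweighted moving average via the cumulative-sum trick shown above."""
    ret = np.cumsum(a, dtype=float)
    ret[n:] = ret[n:] - ret[:-n]
    return ret[n - 1:] / n


def moving_average_convolve(a, n=3):
    """Same average, computed by convolving with n equal weights of 1/n."""
    kernel = np.ones(n) / n
    # mode="valid" keeps only positions where the window fully overlaps the data
    return np.convolve(a, kernel, mode="valid")


a = np.arange(20)
cumsum_result = moving_average_cumsum(a, n=4)
convolve_result = moving_average_convolve(a, n=4)

# Both approaches should agree up to floating-point rounding
print(np.allclose(cumsum_result, convolve_result))
```

Both versions return `len(a) - n + 1` values, one per complete window. For long float inputs the cumulative sum can accumulate rounding error, so a comparison like this is worth running on data that resembles your own.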

### #2

62

NumPy's lack of a particular domain-specific function is perhaps due to the Core Team's discipline and fidelity to NumPy's prime directive: provide an N-dimensional array type, as well as functions for creating and indexing those arrays. Like many foundational objectives, this one is not small, and NumPy does it brilliantly.

The (much) larger SciPy contains a much larger collection of domain-specific libraries (called subpackages by SciPy devs), for instance numerical optimization (optimize), signal processing (signal), and integral calculus (integrate).

My guess is that the function you are after is in at least one of the SciPy subpackages (scipy.signal perhaps); however, I would look first in the collection of SciPy scikits, identify the relevant scikit(s), and look for the function of interest there.

Scikits are independently developed packages based on NumPy/SciPy and directed to a particular technical discipline (e.g., scikits-image, scikits-learn, etc.). Several of these (in particular, the awesome OpenOpt for numerical optimization) were highly regarded, mature projects long before choosing to reside under the relatively new scikits rubric. The Scikits homepage linked to above lists about 30 such scikits, though at least several of those are no longer under active development.

Following this advice would lead you to scikits-timeseries; however, that package is no longer under active development. In effect, Pandas has become, AFAIK, the de facto NumPy-based time series library.

Pandas has several functions that can be used to calculate a moving average; the simplest of these is probably rolling_mean, which you use like so:

``````
>>> # the recommended syntax to import pandas
>>> import pandas as PD
>>> import numpy as NP

>>> # prepare some fake data:
>>> # the date-time indices:
>>> t = PD.date_range('1/1/2010', '12/31/2012', freq='D')

>>> # the data:
>>> x = NP.arange(0, t.shape[0])

>>> # combine the data & index into a Pandas 'Series' object
>>> D = PD.Series(x, t)
``````

Now, just call the function rolling_mean passing in the Series object and a window size, which in my example below is 10 days.

``````
>>> d_mva = PD.rolling_mean(D, 10)

>>> # d_mva is the same size as the original Series
>>> d_mva.shape
(1096,)

>>> # though obviously the first w - 1 values are NaN, where w is the window size
>>> d_mva[:3]
2010-01-01         NaN
2010-01-02         NaN
2010-01-03         NaN
``````

Verify that it worked, e.g., compare values 10 - 15 in the original Series with the new Series smoothed with the rolling mean:

``````
>>> D[10:15]
2010-01-11    2.041076
2010-01-12    2.041076
2010-01-13    2.720585
2010-01-14    2.720585
2010-01-15    3.656987
Freq: D

>>> d_mva[10:15]
2010-01-11    3.131125
2010-01-12    3.035232
2010-01-13    2.923144
2010-01-14    2.811055
2010-01-15    2.785824
Freq: D
``````
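One caveat for anyone trying this today: as far as I know, the top-level `rolling_mean` function was deprecated and later removed from pandas, and the same computation is now written with the `Series.rolling` method. A minimal sketch under that assumption, rebuilding the daily index and `arange` data from the setup above (the variable names `series` and `rolling_mean_10` are mine):

```python
import numpy as np
import pandas as pd

# Fake data as in the setup above: one integer per day, 2010 through 2012
t = pd.date_range("2010-01-01", "2012-12-31", freq="D")
series = pd.Series(np.arange(len(t)), index=t)

# Method-based rolling API; window=10 means a window of 10 observations
rolling_mean_10 = series.rolling(window=10).mean()

# The result keeps the original length; entries before the first
# complete window are NaN
print(rolling_mean_10.shape)
print(rolling_mean_10.iloc[8:11])
```

With a window of 10, the first complete window ends at position 9, so positions 0 through 8 are NaN and position 9 holds the mean of the first ten values.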

The function rolling_mean, along with a dozen or so other functions, is informally grouped in the Pandas documentation under the rubric moving window functions; a second, related group of functions in Pandas is referred to as exponentially-weighted functions (e.g., ewma, which calculates an exponentially weighted moving average). The fact that this second group is not included in the first (moving window functions) is perhaps because the exponentially-weighted transforms don't rely on a fixed-length window.
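To make the "no fixed-length window" point concrete, here is a rough sketch of one common recursive definition of an exponentially weighted moving average, written in plain NumPy. The helper `ewma` and its `alpha` parameter are hypothetical names for illustration; this is not the pandas implementation, and pandas' own defaults may weight the early values differently.

```python
import numpy as np


def ewma(x, alpha=0.5):
    """Exponentially weighted moving average, recursive form (illustrative helper).

    Each output blends the new value with the previous output, so every
    earlier value contributes with a geometrically shrinking weight.
    """
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    if x.size == 0:
        return out
    out[0] = x[0]  # seed the recursion with the first observation
    for i in range(1, len(x)):
        out[i] = alpha * x[i] + (1 - alpha) * out[i - 1]
    return out


smoothed = ewma([0.0, 1.0, 1.0, 1.0], alpha=0.5)
print(smoothed)
```

Unlike the rolling mean, there is no window size here: the output has the same length as the input with no leading NaN values, and `alpha` alone controls how quickly older observations fade.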