How to emulate a 2-sample t-test in scipy


I'm trying to emulate MS Excel's t-test function (TTEST) in Python. I need to do this because I have to automate some calculations that were previously done in Excel. Here is my test program:

import scipy.stats

a = [5, 0.9, -0.4, -0.9, 0.5, 0.8, 0.2, 0.2, 0, -0.8]
b = [1.1, 0.9, -0.5, -0.7, 0.6, 0.7, 0.3, 0.1, -0.1, -0.7]

# Independent two-sample t-test assuming equal variances
print(scipy.stats.ttest_ind(a, b, equal_var=True))

This is the result:

(array(0.6661542796363409), 0.51376033318001801)

However, Excel gives this value for the same input: 0.35844407

I noticed that they used the tails=2 parameter (see http://office.microsoft.com/en-us/excel-help/ttest-HP005209325.aspx ). Unfortunately, I have no idea how to calculate a two-tailed t-test with scipy. (In fact, I don't know what it is.)

Another very strange thing is that in scipy, I get a slightly different result when I change the order of the samples. E.g. if I move -0.7 to the head of b, then I get 0.51376033318001824 instead of 0.51376033318001801. Not a big difference, but still.

For Excel, it is a whole different story: it looks like the two-tailed t-test gives a significantly different result when the order of the samples is different.

The question is: how can I emulate Excel's version of the two-tailed t-test in scipy?

1 Answer

#1


5  

It looks like Excel is computing ttest_rel:

In [15]: import scipy.stats as stats

In [20]: stats.ttest_rel(a, b)
Out[20]: (array(0.9677712267394081), 0.35844406902161985)
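
To make the "two-tailed" part less mysterious, here is a minimal sketch that recomputes the same number by hand. It assumes the usual textbook definition of the paired t-test (a one-sample t-test on the pairwise differences a[i] - b[i]); the helper name paired_t_test is made up for illustration and is not part of scipy. The two-tailed p-value is the probability of a t statistic at least this far from zero in either direction, hence the factor of 2.

```python
import math
from scipy import stats

a = [5, 0.9, -0.4, -0.9, 0.5, 0.8, 0.2, 0.2, 0, -0.8]
b = [1.1, 0.9, -0.5, -0.7, 0.6, 0.7, 0.3, 0.1, -0.1, -0.7]


def paired_t_test(x, y):
    """Hypothetical helper: two-tailed paired t-test from pairwise differences."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_d = sum((di - mean_d) ** 2 for di in d) / (n - 1)
    t_stat = mean_d / math.sqrt(var_d / n)
    # Two-tailed: count both tails of the t distribution with n - 1 degrees of freedom.
    p_value = 2 * stats.t.sf(abs(t_stat), n - 1)
    return t_stat, p_value


t_manual, p_manual = paired_t_test(a, b)
t_scipy, p_scipy = stats.ttest_rel(a, b)

print(t_manual, p_manual)              # hand-computed statistic and p-value
print(float(t_scipy), float(p_scipy))  # scipy's paired test, for comparison
```

The p-values scipy returns from ttest_rel and ttest_ind are already two-tailed by default, so no extra step is needed to match Excel's tails=2.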

Use stats.ttest_rel when a and b are related. The docs say:

Examples for the use [of ttest_rel] are scores of the same set of student in different exams, or repeated sampling from the same units.
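
This pairing is also a plausible explanation for the order sensitivity described in the question. The sketch below assumes Excel really is running a paired test: ttest_rel matches a[i] with b[i], so moving one value in b changes every pair after it and therefore the result. ttest_ind ignores the order, and any difference there is only floating-point rounding from summing the same numbers in a different sequence. The name b_moved is introduced here for illustration.

```python
from scipy import stats

a = [5, 0.9, -0.4, -0.9, 0.5, 0.8, 0.2, 0.2, 0, -0.8]
b = [1.1, 0.9, -0.5, -0.7, 0.6, 0.7, 0.3, 0.1, -0.1, -0.7]
# Same values as b, with the first -0.7 moved to the front.
b_moved = [-0.7, 1.1, 0.9, -0.5, 0.6, 0.7, 0.3, 0.1, -0.1, -0.7]

# Independent test: order does not matter, apart from rounding noise.
p_ind = stats.ttest_ind(a, b, equal_var=True).pvalue
p_ind_moved = stats.ttest_ind(a, b_moved, equal_var=True).pvalue

# Paired test: reordering b changes which observations are paired.
p_rel = stats.ttest_rel(a, b).pvalue
p_rel_moved = stats.ttest_rel(a, b_moved).pvalue

print("independent:", p_ind, p_ind_moved)
print("paired:     ", p_rel, p_rel_moved)
```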

Use stats.ttest_ind when a and b are independent.

We can use [ttest_ind], if we observe two independent samples from the same or different population, e.g. exam scores of boys and girls or of two ethnic groups.
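
For automating the spreadsheet calculation, the choice between the two functions can be wrapped in one place. The following is only a sketch: excel_ttest is a hypothetical helper, not a scipy function, and it assumes Excel's documented meaning of the TTEST arguments (type 1 = paired, 2 = two-sample equal variance, 3 = two-sample unequal variance; tails = 1 or 2). The one-tailed value is taken as half of the two-tailed one, which is how Excel's documentation describes the relationship; it has not been checked against Excel for every input.

```python
from scipy import stats


def excel_ttest(array1, array2, tails, test_type):
    """Hypothetical helper approximating Excel's TTEST(array1, array2, tails, type)."""
    if test_type == 1:
        # Paired test: array1[i] is matched with array2[i].
        result = stats.ttest_rel(array1, array2)
    elif test_type == 2:
        # Independent samples, equal variances assumed.
        result = stats.ttest_ind(array1, array2, equal_var=True)
    elif test_type == 3:
        # Independent samples, unequal variances (Welch's t-test).
        result = stats.ttest_ind(array1, array2, equal_var=False)
    else:
        raise ValueError("test_type must be 1, 2 or 3")

    p_two_tailed = float(result.pvalue)
    if tails == 2:
        return p_two_tailed
    if tails == 1:
        # Assumption: one-tailed p-value is half of the two-tailed one.
        return p_two_tailed / 2
    raise ValueError("tails must be 1 or 2")


a = [5, 0.9, -0.4, -0.9, 0.5, 0.8, 0.2, 0.2, 0, -0.8]
b = [1.1, 0.9, -0.5, -0.7, 0.6, 0.7, 0.3, 0.1, -0.1, -0.7]

print(excel_ttest(a, b, tails=2, test_type=1))  # paired, the value Excel reported
print(excel_ttest(a, b, tails=2, test_type=2))  # independent, the value from the question
```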