Scipy Stats ttest_1samp hypothesis testing for comparing previous performance to a sample

Posted by wixjitnu on 2022-11-09 in Other


"The problem I'm trying to solve"
I have 11 months of performance data:

        Month  Branded  Non-Branded  Shopping  Grand Total
0    2/1/2015     1330          334       161         1825
1    3/1/2015     1344          293       197         1834
2    4/1/2015      899          181       190         1270
3    5/1/2015      939          208       154         1301
4    6/1/2015     1119          238       179         1536
5    7/1/2015      859          238       170         1267
6    8/1/2015      996          340       183         1519
7    9/1/2015     1138          381       172         1691
8   10/1/2015     1093          395       176         1664
9   11/1/2015     1491          426       199         2116
10  12/1/2015     1539          530       156         2225

Suppose it is now February 1, 2016, and I ask: "Are the January results statistically different from the results of the previous 11 months?"

        Month  Branded  Non-Branded  Shopping  Grand Total
11  1/1/2016     1064          408       106         1578

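"I came across a blog"
I came across iaingallagher's blog. I will reproduce it here (in case the blog goes down).

One-sample t-test

The one-sample t-test is used to compare a sample mean with a population mean (which we already know). The average height of British men is 175.3 cm. A survey recorded the heights of 10 British men, and we want to know whether the sample mean differs from the population mean.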


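# 1-sample t-test

from scipy import stats

# Heights (cm) of the 10 surveyed men
one_sample_data = [177.3, 182.7, 169.6, 176.3, 180.3, 179.4, 178.5, 177.2, 181.8, 176.5]

# Test H0: the sample mean equals the known population mean of 175.3 cm
one_sample = stats.ttest_1samp(one_sample_data, 175.3)

# The result unpacks to (t-statistic, p-value)
print("The t-statistic is %.3f and the p-value is %.3f." % tuple(one_sample))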

Result:

The t-statistic is 2.296 and the p-value is 0.047.
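
To make clear what the call above computes, here is a minimal sketch that rebuilds the one-sample t-statistic by hand from the same height data and compares it with ttest_1samp. The helper name one_sample_t is made up for illustration and is not part of the blog post or of SciPy.

```python
import math
from scipy import stats

# Height data and population mean taken from the example above
heights = [177.3, 182.7, 169.6, 176.3, 180.3, 179.4, 178.5, 177.2, 181.8, 176.5]
pop_mean = 175.3


def one_sample_t(sample, popmean):
    """Hypothetical helper: two-sided one-sample t-test computed by hand."""
    n = len(sample)
    mean = sum(sample) / n
    # Unbiased sample variance (divide by n - 1)
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    # Standard error of the sample mean
    se = math.sqrt(var / n)
    t_stat = (mean - popmean) / se
    # Two-sided p-value from Student's t distribution with n - 1 degrees of freedom
    p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)
    return t_stat, p_value


t_manual, p_manual = one_sample_t(heights, pop_mean)
t_scipy, p_scipy = stats.ttest_1samp(heights, pop_mean)

print("manual: t = %.3f, p = %.3f" % (t_manual, p_manual))
print("scipy:  t = %.3f, p = %.3f" % (t_scipy, p_scipy))
```

Both lines should agree with the blog's reported values. The point to notice is that the denominator is the standard error of the mean, so the test is about where the sample mean sits relative to a fixed reference value.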

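"Finally, my question is..."
In iaingallagher's example, he knows the population mean and is comparing a sample (one_sample_data) against it. In my case, I want to see whether 1/1/2016 is statistically different from the previous 11 months. So in my case the previous 11 months are an array (rather than a single population mean), and my sample is a single data point (rather than an array)... so it is somewhat backwards.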

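Question

If I focus on the Shopping column data:
Would scipy.stats.ttest_1samp([161,197,190,154,179,170,183,172,176,199,156], 106) produce a valid result, even though my sample (the first argument) is a list of previous results and I am comparing it against a popmean that is not a population mean but a single sample?
If this is not the right statistical function, is there any recommendation for this kind of hypothesis-testing situation?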

scipy

Source: https://stackoverflow.com/questions/35788140/scipy-stats-ttest-1samp-hypothesis-testing-for-comparing-previous-performance-to

1 Answer


hs1ihplo1#

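If you are only interested in the "Shopping" column, try creating an .xlsx or .csv file that contains only the data from the "Shopping" column.
That way you can import the data and use Pandas to run the same t-test on each column separately.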

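import pandas as pd
from scipy import stats

# Load the spreadsheet that holds the monthly data (file name as given in this answer)
data = pd.read_excel("datafile.xlsx")
one_sample_data = data["Shopping"]

# 106 is the January 2016 Shopping value from the question
one_sample = stats.ttest_1samp(one_sample_data, 106)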
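
Note that ttest_1samp tests whether the mean of the 11 historical months equals the reference value; it does not directly ask whether one new month is an unusual observation. One alternative sometimes used for that second question is a prediction-interval style t-test, where the denominator is s * sqrt(1 + 1/n) instead of s / sqrt(n). The sketch below is not part of the original answer. It assumes the monthly values are independent and roughly normal with no trend or seasonality, which may not hold for this kind of data, and the helper name single_observation_test is made up for illustration. The data is typed in from the tables in the question, so no file is needed.

```python
import math

import pandas as pd
from scipy import stats

# Historical months (2/2015 to 12/2015), copied from the question's table
history = pd.DataFrame({
    "Branded": [1330, 1344, 899, 939, 1119, 859, 996, 1138, 1093, 1491, 1539],
    "Non-Branded": [334, 293, 181, 208, 238, 238, 340, 381, 395, 426, 530],
    "Shopping": [161, 197, 190, 154, 179, 170, 183, 172, 176, 199, 156],
})

# The single new observation (1/2016) for each column
january = {"Branded": 1064, "Non-Branded": 408, "Shopping": 106}


def single_observation_test(sample, new_value):
    """Hypothetical helper: is one new value consistent with the sample?

    Assumes independent, roughly normal values with a stable mean.
    """
    n = len(sample)
    mean = sample.mean()
    sd = sample.std(ddof=1)  # sample standard deviation
    # The spread of a single new value around the sample mean combines
    # the variance of one observation and the variance of the mean
    t_stat = (new_value - mean) / (sd * math.sqrt(1 + 1 / n))
    p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)
    return t_stat, p_value


results = {}
for column, new_value in january.items():
    results[column] = single_observation_test(history[column], new_value)
    print("%s: t = %.3f, p = %.3f" % ((column,) + results[column]))
```

Under these assumptions, a small p-value for a column suggests that the January figure is further from the earlier months than chance alone would typically produce. With only 11 points and possible seasonality, the result is best read as a rough indication rather than a firm conclusion.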

Posted 2022-11-09
