R: averaging one variable over intervals of another numeric variable

Posted by np8igboo on 2023-03-10 in Other

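How would I compute the mean of column 2 over each interval of width x in column 1, when the number of rows per interval is not always the same?
This seems simple, but I don't know where to start.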

df <- data.frame(dist = c(0.06,0.22,0.38,0.44,0.5,0.52,0.6,0.74,0.76,0.88,0.92,0.94,1,1.18,1.26,1.3,1.4,1.48,1.5), 
            value = c(12,54.6,46.6,59.7,65.4,66.4,67,76.5,77.3,94.5,95.5,95,93.7,106.5,112.3,112.4,112.6,114.3,114.2))

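Suppose I want the block average of column 2 as column 1 goes from 0 to 0.5, then from 0.5 to 1, then from 1 to 1.5, and so on. If 0 to 0.5 covers 5 rows and 0.5 to 1 covers 9 rows, what is the best way to do this without specifying row numbers?
I have tried searching, but maybe I am not using the right keywords.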

sd2nnvve #1

Using aggregate in base R:

aggregate(value ~ grp, transform(df, grp = cut(dist, seq(0, 1.5, .5))), mean)
Output:
      grp    value
1 (0,0.5]  47.6600
2 (0.5,1]  83.2375
3 (1,1.5] 112.0500
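
The call above hard-codes the upper limit 1.5 and uses cut's default right-closed intervals, so a row with dist exactly 0 would fall outside the first bin and be dropped. A minimal sketch of one way to handle both, assuming a bin width of 0.5 as in the question (the names `width`, `breaks`, and `bin_means` are illustrative, not from the original answer):

```r
df <- data.frame(
  dist  = c(0.06, 0.22, 0.38, 0.44, 0.5, 0.52, 0.6, 0.74, 0.76, 0.88,
            0.92, 0.94, 1, 1.18, 1.26, 1.3, 1.4, 1.48, 1.5),
  value = c(12, 54.6, 46.6, 59.7, 65.4, 66.4, 67, 76.5, 77.3, 94.5,
            95.5, 95, 93.7, 106.5, 112.3, 112.4, 112.6, 114.3, 114.2)
)

width <- 0.5  # assumed bin width, taken from the question

# Derive the break points from the data instead of typing the upper limit
breaks <- seq(0, ceiling(max(df$dist) / width) * width, by = width)

# include.lowest = TRUE closes the first interval on the left,
# so a row with dist == 0 would be kept in the first bin
df$grp <- cut(df$dist, breaks = breaks, include.lowest = TRUE)

# Mean of value within each bin
bin_means <- aggregate(value ~ grp, data = df, FUN = mean)
print(bin_means)
```

With include.lowest = TRUE the first bin is labelled [0,0.5] rather than (0,0.5]; the means are unchanged for this data because no dist value is exactly 0.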

v64noz0r #2

You can use cut to group rows by the value of dist:

tapply(df$value, cut(df$dist, seq(0, 1.5, .5)), FUN = mean)
# (0,0.5]  (0.5,1]  (1,1.5] 
# 47.6600  83.2375 112.0500

Or, if you prefer dplyr:

library(dplyr)

df %>%
  group_by(gp = cut(dist, seq(0, 1.5, .5))) %>%
  summarise(mean = mean(value)) %>%
  ungroup()

#       gp     mean
#1 (0,0.5]  47.6600
#2 (0.5,1]  83.2375
#3 (1,1.5] 112.0500
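
Since the question stresses that the bins hold different numbers of rows, it can help to report the row count next to each mean. A minimal base R sketch, reusing the same breaks as above (`summary_tbl` and its column names are illustrative):

```r
df <- data.frame(
  dist  = c(0.06, 0.22, 0.38, 0.44, 0.5, 0.52, 0.6, 0.74, 0.76, 0.88,
            0.92, 0.94, 1, 1.18, 1.26, 1.3, 1.4, 1.48, 1.5),
  value = c(12, 54.6, 46.6, 59.7, 65.4, 66.4, 67, 76.5, 77.3, 94.5,
            95.5, 95, 93.7, 106.5, 112.3, 112.4, 112.6, 114.3, 114.2)
)

# Assign each row to a bin; no row numbers are needed
grp <- cut(df$dist, seq(0, 1.5, 0.5))

# One row per bin: its label, how many rows fell in it, and the mean of value
summary_tbl <- data.frame(
  grp  = levels(grp),
  n    = as.vector(table(grp)),
  mean = as.vector(tapply(df$value, grp, mean))
)
print(summary_tbl)
```

Both table and tapply return results in the order of the factor levels, which is why the three columns line up.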
