R: How to compute a cumulative sum across columns

6vl6ewon, posted 2023-05-04 in Other

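I'm new to R and want to modify a data set so that each column contains the cumulative sum of the values in all columns to its left, including itself. I know how to compute the running total for each column separately with rowSums: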

library(dplyr)  # provides %>% and mutate()

df <- data.frame(
  jan = rep(1:2, each = 3),
  feb = rep(1:3, each = 2),
  mar = rep(5:4, each = 3),
  apr = rep(1:3, each = 2)
)
df

df %>%
  mutate(feb = rowSums(subset(., select = (jan:feb))),
         mar = rowSums(subset(., select = (jan:mar))),
         apr = rowSums(subset(., select = (jan:apr))))

This produces the output I want:

  jan feb mar apr
1   1   2   7   8
2   1   2   7   8
3   1   3   8  10
4   2   4   8  10
5   2   5   9  12
6   2   5   9  12

How can I generalize this to an arbitrary number of columns? I have been trying statements like this:

df %>% mutate_at(vars(-jan), ~rowSums(subset(., select = (jan:.))))

But I am not using subset correctly. Thanks in advance for any help.

sqougxex1#

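This is a bit late, but if you want to stay within tidyverse syntax, you can pivot to long format, take the cumulative sum within each group, and then pivot back to wide format: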

library(dplyr)
library(tidyr)
library(tibble)

df %>%
  rowid_to_column("ID") %>%            # Create an ID column
  pivot_longer(cols = -ID) %>%
  group_by(ID) %>%                     # Treat each original row as a group
  mutate(CumSum = cumsum(value)) %>%   # Cumulative sum within each group
  pivot_wider(id_cols = ID, names_from = name, values_from = CumSum)  # Rebuild the wide format

Cheers

kuarbcqp2#

It's not clear what you are asking; you should provide example output. Does this help?

> cumsum(colSums(df))
jan feb mar apr 
  9  21  48  60

Or this?

new_df <- df

# Replace each row with the cumulative sum of its values
for (i in seq_len(nrow(df))) {
    new_df[i, ] <- cumsum(unlist(df[i, ]))
}

> new_df
  jan feb mar apr
1   1   2   7   8
2   1   2   7   8
3   1   3   8  10
4   2   4   8  10
5   2   5   9  12
6   2   5   9  12

wnavrhmk3#

My understanding is that the columns should be summed cumulatively, like this:

cum.df = sapply(1:ncol(df), function(col){
    rowSums(df[1:col])
})
     [,1] [,2] [,3] [,4]
[1,]    1    2    7    8
[2,]    1    2    7    8
[3,]    1    3    8   10
[4,]    2    4    8   10
[5,]    2    5    9   12
[6,]    2    5    9   12

Is that right?
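
If the month names matter, one option is to assign them back after the sapply call. This is a minimal base-R sketch using the df from the question; cum_mat and cum_df are names made up for illustration:

```r
df <- data.frame(
  jan = rep(1:2, each = 3),
  feb = rep(1:3, each = 2),
  mar = rep(5:4, each = 3),
  apr = rep(1:3, each = 2)
)

# For column k, sum columns 1..k row by row
cum_mat <- sapply(seq_along(df), function(k) rowSums(df[seq_len(k)]))

# The index vector passed to sapply() has no names, so restore the column names
colnames(cum_mat) <- names(df)

# Convert back to a data frame if one is needed downstream
cum_df <- as.data.frame(cum_mat)
cum_df
```

Note that this recomputes the partial sum from scratch for every column, which is fine for a handful of columns but does more work than a single running total.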

pod7payv4#

Here is an alternative using Reduce:

do.call(cbind,Reduce(`+`,lapply(df,`[`,),accumulate = TRUE))
     [,1] [,2] [,3] [,4]
[1,]    1    2    7    8
[2,]    1    2    7    8
[3,]    1    3    8   10
[4,]    2    4    8   10
[5,]    2    5    9   12
[6,]    2    5    9   12
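
Because a data frame is a list of columns, the same Reduce idea can probably be written back into a data frame directly, which keeps the column names. A small sketch, assuming the df from the question; cum_df is an illustrative name:

```r
df <- data.frame(
  jan = rep(1:2, each = 3),
  feb = rep(1:3, each = 2),
  mar = rep(5:4, each = 3),
  apr = rep(1:3, each = 2)
)

cum_df <- df  # work on a copy so the original stays untouched

# Reduce() adds the columns pairwise from left to right;
# accumulate = TRUE keeps every intermediate running total.
# Assigning into cum_df[] replaces the column contents while
# keeping the names and the data.frame class.
cum_df[] <- Reduce(`+`, df, accumulate = TRUE)
cum_df
```

This visits each column once, so it scales linearly with the number of columns.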

abithluo5#

Applying cumsum to each row with apply:

t(apply(df, 1, cumsum))
#      jan feb mar apr
# [1,]   1   2   7   8
# [2,]   1   2   7   8
# [3,]   1   3   8  10
# [4,]   2   4   8  10
# [5,]   2   5   9  12
# [6,]   2   5   9  12
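
apply() coerces its input to a matrix, so this works cleanly only when every column is numeric. If the data frame also has non-numeric columns, one possible workaround is to restrict the operation to chosen columns. The helper row_cumsum, its cols argument, and the extra id column below are hypothetical, added only for illustration:

```r
# Hypothetical helper: cumulative sum across the chosen columns, row by row.
# Assumes `cols` names at least two numeric columns.
row_cumsum <- function(data, cols = names(data)) {
  # apply() returns one column per input row, hence the transpose
  cum <- t(apply(as.matrix(data[cols]), 1, cumsum))
  data[cols] <- as.data.frame(cum)
  data
}

df <- data.frame(
  id  = letters[1:6],  # non-numeric column, left untouched
  jan = rep(1:2, each = 3),
  feb = rep(1:3, each = 2),
  mar = rep(5:4, each = 3),
  apr = rep(1:3, each = 2)
)

res <- row_cumsum(df, cols = c("jan", "feb", "mar", "apr"))
res
```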

yzxexxkh6#

Another option is rowCumsums from matrixStats:

library(matrixStats)
rowCumsums(as.matrix(df))
