I want to compute a sum of products by group, but the groups have different numbers of rows. Here is my tibble:
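library(tidyverse)  # tibble, dplyr, tidyr
library(magrittr)   # for the %<>% assignment pipe used further down
d<-c("2019-01-22", "2019-02-05", "2019-02-19" ,"2019-02-19" ,"2019-03-07" ,"2019-03-19" ,"2019-03-19" ,"2019-04-02" ,"2019-04-16",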
"2019-04-16" ,"2019-04-30" ,"2019-05-14" ,"2019-05-14" ,"2019-05-27" ,"2019-01-22" ,"2019-02-05" ,"2019-02-19",
"2019-02-19" ,"2019-03-07" ,"2019-03-19" ,"2019-03-19" ,"2019-04-02" ,"2019-04-16" ,"2019-04-16" ,"2019-04-30" ,"2019-05-14",
"2019-05-14" ,"2019-05-27")
mat<-rep(c("092000884483","092000884505"),each=14)
mung<-c("M" ,"M" ,"M" ,"S" ,"M" ,"M" ,"S" ,"M" ,"M" ,"S" ,"M" ,"M" ,"S" ,"M" ,"M" ,"M" ,"M" ,"S" ,"M" ,"M" ,"S" ,"M" ,"M" ,"S" ,"M" ,"M" ,"S" ,"M")
Tg<-c(5.42,4.40,6.39,7.79,3.77,4.65,3.26,5.42,4.17,5.33,4.65,6.43,9.68,8.10,6.68,4.46,6.37,8.90,3.79,5.59,6.66,6.06,6.28,9.48,6.00,6.24,10.48,8.31)
C4<-c(4.29, 5.07, 4.45, 4.15, 4.24, 3.78, 3.62, 4.16, 3.84, 3.54, 3.80, 3.77, 3.93, 3.70, 4.00, 4.22, 4.36, 4.04, 3.92, 3.69, 3.64, 4.27, 3.59, 3.91, 3.84, 3.74, 4.04, 3.01)
my_tbl<-tibble(Matricola=mat,datc=as.Date(d),Mung=mung,tg=Tg,C4_0=C4)
I need the sum of the products tg*C4_0 for each date and each Matricola. If I compute the sum of products by hand, I do the following:
my_tbl_t<-my_tbl%>%pivot_wider(id_cols = c(Matricola,datc),values_from =c(tg,C4_0),names_from = Mung )
# and calculate the sum of the products, allowing for a missing "M" or "S" row
my_prd1<-my_tbl_t%>%mutate(C4_0ps1=case_when(!is.na(tg_M) & !is.na(tg_S)~(tg_M*C4_0_M+tg_S*C4_0_S),
!is.na(tg_M) & is.na(tg_S)~(tg_M*C4_0_M),
is.na(tg_M) & !is.na(tg_S)~(tg_S*C4_0_S)))
Alternatively, I can compute the products first and then summarise by Matricola and date, as follows:
my_tbl%<>%mutate(C4_0p=C4_0*tg)
#and summarise by group
my_prd2<-my_tbl%>%group_by(Matricola,datc)%>%
summarise(n=n(),C4_0ps2=sum(C4_0p,n.rm=T))
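I expected C4_0ps1 in my_prd1 to be identical to C4_0ps2 in my_prd2, but it is not: the sums of products in my_prd2 are higher (by 1 unit) than those in my_prd1. I can see that my_prd2 is still grouped by Matricola, but I don't understand why the sum of products is wrong.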
1 Answer
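I found the error! Instead of this code, which has a typo (n.rm in place of na.rm):
summarise(n=n(),C4_0ps2=sum(C4_0p,n.rm=T))
I should have written this, which has the correct na.rm argument:
summarise(n=n(),C4_0ps2=sum(C4_0p,na.rm=T))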
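To see why the typo inflates each total by exactly 1, here is a minimal base R sketch. It relies on the signature `sum(..., na.rm = FALSE)`: a misspelled `n.rm = T` is not matched to `na.rm`, so it is absorbed into `...` as one more value to add, and `TRUE` is coerced to 1. The vectors `x` and `y` are made-up values for illustration, not data from the question.

```r
# Made-up values for illustration only (not from the question's data)
x <- c(2.5, 4, 1.5)

correct <- sum(x, na.rm = TRUE)  # na.rm is matched as the NA-removal flag
typo    <- sum(x, n.rm = TRUE)   # n.rm falls into `...`; TRUE is summed as 1

print(correct)         # 8
print(typo)            # 9
print(typo - correct)  # 1

# With the typo, NA values are not removed either, so the result is NA
y <- c(2.5, NA, 1.5)
print(sum(y, n.rm = TRUE))   # NA
print(sum(y, na.rm = TRUE))  # 4
```

Since `summarise()` calls `sum()` once per (Matricola, datc) group, the extra 1 is added once per group, which is consistent with the 1-unit difference reported in the question.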