Here is my data:
df <- data.frame(
  id = c("1", "1", "2", "2", "3", "4"),
  tube_placement = c("2020-01-01", "2020-01-10", "2020-01-01", "2020-01-15", "2020-01-01", ""),
  tube_removal = c("2020-01-02", "2020-01-12", "2020-01-02", "", "2020-01-02", ""),
  attempts = c(1, 2, 1, "", 1, "")
)
df[df==""] <- NA
df$attempts <- as.numeric(df$attempts)
I want to compute a new column, "total_attempts", as follows:
id  tube_placement  tube_removal  attempts  total_attempts
1   2020-01-01      2020-01-02    1         3
1   2020-01-10      2020-01-12    2         3
2   2020-01-01      2020-01-02    1         1
2   2020-01-15      NA            NA        1
3   2020-01-01      2020-01-02    1         1
4   NA              NA            NA        NA
I tried the following:
library(dplyr)
df1 <- df %>%
  group_by(id) %>%
  mutate(total_attempts = sum(attempts))
The problem is that it cannot compute the sum for IDs whose attempts contain NA. Any suggestions for improving my code?
1 Answer
You can pass na.rm = TRUE to sum() to drop missing values from the calculation, as follows:
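A minimal sketch of this approach, assuming dplyr is installed and using a cut-down copy of the question's data (only the id and attempts columns). One caveat: sum(..., na.rm = TRUE) returns 0 for a group whose values are all NA, whereas the desired output shows NA for id 4. The second variant (df2) is an addition for illustration and not necessarily what the original answer proposed; it returns NA for such groups.

```r
library(dplyr)

# Cut-down copy of the question's data: only the columns needed here.
df <- data.frame(
  id = c("1", "1", "2", "2", "3", "4"),
  attempts = c(1, 2, 1, NA, 1, NA)
)

# Variant 1: ignore NAs when summing within each id.
# A group that is entirely NA (id 4) gets 0.
df1 <- df %>%
  group_by(id) %>%
  mutate(total_attempts = sum(attempts, na.rm = TRUE)) %>%
  ungroup()

# Variant 2 (assumption: all-NA groups should stay NA, as in the
# desired output). Partially missing groups are still summed.
df2 <- df %>%
  group_by(id) %>%
  mutate(
    total_attempts = if (all(is.na(attempts))) NA_real_
                     else sum(attempts, na.rm = TRUE)
  ) %>%
  ungroup()
```

With df1, id 2 gets a total of 1 even though one of its rows is NA; df2 differs only for id 4, where the total stays NA instead of becoming 0.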