Lubridate producing NA values

Posted by dphi5xsq on 2022-12-30 in Other
I have this data frame (df):

date               Name    Name_id      x1      x2     x3    x4    x5    x6   
01/01/2000 00:00      A       U_12       1       1      1     1     1     1
01/01/2000 01:00      A       U_12
01/01/2000 02:00
01/01/2000 03:00
....

I am trying to use lubridate to compute monthly aggregates (mean, min, max, etc.) for some of the columns.
This is what I have done so far:

df$date <- dmy_hm(Sites_tot$date)

df$month <- floor_date(df$date,"month")

monthly_avgerage <- df %>%
  group_by(Name, Name_id, month) %>%
  summarize_at(vars(x1:x4), .funs = c("mean", "min", "max"), na.rm = TRUE)

The values I can see look fine, although some months turn into NA.

Answer #1 (9gm1akwq)

We can replace summarize_at with summarise(across(...)), passing na.rm = TRUE inside each function:

library(dplyr)
df %>%
   group_by(Name, Name_id, month) %>%
   summarise(across(x1:x4,  list(mean = ~ mean(.x, na.rm = TRUE), 
                         min = ~ min(.x, na.rm = TRUE),
                         max = ~ max(.x, na.rm = TRUE))))

A reproducible example:

iris %>%
   group_by(Species) %>%
   summarise(across(everything(),  list(mean = ~ mean(.x, na.rm = TRUE), 
                         min = ~ min(.x, na.rm = TRUE),
                         max = ~ max(.x, na.rm = TRUE))))
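One caveat with na.rm = TRUE: if a group has no non-missing values at all, mean() returns NaN and min()/max() return Inf/-Inf (with a warning) rather than NA. A minimal sketch on made-up data; the `toy` data frame and its column names are hypothetical and not taken from the question:

```r
library(dplyr)

# Hypothetical toy data: group "b" contains only missing values
toy <- data.frame(g = c("a", "a", "b", "b"),
                  x = c(1, 2, NA, NA))

res <- toy %>%
  group_by(g) %>%
  summarise(
    x_mean = mean(x, na.rm = TRUE),                   # NaN when nothing is left
    x_min  = suppressWarnings(min(x, na.rm = TRUE)),  # Inf when nothing is left
    n_obs  = sum(!is.na(x)),                          # number of usable values
    .groups = "drop"
  )

res
```

Keeping a count such as n_obs next to the summaries makes it easier to tell an empty group apart from a genuine result.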

Answer #2 (chhqkbe1)

If I understand correctly, the challenge is converting the date column to a date-time format, because
date = dmy_hm(date) does not work:

library(dplyr)
library(lubridate)

df %>% 
  mutate(date = dmy_hms(paste0(date, ":00")),
         month = month(date)) %>% 
  group_by(Name, Name_id, month) %>%
  summarise(across(x1:x4,  list(mean = ~ mean(.x, na.rm = TRUE), 
                                min = ~ min(.x, na.rm = TRUE),
                                max = ~ max(.x, na.rm = TRUE))), .groups = "drop")
  Name  Name_id month x1_mean x1_min x1_max x2_mean x2_min x2_max x3_mean x3_min x3_max
  <chr> <chr>   <dbl>   <dbl>  <int>  <int>   <dbl>  <int>  <int>   <dbl>  <int>  <int>
1 A     U_12        1     1.5      1      2     1.5      1      2     1.5      1      2
2 B     U_13        1     3.5      3      4     3.5      3      4     3.5      3      4
# … with 3 more variables: x4_mean <dbl>, x4_min <int>, x4_max <int>
# ℹ Use `colnames()` to see all variable names
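
Note that month(date) keeps only the month number, so January 2000 and January 2001 would end up in the same group, whereas floor_date(date, "month"), as used in the question, keeps the year. A small sketch with made-up timestamps (the vector `d` is hypothetical):

```r
library(lubridate)

# Two made-up timestamps in the same calendar month of different years
d <- ymd_hm(c("2000-01-15 10:00", "2001-01-20 12:00"))

by_number <- month(d)               # month number only, the year is dropped
by_floor  <- floor_date(d, "month") # first instant of each month, year kept

by_number
by_floor
```

If the data span more than one year, grouping by the floor_date result (or by year and month together) is probably what is wanted.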

Sample data:

df <- structure(list(date = c("01/01/2000 00:00", "01/01/2000 01:00", 
"01/01/2000 02:00", "01/01/2000 03:00"), Name = c("A", "A", "B", 
"B"), Name_id = c("U_12", "U_12", "U_13", "U_13"), x1 = 1:4, 
    x2 = 1:4, x3 = 1:4, x4 = 1:4, x5 = 1:4, x6 = 1:4), class = "data.frame", row.names = c(NA, 
-4L))
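
If some months still come out as NA, it may help to check which raw strings fail to parse. The sketch below assumes, purely for illustration, that some rows hold a date without a time part (a common result when midnight timestamps are exported); the `raw` vector is made up and the real cause in the question's data may differ:

```r
library(lubridate)

# Hypothetical raw values: the third entry has no time component
raw <- c("01/01/2000 00:00", "01/01/2000 01:00", "01/01/2000")

# quiet = TRUE suppresses the "failed to parse" warning
parsed <- dmy_hm(raw, quiet = TRUE)

# Strings that were present but could not be parsed
failed <- is.na(parsed) & !is.na(raw)
raw[failed]

# truncated = 2 lets dmy_hm accept strings missing the hour and minute
fixed <- dmy_hm(raw, truncated = 2, quiet = TRUE)
fixed
```

Inspecting raw[failed] on the real column shows the offending formats directly, which is usually quicker than guessing at a fix.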
