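I have hundreds of Excel files containing multiple columns with mixed datetime formats, including decimals. Some variable names are shared across files and some are not. I need to bind all of these datasets together and make sure every datetime column (currently read in as character) ends up as POSIX.
Sample data and goal: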
df1 <- tibble::tribble(
~v1, ~v2, ~v3, ~v4,
"3/8/2022 22:59", "A", "44629.85139", "E",
"3/9/22 19:24", "B", "44629.88056", "F",
"44629.81389", "C", "3/8/2022 20:25", "G",
"44629.81389", "D", "3/8/2022 20:31", "H"
)
df2 <- tibble::tribble(
~v1, ~v2, ~v5, ~v6,
"3/8/2022 22:59", "I", "44629.85139", "M",
"44629.81389", "J", "3/7/2022 21:38", "N",
"44629.81389", "K", "3/7/2022 21:44", "O",
"44629.81389", "L", "3/7/2022 21:48", "P"
)
goal <- tibble::tribble(
~v1, ~v2, ~v3, ~v4, ~v5, ~v6,
"2022-03-08 22:59:00", "A", "2022-03-09 20:26:00", "E", NA, NA,
"2022-03-09 19:24:00", "B", "2022-03-09 21:08:00", "F", NA, NA,
"2022-03-09 19:32:00", "C", "2022-03-08 20:25:00", "G", NA, NA,
"2022-03-09 19:32:00", "D", "2022-03-08 20:31:00", "H", NA, NA,
"2022-03-08 22:59:00", "I", NA, NA, "2022-03-09 20:26:00", "M",
"2022-03-09 19:32:00", "J", NA, NA, "2022-03-07 21:38:00", "N",
"2022-03-09 19:32:00", "K", NA, NA, "2022-03-07 21:44:00", "O",
"2022-03-09 19:32:00", "L", NA, NA, "2022-03-07 21:48:00", "P"
)
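The solution here converts the decimal times to the correct format for individual columns, but of course it does not work for the other formats: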
> as.POSIXct("1900-01-01") + as.difftime(as.numeric(df1$v1), units = "days")
[1] NA
[2] NA
[3] "2022-03-11 19:32:00 EST"
[4] "2022-03-11 19:32:00 EST"
Warning message:
In as.difftime(as.numeric(df1$v1), units = "days") : NAs introduced by coercion
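Note that the parsed values above land on 2022-03-11, while the goal expects 2022-03-09. A likely explanation, assuming the decimals are Excel serial dates from the 1900 date system, is that those serials count days from 1899-12-30 rather than 1900-01-01. A minimal sketch of that assumption (the `tz = "UTC"` choice and the rounding to whole seconds are illustrative, not from the original post):

```r
# Assumption: the value is an Excel serial date (1900 date system),
# which counts days from 1899-12-30.
serial <- 44629.81389

# Convert days to seconds and round, since the five-decimal serial
# carries a fraction of a second of noise.
fixed <- as.POSIXct("1899-12-30", tz = "UTC") + round(serial * 86400)

format(fixed, "%Y-%m-%d %H:%M:%S")
```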
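How can I convert multiple mixed-format datetime columns in each dataset when the variables differ between files? I tried lubridate::parse_date_time, but could not figure out how to fit the decimal conversion into the orders argument. Any help is much appreciated!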
1 Answer
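One possible approach, sketched here under stated assumptions rather than taken from the original post: bind the files first while everything is still character, then convert only the columns in which every non-missing value parses as a datetime. The helper names `parse_mixed_datetime` and `is_datetime_chr` are hypothetical, the `tz = "UTC"` default is an assumption, and base `strptime` formats are used in place of `lubridate::parse_date_time`. The small tibbles `a` and `b` are stand-ins built from values in the question.

```r
library(dplyr)

# Hypothetical helper: parse a character vector that mixes Excel serial
# numbers (e.g. "44629.85139") with m/d/y H:M strings.
parse_mixed_datetime <- function(x, tz = "UTC") {
  out <- as.POSIXct(rep(NA_real_, length(x)), origin = "1970-01-01", tz = tz)

  # Purely numeric strings are treated as Excel serial dates (assumption).
  # The Excel 1900 date system counts days from 1899-12-30.
  is_serial <- grepl("^[0-9]+(\\.[0-9]+)?$", x)
  secs <- round(as.numeric(x[is_serial]) * 86400)  # round to whole seconds
  out[is_serial] <- as.POSIXct("1899-12-30", tz = tz) + secs

  # Remaining strings: try the two-digit year first. Trying %Y first would
  # read "22" as the year 22 instead of 2022.
  for (fmt in c("%m/%d/%y %H:%M", "%m/%d/%Y %H:%M")) {
    todo <- is.na(out) & !is.na(x)
    out[todo] <- as.POSIXct(x[todo], format = fmt, tz = tz)
  }
  out
}

# Hypothetical predicate: a character column counts as a datetime column
# if it has at least one value and every non-missing value parses.
is_datetime_chr <- function(x) {
  is.character(x) && any(!is.na(x)) &&
    all(is.na(x) | !is.na(parse_mixed_datetime(x)))
}

# Stand-ins for two files with partly different columns.
a <- tibble(v1 = c("3/8/2022 22:59", "3/9/22 19:24", "44629.81389"),
            v2 = c("A", "B", "C"))
b <- tibble(v1 = "44629.81389", v5 = "3/7/2022 21:38", v6 = "N")

# bind_rows() fills columns missing from a file with NA.
combined <- bind_rows(a, b) %>%
  mutate(across(where(is_datetime_chr), parse_mixed_datetime))
```

Because `bind_rows()` also accepts a list of data frames, the same two steps should carry over to a list holding all of the files. One caveat of this sketch: a column that contains only numeric strings which are not dates (IDs, for example) would also be picked up by `is_datetime_chr`, so such columns would need to be excluded by name.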
Created on 2023-03-23 by the reprex package (v2.0.1)