I'm trying to create a column that identifies telemetry detections by season and year in R

wtlkbnrh posted on 2023-09-27 in Other

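I have a telemetry dataset that I'm trying to merge with a food availability dataset. To do that, I need to create a column in the telemetry data that identifies each detection by its defined season and the year of that season. For example, December 2021 through February 2022 would be WINTER ('21).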

Receiver Date       Time  TOA   TagID Type  Value Power timestamp           BATCH Beacon Serial. Release Relea…¹ River…²   RKM Locale GPS.C…³ X     Active INACT…⁴ Init.…⁵ Last.…⁶
  <chr>    <chr>      <chr> <chr> <chr> <chr> <chr> <chr> <dttm>              <chr>  <int>   <int> <chr>     <int>   <dbl> <dbl> <chr>  <chr>   <chr> <chr>  <chr>   <chr>   <chr>  
1 3024     11/23/2021 15:5… 0.75… 39000 T     16.4  516   2021-11-23 15:54:42 MF B…   3024      NA ""           NA    184.  296. ""     N 32.1… W 08… Yes    ""      3/1/21  1/14/22
2 3024     11/23/2021 15:4… 0.24… 39100 T     16.4  56    2021-11-23 15:48:20 MF B…   3024      NA ""           NA    184.  296. ""     N 32.1… W 08… Yes    ""      3/1/21  1/14/22
3 3024     11/23/2021 15:4… 0.24… 39100 P     10.0  63    2021-11-23 15:48:30 MF B…   3024      NA ""           NA    184.  296. ""     N 32.1… W 08… Yes    ""      3/1/21  1/14/22
4 3024     11/23/2021 15:4… 0.24… 39100 T     16.4  159   2021-11-23 15:49:00 MF B…   3024      NA ""           NA    184.  296. ""     N 32.1… W 08… Yes    ""      3/1/21  1/14/22
5 3024     11/23/2021 15:4… 0.76… 39000 P     7.0   561   2021-11-23 15:49:12 MF B…   3024      NA ""           NA    184.  296. ""     N 32.1… W 08… Yes    ""      3/1/21  1/14/22
6 3024     11/23/2021 15:4… 0.24… 39100 P     12.0  472   2021-11-23 15:49:30 MF B…   3024      NA ""           NA    184.  296. ""     N 32.1… W 08… Yes    ""      3/1/21  1/14/22

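I adapted code I found in another post, but I can't get it to work. The error message I get is: ! 'vec' must be sorted non-decreasingly and not contain NAs. Any help would be greatly appreciated.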

date2season <- function(date) {
  season_start <- c("09-01-2021", "12-01-2021", "03-01-2022", "06-01-2022", "09-01-2022", "12-01-2022", "03-01-2023", "06-01-2023", "09-01-2023") # mmdd
  season_name <- c("FALL ('21)", "WINTER ('21)", "SPRING ('22)", "SUMMER ('22)", "FALL ('22)", "WINTER ('22)", "SPRING ('23)", "SUMMER ('23)", "FALL ('23)")
  mmddyyy <- format(date, "%m%d%Y")
  season_name[findInterval(mmddyyy, season_start)] ##
}

dat2 <- dat
dat2 <- dat2[order(as.Date(dat2$Date, format="%m/%d/%Y")),] # sorting by date, so that manual data now is with the passive data

class(dat2$Date) # checking that Date was converted to Date format
dat2 <- dat %>% mutate(sxy = date2season(as.Date(Date, "%m/%d/%Y")))

error:
> dat2 <- dat %>% mutate(sxy = date2season(as.Date(Date, "%m/%d/%Y")))
Error in `mutate()`:
! Problem while computing `sxy = date2season(as.Date(Date, "%m/%d/%Y"))`.
Caused by error in `findInterval()`:
! 'vec' must be sorted non-decreasingly and not contain NAs
Backtrace:
 1. dat %>% mutate(sxy = date2season(as.Date(Date, "%m/%d/%Y")))
 7. global date2season(as.Date(Date, "%m/%d/%Y"))
 8. base::findInterval(mmddyyy, season_start)
 9. base::stop("'vec' must be sorted non-decreasingly and not contain NAs")

wlwcrazw #1

Your date2season function has a few inconsistencies/problems:

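  1. season_start is in "%m-%d-%Y", whereas you format the date argument as "%m%d%Y" (no dashes). That is not the biggest problem, so let's set it aside for now.
  2. Your interval test is being done on *strings*, not on dates, so it will be much slower and less efficient. Again, this is not what causes the error, but it really is not the way to do it.
  3. Because you have "09-", "12-", "03-", ... in your season_start, regardless of anything else, *that* is decreasing (even as strings). findInterval requires that its vec= (second) argument be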
vec: numeric, sorted (weakly) increasingly, ...
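
The third point can be checked directly with is.unsorted(). The snippet below is only a sketch: it uses the first three start dates from the question, and the variable names season_start_chr, season_start_date and err are made up for illustration.

```r
# First three season starts from the question, as plain strings (mm-dd-yyyy)
season_start_chr <- c("09-01-2021", "12-01-2021", "03-01-2022")

# Compared as text, "12-..." is followed by "03-...", so the vector is not sorted
is.unsorted(season_start_chr)
# [1] TRUE

# Converted to Date, the same values are in increasing order
season_start_date <- as.Date(season_start_chr, format = "%m-%d-%Y")
is.unsorted(season_start_date)
# [1] FALSE

# findInterval() refuses the string version, as in the question
err <- tryCatch(findInterval("10012021", season_start_chr),
                error = function(e) e)
inherits(err, "error")
# [1] TRUE
```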

I suggest you put both your data and your seasons into a proper Date class, something like:

date2season <- function(date) {
  stopifnot(inherits(date, "Date"))
  # season start dates, written as mm-dd-yyyy, in chronological order
  season_start <- c("09-01-2021", "12-01-2021", "03-01-2022", "06-01-2022", "09-01-2022", "12-01-2022", "03-01-2023", "06-01-2023", "09-01-2023")
  season_start <- as.Date(season_start, format = "%m-%d-%Y")
  season_name <- c("FALL ('21)", "WINTER ('21)", "SPRING ('22)", "SUMMER ('22)", "FALL ('22)", "WINTER ('22)", "SPRING ('23)", "SUMMER ('23)", "FALL ('23)")
  # findInterval() gives, for each date, the index of the latest start date on or before it
  season_name[findInterval(date, season_start)]
}

Quick test:

dates <- as.Date(c("10-01-2021", "12-15-2021", "03-10-2022", "06-22-2022"), format="%m-%d-%Y")
dates
# [1] "2021-10-01" "2021-12-15" "2022-03-10" "2022-06-22"
date2season(dates)
# [1] "FALL ('21)"   "WINTER ('21)" "SPRING ('22)" "SUMMER ('22)"
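
One caveat with this kind of lookup: findInterval() returns 0 for any date earlier than the first entry of season_start, and R silently drops zero indices when subsetting, so the result can come out shorter than the input. Below is a minimal sketch of one way to guard against that, assuming NA is an acceptable label for such dates. The names date2season_na and the toy data frame dat are hypothetical; only the Date column (character, "%m/%d/%Y", as in the question) matters here.

```r
# Same lookup table as above, but dates before the first season start map to NA
date2season_na <- function(date) {
  stopifnot(inherits(date, "Date"))
  season_start <- as.Date(
    c("09-01-2021", "12-01-2021", "03-01-2022", "06-01-2022", "09-01-2022",
      "12-01-2022", "03-01-2023", "06-01-2023", "09-01-2023"),
    format = "%m-%d-%Y"
  )
  season_name <- c("FALL ('21)", "WINTER ('21)", "SPRING ('22)", "SUMMER ('22)",
                   "FALL ('22)", "WINTER ('22)", "SPRING ('23)", "SUMMER ('23)",
                   "FALL ('23)")
  idx <- findInterval(date, season_start)
  idx[idx == 0] <- NA  # a 0 index would otherwise be dropped by `[`
  season_name[idx]
}

# Toy stand-in for the telemetry data (assumed values, not the real dataset)
dat <- data.frame(Date = c("11/23/2021", "01/14/2022", "08/15/2021"))
dat$sxy <- date2season_na(as.Date(dat$Date, format = "%m/%d/%Y"))
dat$sxy
# [1] "FALL ('21)"   "WINTER ('21)" NA
```

Note that dates after the last start date still fall into the final season (here FALL ('23)), so the table would need extending for later detections. With dplyr loaded, the mutate() call from the question should work the same way once the function receives a Date vector.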
