Backward-fill only the last NAs in an R data.table

dldeef67 posted on 2023-06-27 in Other

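I have a data.table with a column like this:
c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12)
I want to fill only the two NAs immediately before each non-NA value in the column, by carrying the next non-NA value backward. The result should be:
c(58,NA,NA,13,13,13,NA,12,12,12,23,12,12)
I managed to do it like this: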

library(data.table)

dt = data.table(V1 = c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12))
# run id for consecutive equal values, then position within each run
dt[, rleid:=rleid(V1)]
dt[, num := seq(.N), rleid]

u=1
arr = c()
for (i in 1:(nrow(dt)-1)){
  if(dt$rleid[i] == dt$rleid[i+1]){
    u=u+1
    next
  }
  else{
    arr = append(arr,u)}
  u=1
}
arr=append(arr,1)

v=c()
for (i in 1:(length(arr))){
  for (j in 1:arr[i]){
    v=append(v,arr[i])
  }
}

dt[, len:=v]
dt[, val:=len-num]
dt[, V2 := fifelse(is.na(V1) & val<=1, nafill(V1, "nocb"), V1)]

This solution takes too long on large data tables. Any suggestions for something faster?

x9ybnkn6 #1

A quick and dirty data.table solution:

dt[, V1b := fcoalesce(c(list(V1), shift(V1, -(1:2))))]

# Or simply (as suggested by B. Christian Kamgang)
dt[, V1b := fcoalesce(shift(V1, -(0:2)))]

      V1   V1b
    <num> <num>
 1:    58    58
 2:    NA    NA
 3:    NA    NA
 4:    NA    13
 5:    NA    13
 6:    13    13
 7:    NA    NA
 8:    NA    12
 9:    NA    12
10:    12    12
11:    23    23
12:    NA    12
13:    12    12
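
The same idea extends to any fill limit: coalescing the column with its leads 1 through n picks, for each row, the nearest non-NA value that is at most n rows ahead. Below is a minimal sketch, assuming data.table is loaded and n is at least 1. The helper name `nocb_limit` is hypothetical and not part of data.table.

```r
library(data.table)

# Hypothetical helper (not a data.table function): fill each NA with the
# next non-NA value, but only if that value is at most `n` positions ahead.
nocb_limit <- function(x, n) {
  stopifnot(n >= 1)
  # shift(x, -(0:n)) returns a list: x itself, then x led by 1, 2, ..., n.
  # fcoalesce() takes, per position, the first non-NA element of that list.
  fcoalesce(shift(x, -(0:n)))
}

x <- c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12)
res <- nocb_limit(x, 2)
print(res)
# [1] 58 NA NA 13 13 13 NA 12 12 12 23 12 12
```

Because each lead is a whole-vector operation, the cost grows with n times the column length, so this form is probably best suited to small limits like the 2 used in the question.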

kfgdxczn #2

You can adapt the function given here by reversing the vector twice, i.e.

na_locf_max_backwards <- function(x, nmax){
  x <- rev(x)
  s <- split(x, cumsum(!is.na(x)))
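  l <- mapply(\(x, y) {
        # fill positions 2..(nmax + 1) of each piece with its leading value
        x[seq_len(nmax) + 1] <- x[1]
        # cut the piece back to its original length
        length(x) <- y
        x
      }, s, lengths(s), SIMPLIFY = FALSE)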
  x <- unlist(l, use.names = FALSE)
  x <- rev(x)
  x
}

na_locf_max_backwards(c(58,NA,NA,NA,NA,13,NA,NA,NA,12,23,NA,12), nmax = 2)
# [1] 58 NA NA 13 13 13 NA 12 12 12 23 12 12
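
One detail in the function is easy to misread: the index used inside it selects positions 2 through nmax + 1 of each piece, not positions 1 through nmax + 1. In R the `:` operator binds tighter than `+`, so `1:nmax + 1` means `(1:nmax) + 1`. The short check below illustrates the difference; the variable names are chosen here for illustration only.

```r
nmax <- 2

idx_used  <- 1:nmax + 1         # parsed as (1:nmax) + 1, i.e. positions 2 and 3
idx_naive <- 1:(nmax + 1)       # positions 1, 2 and 3
idx_safe  <- seq_len(nmax) + 1  # same as idx_used, and empty when nmax is 0

print(idx_used)
# [1] 2 3
print(idx_naive)
# [1] 1 2 3
print(idx_safe)
# [1] 2 3
```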

pvcm50d1 #3

Another data.table solution:

dt[, V2 := fifelse(rev(rowid(rev(rleid(V1))))<=2, nafill(V1, "nocb"), V1)]

       V1    V2
 1:    58    58
 2:    NA    NA
 3:    NA    NA
 4:    NA    13
 5:    NA    13
 6:    13    13
 7:    NA    NA
 8:    NA    12
 9:    NA    12
10:    12    12
11:    23    23
12:    NA    12
13:    12    12
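
To see why this one-liner works, it can help to look at the intermediate vectors. The sketch below only unpacks the expression above on the example column; the names `run`, `pos_from_end` and `filled` are introduced here for illustration.

```r
library(data.table)

V1 <- c(58, NA, NA, NA, NA, 13, NA, NA, NA, 12, 23, NA, 12)

# Run id: consecutive equal values (including consecutive NAs) share an id
run <- rleid(V1)
# 1 2 2 2 2 3 4 4 4 5 6 7 8

# Position of each element counted from the END of its run
pos_from_end <- rev(rowid(rev(run)))
# 1 4 3 2 1 1 3 2 1 1 1 1 1

# Plain next-observation-carried-backward fill, with no limit
filled <- nafill(V1, "nocb")

# Keep the filled value only for the last two positions of each run
V2 <- fifelse(pos_from_end <= 2, filled, V1)
print(V2)
# [1] 58 NA NA 13 13 13 NA 12 12 12 23 12 12
```

Non-NA values are unaffected either way, since for them the filled value equals the original one.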
