Problem with an R if/else loop: the condition is only partially applied

5ssjco0h  posted on 2023-01-10 in Other

I have the following data frame:

Row    Repro Number2
1      1     EWC
2     NA     LWY
3      7     EWS
4     NA     LWC
5     NA     EWC
6     NA     LWC
7      3     EWY
8     NA    LW2Y
9     NA Unknown
10    NA     LWC
11     1     EWC
12    NA     LWY
13    NA     EWY
14    NA     LWY
15    NA Unknown
16    NA     LWC

I ran the following loop on this data frame:

for (i in 1:nrow(df3)) {
  if(df3$Number2[i+1]=="Unknown" & is.na(df3$Repro[i])) {
    df3$Number2[i]="Unknown"
  } else{
    df3$Number2[i]==df3$Number2[i]
  }
}

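When the loop runs, I get an error at the end, and the data frame does not look the way I want.
My problem is that although the code does what it is meant to do (replace the value in the Number2 column with "Unknown" if the next value in that column is also "Unknown" and the associated Repro value is NA), it only does so for the "Unknown" values that were in the data originally. I want it to also take the newly added "Unknown" values into account and apply the loop condition to them.
Here is the error: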

Error in if (df3$Number2[i + 1] == "Unknown" & is.na(df3$Repro[i])) { : 
  missing value where TRUE/FALSE needed

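This is the data frame after running the loop. I added another column, "Number2.Correct", showing what I actually want the Number2 column to look like. The problem is rows 12 and 13: they should be "Unknown", not "LWY" and "EWY".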

Repro Number2  Number2.Correct
1      1     EWC  EWC
2     NA     LWY  LWY
3      7     EWS  EWS
4     NA     LWC  LWC
5     NA     EWC  EWC
6     NA     LWC  LWC
7      3     EWY  EWY
8     NA Unknown  Unknown
9     NA Unknown  Unknown
10    NA     LWC  LWC
11     1     EWC  EWC
12    NA     LWY  Unknown
13    NA     EWY  Unknown 
14    NA Unknown  Unknown
15    NA Unknown  Unknown
16    NA     LWC  LWC

Finally, I have two questions:
1. How do I change the code to get the desired result?
2. Why does the error occur? Is it part of what causes this problem?


4szc88ey  #1

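# Walk from the last row up, so an "Unknown" written into row i + 1
# is already in place when row i is checked.
for (i in rev(1:nrow(df3))) {
  # Bounds check first: `&&` short-circuits, so row i + 1 is only read if it exists
  if (i + 1 <= nrow(df3) && df3$Number2[i + 1] == "Unknown" && is.na(df3$Repro[i])) {
    df3$Number2[i] <- "Unknown"
  }
}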

df3
#>    Row Repro Number2
#> 1    1     1     EWC
#> 2    2    NA     LWY
#> 3    3     7     EWS
#> 4    4    NA     LWC
#> 5    5    NA     EWC
#> 6    6    NA     LWC
#> 7    7     3     EWY
#> 8    8    NA Unknown
#> 9    9    NA Unknown
#> 10  10    NA     LWC
#> 11  11     1     EWC
#> 12  12    NA Unknown
#> 13  13    NA Unknown
#> 14  14    NA Unknown
#> 15  15    NA Unknown
#> 16  16    NA     LWC

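Created on 2023-01-09 with reprex v2.0.2

You have two problems:

  1. i + 1 runs past the last row of the data; I added another condition to the if statement that checks i + 1 against nrow(df3).
  2. The output you posted shows that you want to look for "Unknown" from the bottom up rather than from the top down.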
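
To illustrate why the direction matters, here is a minimal sketch on a four-row toy data frame. The helper name `fill_unknown` and the `toy` data are made up for this illustration and are not part of the original answer; the helper makes a single pass over the rows in whatever order it is given.

```r
# Hypothetical helper: one pass over the rows in the given order.
fill_unknown <- function(df, order) {
  for (i in order) {
    if (df$Number2[i + 1] == "Unknown" && is.na(df$Repro[i])) {
      df$Number2[i] <- "Unknown"
    }
  }
  df
}

# Toy data (an assumption for this sketch, not the asker's data)
toy <- data.frame(
  Repro   = c(1L, NA, NA, NA),
  Number2 = c("EWC", "LWY", "EWY", "Unknown"),
  stringsAsFactors = FALSE
)
idx <- seq_len(nrow(toy) - 1)  # 1, 2, 3: row i + 1 always exists

top_down  <- fill_unknown(toy, idx)$Number2
bottom_up <- fill_unknown(toy, rev(idx))$Number2

top_down   # only row 3 changes: row 2 was checked before row 3 became "Unknown"
bottom_up  # rows 2 and 3 change: each new "Unknown" is seen by the row above it
```

A single top-down pass only reaches the row directly above an original "Unknown", while a bottom-up pass carries the value up through the whole run of NA Repro rows.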

piv4azn7  #2

The code fails because nrow(df3) + 1 is out of range, so the for loop needs to run over 1:(nrow(df3) - 1).
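
The mechanism behind the error can be reproduced in a few lines. This is a small sketch on a made-up two-element vector `x` (not the asker's data): indexing one past the end returns NA rather than failing, and it is the `if` that then refuses to branch on NA.

```r
x <- c("EWC", "Unknown")

# Indexing one past the end of a vector does not fail; it returns NA.
past_end <- x[length(x) + 1]
is.na(past_end)

# Comparing NA with a string gives NA, and `if` cannot branch on NA.
cond <- past_end == "Unknown"
msg <- tryCatch(
  if (cond) "yes" else "no",
  error = function(e) conditionMessage(e)
)
msg  # the error message text, the same kind of error quoted in the question
```

In the question's loop this happens on the last iteration, where i + 1 is nrow(df3) + 1.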
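To update Number2 iteratively, a simple (if not very elegant) approach is a while loop whose stopping condition is that the old and new Number2 are identical.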

library(dplyr)  # for %>%, mutate() and select()

while (TRUE) {
  # One top-down pass, written into a working copy of Number2
  df3$Number2_new <- df3$Number2
  for (i in 1:(nrow(df3) - 1)) {
    if (df3$Number2_new[i + 1] == "Unknown" & is.na(df3$Repro[i])) {
      df3$Number2_new[i] <- "Unknown"
    }
  }

  # Stop once a full pass changes nothing
  converged <- all(df3$Number2 == df3$Number2_new)
  df3 <- df3 %>%
    mutate(Number2 = Number2_new) %>%
    select(-Number2_new)
  if (converged) break
}

df3

   Row Repro Number2
1    1     1     EWC
2    2    NA     LWY
3    3     7     EWS
4    4    NA     LWC
5    5    NA     EWC
6    6    NA     LWC
7    7     3     EWY
8    8    NA Unknown
9    9    NA Unknown
10  10    NA     LWC
11  11     1     EWC
12  12    NA Unknown
13  13    NA Unknown
14  14    NA Unknown
15  15    NA Unknown
16  16    NA     LWC

py49o6xq  #3

i + 1 goes out of range past the last row of the data. We can use a grouping approach with the tidyverse:

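library(dplyr)
library(tidyr)
library(data.table)  # used here for rleid()

df3 %>%
  # grp: number each original "Unknown", NA everywhere else
  mutate(grp = replace(replace(Number2, Number2 != "Unknown", NA),
    Number2 == "Unknown", seq_len(sum(Number2 == "Unknown")))) %>%
  # spread each id up to the rows above that "Unknown", then down for leftover rows
  fill(grp, .direction = "updown") %>%
  # grp2: runs of consecutive NA / non-NA Repro values
  group_by(grp, grp2 = rleid(is.na(Repro))) %>%
  # within a run, NA-Repro rows above the "Unknown" become "Unknown"
  mutate(Number2 = case_when(is.na(Repro) &
    row_number() < match("Unknown", Number2) ~ "Unknown",
    TRUE ~ Number2)) %>%
  ungroup() %>%
  select(-grp, -grp2)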
Output
# A tibble: 16 × 3
     Row Repro Number2
   <int> <int> <chr>  
 1     1     1 EWC    
 2     2    NA LWY    
 3     3     7 EWS    
 4     4    NA LWC    
 5     5    NA EWC    
 6     6    NA LWC    
 7     7     3 EWY    
 8     8    NA Unknown
 9     9    NA Unknown
10    10    NA LWC    
11    11     1 EWC    
12    12    NA Unknown
13    13    NA Unknown
14    14    NA Unknown
15    15    NA Unknown
16    16    NA LWC

Data

df3 <- structure(list(Row = 1:16, Repro = c(1L, NA, 7L, NA, NA, NA, 
3L, NA, NA, NA, 1L, NA, NA, NA, NA, NA), Number2 = c("EWC", "LWY", 
"EWS", "LWC", "EWC", "LWC", "EWY", "LW2Y", "Unknown", "LWC", 
"EWC", "LWY", "EWY", "LWY", "Unknown", "LWC")),
 class = "data.frame", row.names = c(NA, 
-16L))
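
As a further alternative that avoids reading row i + 1 altogether, the same rule can be written as a right-to-left fold in base R. This is only a sketch, not taken from any of the answers: the names `is_unknown`, `na_repro`, `step` and `ends_unknown` are made up here, and the two vectors simply repeat the Repro and Number2 columns of the data above.

```r
Repro <- c(1L, NA, 7L, NA, NA, NA, 3L, NA, NA, NA, 1L, NA, NA, NA, NA, NA)
Number2 <- c("EWC", "LWY", "EWS", "LWC", "EWC", "LWC", "EWY", "LW2Y",
             "Unknown", "LWC", "EWC", "LWY", "EWY", "LWY", "Unknown", "LWC")

is_unknown <- Number2 == "Unknown"
na_repro   <- is.na(Repro)

# `below` answers: does the row underneath end up as "Unknown"?
# A row ends up "Unknown" if it already is, or if Repro is NA and `below` is TRUE.
step <- function(below, i) is_unknown[i] || (na_repro[i] && below)

# Fold from the last row to the first, keeping every intermediate state.
# init = FALSE stands for "there is nothing under the last row".
states <- Reduce(step, rev(seq_along(Number2)), init = FALSE, accumulate = TRUE)

# states[1] is the initial FALSE; the rest run from the last row up to row 1.
ends_unknown <- rev(states[-1])

Number2[ends_unknown] <- "Unknown"
which(Number2 == "Unknown")
```

Because the state is carried along explicitly, there is no index that can run past the last row, and one pass is enough.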
