Combining a for loop with mutate and case_match inside a function

soat7uwm, posted 12 months ago in Other

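I am trying to write a function around mutate and case_match from the dplyr package as part of my analysis workflow. To fully automate the process, I want the function to take an additional data frame as an argument. That data frame holds pairs of text strings (old value, new value), and any old value found in the data frame containing the data should be replaced with its new value.
With the pairs written out row by row and no for loop, this works perfectly: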

dftest <- data.frame(old = c("ones","twos","fours","fives"), new = c("Humanoid",
"Hairy","Hairy","what"))
test1 <- data.frame(spp= c("ones", "twos", "threes"), log = c(5,61,36))

updnames <- function(df, col, names_df) {
    require(dplyr)
    if(ncol(names_df)>2)
    {stop("More than 2 columns in names dataframe")}
    if(sum(duplicated(names_df[1]))>0)
    {stop("Duplicate old species names")}
    else
    {
        names_df <- names_df %>% mutate_all(as.character)
        df <- df %>%
                mutate(updnames = case_match({{col}},
                    names_df[1,1] ~ names_df[1,2],
                    names_df[2,1] ~ names_df[2,2],
                    names_df[3,1] ~ names_df[3,2],
                    names_df[4,1] ~ names_df[4,2],
                    .default = {{col}}))}
    return(df)
}

test2 <- updnames(test1, spp,dftest)

> test2 # Correct output
     spp log updnames
1   ones   5 Humanoid
2   twos  61    Hairy
3 threes  36   threes

But adding a for loop does not work. The new column is created as expected, but its values are just copies of the original column:

updnames <- function(df, col, names_df) {
  require(dplyr)
  if(ncol(names_df)>2)
  {stop("More than 2 columns in names dataframe")}
  if(sum(duplicated(names_df[1]))>0)
  {stop("Duplicate old species names")}
  else
  {
    names_df <- names_df %>% mutate_all(as.character)
    for(i in 1:nrow(names_df)){
      df <- df %>%
        mutate(updnames = case_match({{col}},
          names_df[i,1] ~ names_df[i,2],
          .default = {{col}}))
    }
  }
  return(df)
}

test2 <- updnames(test1, spp, dftest)

> test2 # Wrong output
     spp log updnames
1   ones   5     ones
2   twos  61     twos
3 threes  36   threes


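I have looked through various other posts on Stack Overflow and read the relevant documentation, but I can't figure it out.
If anyone has any ideas, or an alternative way to do what I am trying to achieve, it would be much appreciated.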

2nbm6dog #1

Use recode (deframe comes from the tibble package):

library(dplyr)
library(tibble)  # for deframe()

test1 %>% 
  mutate(new_spp = recode(spp, !!!deframe(dftest)))

     spp log  new_spp
1   ones   5 Humanoid
2   twos  61    Hairy
3 threes  36   threes
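
To see why this works, it can help to look at the intermediate pieces. The sketch below assumes the dftest lookup table from the question, with old names in the first column and new names in the second; `lookup` and `result` are illustrative names that do not appear in the original post. deframe() turns the two-column data frame into a named character vector, and !!! splices that vector into recode() as old = new arguments.

```r
library(dplyr)
library(tibble)

dftest <- data.frame(
  old = c("ones", "twos", "fours", "fives"),
  new = c("Humanoid", "Hairy", "Hairy", "what")
)

# deframe(): the first column becomes the names, the second the values
lookup <- deframe(dftest)

# Splicing the named vector is roughly the same as writing
# recode(spp, ones = "Humanoid", twos = "Hairy", fours = "Hairy", fives = "what")
spp <- c("ones", "twos", "threes")
result <- recode(spp, !!!lookup)
result
```

Values with no entry in the lookup, such as "threes", are left as they are, because recode() keeps unmatched values when the replacements have the same type as the input.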

As a function:

update_names <- function(df, col, new_names) {
  df %>% mutate('{{col}}_new' := recode({{col}}, !!!deframe(new_names)))
}
update_names(test1, spp, dftest)
     spp log  spp_new
1   ones   5 Humanoid
2   twos  61    Hairy
3 threes  36   threes
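
As for why the loop version in the question returned the column unchanged: as far as I can tell, every iteration rebuilds updnames from {{col}} with .default = {{col}}, so the replacements from earlier iterations are thrown away and only the last pair ("fives" to "what") is ever applied. Since "fives" does not occur in test1, nothing changes. Below is a minimal sketch of a loop that keeps the earlier replacements, assuming dplyr >= 1.1.0 for case_match(); `updnames_loop`, `old_i` and `new_i` are hypothetical names, not part of the original post.

```r
library(dplyr)

updnames_loop <- function(df, col, names_df) {
  stopifnot(ncol(names_df) == 2, !anyDuplicated(names_df[[1]]))
  old <- as.character(names_df[[1]])
  new <- as.character(names_df[[2]])

  # Start from a copy of the original column ...
  df <- df %>% mutate(updnames = as.character({{ col }}))

  for (i in seq_along(old)) {
    old_i <- old[i]
    new_i <- new[i]
    # ... and fall back to the running result, not the original column,
    # so replacements made in earlier iterations survive
    df <- df %>%
      mutate(updnames = case_match(as.character({{ col }}),
                                   old_i ~ new_i,
                                   .default = updnames))
  }
  df
}

dftest <- data.frame(old = c("ones", "twos", "fours", "fives"),
                     new = c("Humanoid", "Hairy", "Hairy", "what"))
test1 <- data.frame(spp = c("ones", "twos", "threes"), log = c(5, 61, 36))

test2 <- updnames_loop(test1, spp, dftest)
test2
```

Matching always happens against the original column, so a new name that happens to equal a later old name is not replaced a second time.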

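If you would rather stay with case_match(), one option is to build the old ~ new formulas from the lookup table and splice them in with !!!, the same way the recode() answer splices its named vector. This is only a sketch, under the assumption that case_match() accepts spliced formulas through its dynamic dots (dplyr >= 1.1.0); `updnames_cm` and `rules` are hypothetical names, and rlang::new_formula() is used here to construct each two-sided formula.

```r
library(dplyr)

updnames_cm <- function(df, col, names_df) {
  stopifnot(ncol(names_df) == 2, !anyDuplicated(names_df[[1]]))
  old <- as.character(names_df[[1]])
  new <- as.character(names_df[[2]])

  # One two-sided formula per lookup row, e.g. "ones" ~ "Humanoid"
  rules <- lapply(seq_along(old), function(i) rlang::new_formula(old[i], new[i]))

  # Splice all formulas into a single case_match() call
  df %>%
    mutate(updnames = case_match(as.character({{ col }}),
                                 !!!rules,
                                 .default = as.character({{ col }})))
}

dftest <- data.frame(old = c("ones", "twos", "fours", "fives"),
                     new = c("Humanoid", "Hairy", "Hairy", "what"))
test1 <- data.frame(spp = c("ones", "twos", "threes"), log = c(5, 61, 36))

test3 <- updnames_cm(test1, spp, dftest)
test3
```

This avoids the loop entirely, so there is no intermediate column to overwrite. It does assume the lookup table has at least one row.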