R: strange behavior of a function that takes a data frame column

Posted by 7rtdyuoh on 2023-03-27 in Other

I have the following data frame:

id<-c("id1","id2","id3","id4","id5","id6")
out<-c("50","60","60 4d", "60.4","5823",NA)
cov<-c("Male","male","mále","Fe male","female","fema")
dat<-data.frame(id,out,cov)

I wrote two functions to help me organize and clean my data frame:

conv_number<-function(data,variable){
  data<- data |> dplyr::mutate(variable = gsub(pattern = ",", replacement = ".", variable))
  x<- data |> dplyr::mutate(variable = as.numeric(gsub("[^0-9.-]", "", variable)))
  return (x)
}

clean_string<-function(data,variable){
  data |> dplyr::mutate(variable = tolower(variable))
  x<- data |> dplyr::mutate(variable = gsub("[^a-z]", "", variable))
  return (x)
}

My intent is for each function to take one column of the data set and transform it in place, in that same column. So I call them like this:

prueba_1<-conv_number(dat,out)
prueba_1<-clean_string(dat,cov)

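However, that is not what they do: they create a new column named "variable". And in the second example, the values are not even converted to lowercase.
What am I missing here? Is there perhaps a problem with dplyr::mutate()?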

Answer 1 (avwztpqn)

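The problem is that mutate() treats variable on the left-hand side of = as a literal column name, so it creates a new column called variable instead of overwriting the column you passed in. In clean_string(), the result of the first mutate() is also never assigned, so the tolower() step is discarded. You can read more about quasiquotation to understand the details, but here is one option using the curly-curly operator ({{ }}) together with :=: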

library(dplyr)

id<-c("id1","id2","id3","id4","id5","id6")
out<-c("50","60","60 4d", "60.4","5823",NA)
cov<-c("Male","male","mále","Fe male","female","fema")
dat<-data.frame(id,out,cov)

conv_number<-function(data,variable){
  data<- data |> 
    dplyr::mutate({{variable}} := gsub(pattern = ",", replacement = ".", {{variable}}))
  
  x <- data |> 
    dplyr::mutate({{variable}} := as.numeric(gsub("[^0-9.-]", "", {{variable}})))
  
  return (x)
}

clean_string<-function(data,variable){
  
  data <- data |> 
    dplyr::mutate({{variable}} := tolower({{variable}}))
  
  x <- data |> 
    dplyr::mutate({{variable}} := gsub("[^a-z]", "", {{variable}}))
  
  return (x)
}

conv_number(dat,out)
#>    id    out     cov
#> 1 id1   50.0    Male
#> 2 id2   60.0    male
#> 3 id3  604.0    mále
#> 4 id4   60.4 Fe male
#> 5 id5 5823.0  female
#> 6 id6     NA    fema

clean_string(dat,cov)
#>    id   out    cov
#> 1 id1    50   male
#> 2 id2    60   male
#> 3 id3 60 4d    mle
#> 4 id4  60.4 female
#> 5 id5  5823 female
#> 6 id6  <NA>   fema
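
As an alternative sketch, the same embracing idea can be combined with dplyr::across(), which overwrites the selected column in place and so avoids := altogether. The names conv_number2() and clean_string2() are hypothetical and used only for illustration; the sample data is copied from the question, and the anonymous-function shorthand \(x) assumes R 4.1 or later (as does the native pipe used above).

```r
library(dplyr)

# Same sample data as in the question.
dat <- data.frame(
  id  = c("id1", "id2", "id3", "id4", "id5", "id6"),
  out = c("50", "60", "60 4d", "60.4", "5823", NA),
  cov = c("Male", "male", "mále", "Fe male", "female", "fema")
)

# Hypothetical variant of conv_number(): across() selects the embraced
# column and replaces it in place, so no `:=` is needed.
conv_number2 <- function(data, variable) {
  data |>
    mutate(across({{ variable }}, \(x) {
      x <- gsub(",", ".", x, fixed = TRUE)  # decimal comma -> decimal point
      as.numeric(gsub("[^0-9.-]", "", x))   # keep only digits, dots and minus signs
    }))
}

# Hypothetical variant of clean_string(): lowercase first, then drop
# every character outside a-z, all in a single step.
clean_string2 <- function(data, variable) {
  data |>
    mutate(across({{ variable }}, \(x) gsub("[^a-z]", "", tolower(x))))
}

res_num <- conv_number2(dat, out)
res_chr <- clean_string2(dat, cov)
```

Because across() accepts any tidyselect expression, the same functions should also work with several columns at once, for example conv_number2(dat, c(out, other_col)), where other_col stands for any additional column you might have.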
