R: How do I count values when they are spelled slightly differently?

jjhzyzn0, posted on 2022-12-20 in Other

I'm having trouble counting in R. The same values are spelled slightly differently (mostly in capitalization), as shown below:

df<-data.frame(sweets= c("cookie", "CANDY", "Cookie", "cake", "IceCream", "Candy", "Chocolate", "COOKIE", "CAKE"))
df

I would like to end up with the result below. To get there, I want to change each value so that the spelling is consistent:

df2<-data.frame(sweets= c("Cookie", "Candy", "Cookie", "Cake", "IceCream", "Candy", "Chocolate", "Cookie", "Cake"))               
df3<- table(df2)

I tried if_else and chains of if ... else if, but the code got messy. It would be great if you could write some sample code showing how to do this.


xienkqul #1

Using str_to_title from stringr inside mutate, you can convert the case of your variable, and then use count to count the observations for each sweet.

Code

library(dplyr)
library(stringr)
   

df <- data.frame(sweets= c("cookie", "CANDY", "Cookie", "cake", "IceCream", "Candy", "Chocolate", "COOKIE", "CAKE"))

df %>% 
  mutate(sweets = str_to_title(sweets)) %>%
  count(sweets)

Output

     sweets n
1      Cake 2
2     Candy 2
3 Chocolate 1
4    Cookie 3
5  Icecream 1
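
Note that title-casing turns "IceCream" into "Icecream" in the output above. If a specific display spelling matters, one possible approach is to group on a lowercase key and map it back through a small lookup vector. The sketch below uses base R only; the `preferred` vector and the `clean` and `counts` names are hypothetical and not part of the original answer.

```r
df <- data.frame(sweets = c("cookie", "CANDY", "Cookie", "cake", "IceCream",
                            "Candy", "Chocolate", "COOKIE", "CAKE"))

# Hypothetical lookup: lowercase key -> preferred display spelling
preferred <- c(cookie = "Cookie", candy = "Candy", cake = "Cake",
               icecream = "IceCream", chocolate = "Chocolate")

# Normalise case to build the matching key
key <- tolower(df$sweets)

# Use the preferred spelling when the key is known,
# otherwise fall back to the original value
clean <- ifelse(key %in% names(preferred), preferred[key], df$sweets)

counts <- table(clean)
counts
```

The same lookup could probably be applied inside mutate() as well, for example with `unname(preferred[tolower(sweets)])`, provided every value has an entry in the lookup.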

zbsbpyhn #2

Convert everything to lowercase, then tabulate:

table(tolower(df$sweets))
#      cake     candy chocolate    cookie  icecream 
#         2         2         1         3         1 
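
If a data frame of counts is more convenient than a table object, the result can be converted with as.data.frame. This is a minimal sketch, assuming a column layout of `sweets` and `n`; those column names and the `counts` variable are choices made here for illustration, not something the original answer specifies.

```r
df <- data.frame(sweets = c("cookie", "CANDY", "Cookie", "cake", "IceCream",
                            "Candy", "Chocolate", "COOKIE", "CAKE"))

# Tabulate the lowercased values; naming the argument labels the dimension
tab <- table(sweets = tolower(df$sweets))

# Convert to a data frame with the count column called "n"
counts <- as.data.frame(tab, responseName = "n", stringsAsFactors = FALSE)

# Sort by descending count
counts <- counts[order(-counts$n), ]
counts
```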

Alternatively, the examples in ?tolower provide a helper function, capwords:

capwords <- function(s, strict = FALSE) {
  cap <- function(s) paste(toupper(substring(s, 1, 1)),
                           {s <- substring(s, 2); if(strict) tolower(s) else s},
                           sep = "", collapse = " " )
  sapply(strsplit(s, split = " "), cap, USE.NAMES = !is.null(names(s)))
}

table(capwords(df$sweets, strict = TRUE))
#      Cake     Candy Chocolate    Cookie  Icecream 
#         2         2         1         3         1 
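
The strict argument matters for this data. The sketch below repeats the capwords definition so that it runs on its own, and compares both settings on a few sample values; the vector `x` and the names `loose` and `strict_out` are illustrative only.

```r
# capwords as given in the ?tolower examples
capwords <- function(s, strict = FALSE) {
  cap <- function(s) paste(toupper(substring(s, 1, 1)),
                           {s <- substring(s, 2); if(strict) tolower(s) else s},
                           sep = "", collapse = " " )
  sapply(strsplit(s, split = " "), cap, USE.NAMES = !is.null(names(s)))
}

x <- c("IceCream", "CANDY", "cake")

# strict = FALSE only upper-cases the first letter and leaves the rest alone,
# so "CANDY" would still differ from "Candy"
loose <- capwords(x)

# strict = TRUE also lower-cases the remaining letters
strict_out <- capwords(x, strict = TRUE)

loose
strict_out
```

With strict = FALSE, values such as "CANDY" and "Candy" would remain separate groups in table(), which is why the call above uses strict = TRUE.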
