regex: How to combine column values with a separator, with exceptions for the separator?

Posted by yuvru6vn on 2022-12-14 in Other

I have the following data frame:

fruit <- c("apple", "orange", "peach", "")
color <- c("red", "orange", "", "purple")
taste <- c("sweet", "", "sweet", "neutral")

df <- data.frame(fruit, color, taste)

I want to combine all the columns into a single column named "combined":

combined <- c("apple + red + sweet", "orange + orange", "peach + sweet", "purple + neutral")

so that I end up with the following data frame:

df2 <- data.frame(fruit, color, taste, combined)

I tried using unite():

library(dplyr)
library(tidyr)

df %>%
  unite("combined", fruit, color, taste, sep = " + ", remove = FALSE)

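I have been trying to remove the "+" when it appears at the start or the end, or when there is a blank before it, using the following regex, but it feels sloppy and does not seem to do exactly what I want: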

df %>%
  as_tibble() %>%
  mutate(across(any_of(combined), gsub, pattern = "^\\+|\\+  \\+  \\+  \\+|\\+  \\+  \\+|\\+  \\+|\\+$", replacement = "")) %>%
  mutate_if(is.character, trimws)

Any guidance would be greatly appreciated! Thank you!

o7jaxewo #1

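We can replace the blanks ("") with NA and then use na.rm = TRUE in unite, so that missing values are dropped together with their separator.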

library(dplyr)
library(tidyr)
df %>%
  mutate(across(everything(), ~ na_if(.x,  ""))) %>%
  unite(combined, everything(), sep = " + ", na.rm = TRUE, 
     remove = FALSE)
Output:
             combined  fruit  color   taste
1 apple + red + sweet  apple    red   sweet
2     orange + orange orange orange    <NA>
3       peach + sweet  peach   <NA>   sweet
4    purple + neutral   <NA> purple neutral
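
The same idea (drop the empty values first, then join whatever is left) can also be sketched in base R without the tidyverse. This is only an illustrative sketch and not part of the answer above; the helper name join_non_empty is made up for the example.

```r
# Rebuild the example data from the question.
fruit <- c("apple", "orange", "peach", "")
color <- c("red", "orange", "", "purple")
taste <- c("sweet", "", "sweet", "neutral")
df <- data.frame(fruit, color, taste, stringsAsFactors = FALSE)

# Hypothetical helper: keep only non-missing, non-empty values of one row
# and join them with the separator.
join_non_empty <- function(values, sep = " + ") {
  values <- values[!is.na(values) & values != ""]
  paste(values, collapse = sep)
}

# Apply the helper to every row (MARGIN = 1) of the three source columns.
df$combined <- unname(apply(df[c("fruit", "color", "taste")], 1, join_non_empty))

print(df$combined)
```

Because the empty values are removed before pasting, no separator is ever produced for them, so no regex cleanup is needed afterwards.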

2izufjch #2

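Create a function that takes two strings and joins them, inserting " + " only when both are non-empty, then use Reduce to apply it across the columns.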

library(dplyr)

Paste <- function(x, y) paste0(x, ifelse(x == "" | y == "", "", " + "), y)
df %>% mutate(combined = Reduce(Paste, .))

giving

   fruit  color   taste            combined
1  apple    red   sweet apple + red + sweet
2 orange orange             orange + orange
3  peach          sweet       peach + sweet
4        purple neutral    purple + neutral
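
To see why this works, it may help to unroll the fold by hand. A data frame is a list of columns, so Reduce(Paste, df) should be equivalent to Paste(Paste(fruit, color), taste). The sketch below assumes the same df and Paste as above; the names step1 and step2 are introduced only for illustration.

```r
# Rebuild the example data and the pairwise joining function from the answer.
fruit <- c("apple", "orange", "peach", "")
color <- c("red", "orange", "", "purple")
taste <- c("sweet", "", "sweet", "neutral")
df <- data.frame(fruit, color, taste, stringsAsFactors = FALSE)

Paste <- function(x, y) paste0(x, ifelse(x == "" | y == "", "", " + "), y)

# Step 1: join the first two columns element by element.
step1 <- Paste(fruit, color)

# Step 2: join the running result with the third column.
step2 <- Paste(step1, taste)

# Reduce performs the same left fold over the columns of df.
folded <- Reduce(Paste, df)

print(step1)
print(step2)
print(identical(step2, folded))
```

The separator is only added when both sides are non-empty, so an empty first column never leaves a leading " + " and an empty last column never leaves a trailing one.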
