R: Using an if/then condition with the |> pipe character

Posted by xjreopfe on 2023-01-15 in Other

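I need to extract the last names of several thousand people. The names are two or three words long, depending on whether there is a suffix. My approach is to count the number of words in each row, then run a different separate() call depending on the word count. The code below does not work, but it shows my idea: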

customers = data.frame(names=c("Jack Quinn III", "David Powell", "Carrie Green",
           "Steven Miller, Jr.", "Christine Powers", "Amanda Ramirez"))

customers |> 
  mutate(names_count = str_count(names, "\\w+")) |>
  {
  if(names_count == 2,
     separate(name, c("first_name", "last_name") ),
     separate(name, c("first_name", "last_name", "suffix") )
  )
  }

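This code cannot possibly work, and I do not have the ability to interpret the error messages. In fact, I am not sure whether the commas are needed in the if statement, because there are apparently functions that use both forms.
My thinking is that I can split the names into columns by doing the following: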

df |> 
  mutate() to count words |> 
  separate() to split columns based on count

But I cannot get even the simplest if statement to run.

Source: https://stackoverflow.com/questions/75112471/using-an-if-then-condition-with-the-pipe-character

3 answers

bksxznpy #1

We can use word() from stringr:

library(stringr)
library(dplyr)

customers |>
    mutate(last_name = word(names, 2))

Output:

               names last_name
1     Jack Quinn III     Quinn
2       David Powell    Powell
3       Carrie Green     Green
4 Steven Miller, Jr.   Miller,
5   Christine Powers    Powers
6     Amanda Ramirez   Ramirez
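
Note that row 4 in the output above keeps the trailing comma ("Miller,"), because word() splits on spaces only. As a minimal sketch, assuming a trailing punctuation character is never part of a real surname in this data, the result can be cleaned with str_remove(). The variable name cleaned is introduced here only for illustration:

```r
library(dplyr)
library(stringr)

customers <- data.frame(names = c("Jack Quinn III", "David Powell", "Carrie Green",
                                  "Steven Miller, Jr.", "Christine Powers", "Amanda Ramirez"))

# Take the second space-separated word, then strip any trailing punctuation
# (assumption: punctuation at the end of the word is not part of the surname)
cleaned <- customers |>
  mutate(last_name = str_remove(word(names, 2), "[[:punct:]]+$"))

print(cleaned)
```

This still assumes the surname is always the second word, so a name with a middle name or a two-word surname would need different handling.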

Answered on 2023-01-15

abithluo #2

Using str_extract():

library(dplyr)
library(stringr)
customers %>%
  mutate(last_name = str_extract(names, "^[A-Za-z]+\\s+([A-Za-z]+)", group = 1))

Output:

               names last_name
1     Jack Quinn III     Quinn
2       David Powell    Powell
3       Carrie Green     Green
4 Steven Miller, Jr.    Miller
5   Christine Powers    Powers
6     Amanda Ramirez   Ramirez
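
The group argument of str_extract() is only available in newer stringr releases (1.5.0 and later, as far as I know). On an older installation, a rough equivalent is str_match(), which returns a character matrix whose first column is the full match and whose second column is the first capture group. The variable names pattern and m below are illustrative only:

```r
library(stringr)

customers <- data.frame(names = c("Jack Quinn III", "David Powell", "Carrie Green",
                                  "Steven Miller, Jr.", "Christine Powers", "Amanda Ramirez"))

# Same pattern as in the answer: first word, whitespace, then the captured second word
pattern <- "^[A-Za-z]+\\s+([A-Za-z]+)"

# Column 1 is the whole match, column 2 is the first capture group
m <- str_match(customers$names, pattern)
customers$last_name <- m[, 2]

print(customers)
```

Because the capture group only accepts letters, the comma after "Miller" is left out of the result.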

Answered on 2023-01-15

x3naxklr #3

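You can drop the if entirely:

library(dplyr)
library(tidyr)
# Always split into three columns; two-word names get NA in suffix (with a warning)
customers %>% 
  separate(names, into = c("first_name", "last_name", "suffix"), sep=" ") %>% 
  select(last_name)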

If you want to avoid extra packages, you can use base R sub() with a regex:

> sub("[A-Za-z]+\\s+([A-Za-z]+)\\s?.*", "\\1", customers$names)
[1] "Quinn"   "Powell"  "Green"   "Miller"  "Powers"  "Ramirez"
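
On the comma question from the original post: in R, if is a control statement written as if (condition) expr else expr, with no commas, and it expects a single TRUE or FALSE rather than a whole column. The comma form belongs to functions such as base ifelse(test, yes, no) or dplyr's if_else(), which work element by element and therefore fit inside mutate(). The following is a sketch under those assumptions, not the asker's intended solution; the column name name_type, the variable result, and the sep pattern are choices made here for illustration:

```r
library(dplyr)
library(stringr)
library(tidyr)

customers <- data.frame(names = c("Jack Quinn III", "David Powell", "Carrie Green",
                                  "Steven Miller, Jr.", "Christine Powers", "Amanda Ramirez"))

result <- customers |>
  # Count words per row, then label each row with a vectorised conditional
  mutate(names_count = str_count(names, "\\w+"),
         name_type = if_else(names_count == 2, "no suffix", "has suffix")) |>
  # One separate() call covers both cases: split on optional comma plus whitespace,
  # and fill the missing suffix with NA on the right instead of warning
  separate(names, into = c("first_name", "last_name", "suffix"),
           sep = ",?\\s+", fill = "right")

print(result)
```

With fill = "right", the two-word names simply get NA in suffix, so no branching on the word count is needed to do the split itself.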

Answered on 2023-01-15
