How to use stringr::str_match_all inside dplyr::mutate in the tidyverse pipe

Posted by lb3vh1jj on 2023-04-22 in Other


Using stringr::str_match, I can create a column containing the characters that follow "H45" for the first occurrence of "H45" in each row.

library(dplyr)
library(stringr)

df <- tibble::tibble(A = c("H459 A452 H4544", "A452", "H4535"))

df <- df %>% mutate(H45_value = 
           str_match(A, 'H45([[0-9]]{1,2})') %>% 
           .[,2])

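I would like to use stringr::str_match_all to create a column containing the characters that follow *every* occurrence of "H45" in each row. However, I cannot get str_match_all to work inside the tidyverse pipe. I think this is because I do not know the correct syntax for calling [[1]][,2] within a pipe.
It works as a standalone line of code: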
str_match_all("H459 A452 H4544", 'H45([[0-9]]{1,2})')[[1]][,2]
I would like the output to look like this, where the first value of "H45_value" is a list or similar:
| A | H45_value |
| --- | --- |
| H459 A452 H4544 | 9, 44 |
| A452 | NA |
| H4535 | 35 |


Source: https://stackoverflow.com/questions/76049930/how-to-use-stringrstr-match-all-inside-dplyrmutate-in-the-tidyverse-pipe

1 Answer

8yparm6h1#

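str_extract_all() is a better choice of function here, because by default it returns a list of character vectors holding the extracted values, rather than the list of matrices that str_match_all() returns. So you can do this: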

library(dplyr)
library(stringr)

df %>%
  mutate(H45_value = str_extract_all(A, "(?<=H45)\\d+"))

# A tibble: 3 × 2
  A               H45_value
  <chr>           <list>   
1 H459 A452 H4544 <chr [2]>
2 A452            <chr [0]>
3 H4535           <chr [1]>

where H45_value contains:

[[1]]
[1] "9"  "44"

[[2]]
character(0)

[[3]]
[1] "35"
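
The desired output in the question shows NA for rows without a match and the matches joined as text, whereas the list column above holds character(0) for such rows. One possible way to bridge the two is sketched below. The column name H45_string is hypothetical and not part of the original answer, and collapsing with ", " is only an assumption about the preferred display format.

```r
library(dplyr)
library(stringr)

df <- tibble::tibble(A = c("H459 A452 H4544", "A452", "H4535"))

res <- df %>%
  mutate(
    # List column: one character vector of matches per row
    H45_value = str_extract_all(A, "(?<=H45)\\d+"),
    # Hypothetical helper column: join the matches into one string,
    # or use NA when a row has no match at all
    H45_string = vapply(
      H45_value,
      function(x) if (length(x) == 0) NA_character_ else paste(x, collapse = ", "),
      character(1)
    )
  )

res$H45_string
```

Keeping H45_value as a list column preserves the individual matches for later processing, while the string column is only convenient for display.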

If you do want to use str_match_all(), you need to iterate over the results and extract the second column from each matrix:

df %>%
  mutate(H45_value = lapply(str_match_all(A, 'H45([[0-9]]{1,2})'), `[`, , 2))
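
To make the iteration step more concrete, the following sketch inspects what str_match_all() returns for a character vector and compares the `[` shorthand with an explicit anonymous function. The variable names (x, m, via_function, via_bracket) are illustrative only, and the pattern is written with a plain [0-9] class as a simplification of the one in the question.

```r
library(stringr)

x <- c("H459 A452 H4544", "A452", "H4535")
m <- str_match_all(x, "H45([0-9]{1,2})")

# str_match_all() returns a list with one matrix per input string:
# column 1 holds the full match, column 2 the first capture group
class(m)
nrow(m[[1]])  # one row per match found in the first string
nrow(m[[2]])  # no rows when the string has no match

# Explicit version: take column 2 of every matrix in the list
via_function <- lapply(m, function(mat) mat[, 2])

# Shorthand used in the answer: call `[` with an empty row index and column 2
via_bracket <- lapply(m, `[`, , 2)

identical(via_function, via_bracket)
```

This also shows why [[1]][,2] alone cannot work inside mutate(): it only reaches the matrix for the first row, whereas mutate() needs one result per row.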

2023-04-22
