How to extract all matches of a pattern and combine the distinct matches using R?

Posted by rjee0c15 on 2023-01-28 in Other


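I want to use a regex pattern to extract all matches from a string and then combine only the distinct matches into a single string.
Specifically, I want to extract every word that comes immediately before the word "films" and then join only the distinct words. I tried the following script, but it combines all matches, including duplicates: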

library(stringr)  # str_extract_all()
library(purrr)    # map_chr()

text1 <- "Netflix announced 34 new Korean films to hit the streaming platform in 2023, along with 12 Japanese films. The upcoming titles, which Netflix calls their “biggest-ever lineup of Korean films and series."

pattern <- "\\b[[:alpha:]]+\\b(?=\\sfilms)"

map_chr(str_extract_all(text1, pattern), paste, collapse = " | ")

> 'Korean | Japanese | Korean'

Expected output:

'Korean | Japanese'

Source: https://stackoverflow.com/questions/75165264/how-to-extract-all-matches-of-pattern-and-combine-distinct-matches-using-r

2 Answers


5rgfhyps1#

Try this: take the first (and only) element of the list returned by str_extract_all(), drop the duplicates with unique(), and then collapse what is left.

library(stringr)

text1 <- "Netflix announced 34 new Korean films to hit the streaming platform in 2023, along with 12 Japanese films. The upcoming titles, which Netflix calls their “biggest-ever lineup of Korean films and series."

pattern <- "\\b[[:alpha:]]+\\b(?=\\sfilms)"

paste(unique(str_extract_all(text1, pattern)[[1]]), collapse = " | ")

This gives:

"Korean | Japanese"
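The `[[1]]` above handles a single input string. If several strings need the same treatment, one possible sketch is to apply `unique()` and `paste()` to each element of the list that `str_extract_all()` returns. The helper name `distinct_matches` and the sample `texts` vector are made up for illustration and are not part of the original answer.

```r
library(stringr)

# Hypothetical helper: returns one collapsed string of distinct matches
# per element of x.
distinct_matches <- function(x, pattern, sep = " | ") {
  # str_extract_all() returns a list with one character vector per input string
  matches <- str_extract_all(x, pattern)
  vapply(matches, function(m) paste(unique(m), collapse = sep), character(1))
}

# Made-up sample inputs
texts <- c(
  "34 new Korean films, 12 Japanese films and more Korean films",
  "French films only",
  "no match here"
)
pattern <- "\\b[[:alpha:]]+\\b(?=\\sfilms)"

result <- distinct_matches(texts, pattern)
print(result)
# [1] "Korean | Japanese" "French"            ""
```

A string with no match yields an empty string here, because `paste()` with `collapse` on a zero-length vector returns `""`.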

Posted 2023-01-28

fruv7luv2#

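Unlist the result as in the code below, then keep only the unique elements. Note that paste() has to be called once on the whole vector: mapping paste over the individual matches with map_chr() would return them uncollapsed.

library(stringr)

text1 <- "Netflix announced 34 new Korean films to hit the streaming platform in 2023, along with 12 Japanese films. The upcoming titles, which Netflix calls their “biggest-ever lineup of Korean films and series."

pattern <- "\\b[[:alpha:]]+\\b(?=\\sfilms)"

# unlist() flattens the list from str_extract_all(); unique() drops repeated matches
paste(unique(unlist(str_extract_all(text1, pattern))), collapse = " | ")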
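For readers who prefer not to load stringr, the same idea can be sketched in base R with `gregexpr()` and `regmatches()`. This is an alternative, not something either answer proposed, and it assumes `perl = TRUE` so that the lookahead `(?=...)` is accepted. The sample text is a shortened, ASCII-only variant of the one in the question.

```r
# Shortened, ASCII-only variant of the sample text (an assumption for illustration)
text1 <- paste(
  "Netflix announced 34 new Korean films to hit the streaming platform in 2023,",
  "along with 12 Japanese films. The upcoming titles include Korean films and series."
)

pattern <- "\\b[[:alpha:]]+\\b(?=\\sfilms)"

# perl = TRUE switches to PCRE, which supports the (?=...) lookahead
matches <- regmatches(text1, gregexpr(pattern, text1, perl = TRUE))[[1]]

# Keep the first occurrence of each match, then join them with " | "
result <- paste(unique(matches), collapse = " | ")
print(result)
# [1] "Korean | Japanese"
```

`unique()` preserves the order of first appearance, so the result follows the order in which the words occur in the text.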

Posted 2023-01-28
