Regex to find a word only and following special characters/numbers/dot in dplyr

Posted by bxfogqkk on 2023-04-07 in Other

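I need to find the rows whose text contains the term info anywhere, where info has:
1. no characters directly before or after it
2. a dot or any other special character after it
3. one or more digits after it
Here is a snapshot of the data that may help: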

df_new <- data.frame(
  text = c('info is given', 'he is given info. in the class',
           'she needs info2', 'why not having information',
           'his info# missing', 'info12 and packages are given',
           'parainfo is ready', 'info. was awarded',
           'meeting is with .info'))

> df_new
                            text
1                  info is given
2 he is given info. in the class
3                she needs info2
4     why not having information
5              his info# missing
6  info12 and packages are given
7              parainfo is ready
8              info. was awarded
9          meeting is with .info

I am using this code, but it does not capture everything I need:

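library(dplyr)
library(stringr)

# Only matches "info" bounded by whitespace or the string edges on both sides
df_new %>%
  mutate(text = tolower(text)) %>%
  mutate(string_detected = as.integer(str_detect(text, "(^|\\s)info(\\s|$)")))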

So the desired result is:

                          text string_detected
                 info is given               1
he is given info. in the class               1
               she needs info2               1
    why not having information               0
             his info# missing               1
 info12 and packages are given               1
             parainfo is ready               0
             info. was awarded               1
         meeting is with .info               0

Many thanks!
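
The attempted pattern requires whitespace or the end of the string immediately after info, so on the sample data it only flags "info is given". One possible adjustment, offered as a sketch rather than a definitive answer, is to keep the "start of string or whitespace" check before info and replace the trailing check with a negative lookahead that rejects a following letter. The pattern and the name `info_pattern` are assumptions, not part of the original post, and "special character" is read here as "anything that is not a lowercase letter".

```r
library(dplyr)
library(stringr)

# Sample data copied from the question.
df_new <- data.frame(
  text = c("info is given", "he is given info. in the class",
           "she needs info2", "why not having information",
           "his info# missing", "info12 and packages are given",
           "parainfo is ready", "info. was awarded",
           "meeting is with .info")
)

# Assumed pattern (not from the original post):
#   (^|\\s)    "info" starts the string or follows whitespace
#   info       the literal term
#   (?![a-z])  whatever comes next, if anything, is not a lowercase letter
info_pattern <- "(^|\\s)info(?![a-z])"

result <- df_new %>%
  mutate(text = tolower(text)) %>%
  mutate(string_detected = as.integer(str_detect(text, info_pattern)))

print(result)
```

Because the text is lowercased first, `[a-z]` is enough to exclude "information", while "parainfo" and ".info" are excluded by the leading `(^|\\s)` check. If only specific symbols should count as special characters, the lookahead could be swapped for an explicit character class.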

regex

Source: https://stackoverflow.com/questions/75943645/regex-to-find-a-word-only-and-following-special-characters-numbers-dot-in-dplyr