R: How do I extract only the strings that contain a specific word?

Posted by fumotvh3 on 2023-03-27 in Other

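I am trying to extract only the descriptions that contain the word "bunt". I am not sure how to do this, because the position of the word varies across my large data set. Here is a sample of what I am working with: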

data <- data.frame(des = c("Austin Hedges bunt single", "Francisco Lindor Homerun",
                           "Pete Alonso out on a ground ball bunt",
                           "Yonathan Daza sac bunt ground out to shortstop",
                           "Jose Ramirez pop up to second baseman"))

Any help would be appreciated. Thank you.


7gs2gvoe #1

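In base R, we can use grepl to find the word and subset to keep the matching rows. The \\b on each side of the pattern marks a word boundary, so only the whole word "bunt" matches.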

subset(data, grepl("\\bbunt\\b", des))
Output:
                                             des
1                      Austin Hedges bunt single
3          Pete Alonso out on a ground ball bunt
4 Yonathan Daza sac bunt ground out to shortstop
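
To see what the word boundaries in the pattern above buy you, here is a minimal sketch comparing a plain substring match with a whole-word match. The three descriptions are hypothetical and are not taken from the question's data; they are only there to include a longer word ("bunted") and a capitalised "Bunt".

```r
# Hypothetical descriptions, invented for illustration only
des <- c("Austin Hedges bunt single",
         "Player bunted foul",     # "bunt" appears only inside "bunted"
         "Bunt attempt missed")    # capitalised

# Plain substring match: also hits "bunted", misses "Bunt"
substring_hit <- grepl("bunt", des)

# Whole-word match, case-sensitive: only the first description
word_hit <- grepl("\\bbunt\\b", des)

# Whole-word match, ignoring case: first and third descriptions
word_hit_any_case <- grepl("\\bbunt\\b", des, ignore.case = TRUE)

print(data.frame(des, substring_hit, word_hit, word_hit_any_case))
```

Whether ignore.case = TRUE is appropriate depends on how consistently the descriptions in the real data set are capitalised.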

zzzyeukh #2

Here is a solution using the dplyr and stringr packages.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(stringr)

data <-data.frame(des=c("Austin Hedges bunt single", "Francisco Lindor Homerun", 
                        "Pete Alonso out on a ground ball bunt", 
                        "Yonathan Daza sac bunt ground out to shortstop",
                        "Jose Ramirez pop up to second baseman"))

data %>%
  filter(stringr::str_detect(des, "bunt"))
#>                                              des
#> 1                      Austin Hedges bunt single
#> 2          Pete Alonso out on a ground ball bunt
#> 3 Yonathan Daza sac bunt ground out to shortstop

Created on 2023-03-24 with reprex v2.0.2
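
Note that the pattern "bunt" passed to str_detect matches any occurrence of those four letters, including inside longer words. If only the whole word should count, one possible variation is to wrap a word-boundary pattern in stringr's regex() helper, which also accepts ignore_case. This is a sketch, not part of the original answer, and the plays data frame below is hypothetical.

```r
library(dplyr)
library(stringr)

# Hypothetical descriptions, invented for illustration only
plays <- data.frame(des = c("Austin Hedges bunt single",
                            "Player bunted foul",
                            "Bunt attempt missed"))

# \\b marks word boundaries; regex() attaches the ignore_case option
whole_word <- plays %>%
  filter(str_detect(des, regex("\\bbunt\\b", ignore_case = TRUE)))

print(whole_word)
```

With these rows, "Player bunted foul" is dropped because "bunted" is a different word, while the capitalised "Bunt" is kept.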
