Extracting a string inside parentheses in R

q3qa4bjr  posted on 2023-01-06  in Other

I need to extract some information from a string.
Here is a sample dataset:

data <- data.frame(id = c(1,2),
                  text = c("GK_Conciencia fonologica (FSS)_Form_Number_1.csv",
                           "G1_Conciencia fonologica (FSL)_Form_Number_3.csv"))

> data
  id                                             text
1  1 GK_Conciencia fonologica (FSS)_Form_Number_1.csv
2  2 G1_Conciencia fonologica (FSL)_Form_Number_3.csv

Basically, I need to extract the text inside the parentheses and the number that follows Form_Number.
How can I get the desired output shown here?

  id                                             text cat form
1  1 GK_Conciencia fonologica (FSS)_Form_Number_1.csv FSS    1
2  2 G1_Conciencia fonologica (FSL)_Form_Number_3.csv FSL    3

a6b3iqyw  #1

A solution using gsub:

library(dplyr)

data %>% 
  mutate(cat = gsub(".*\\(|\\).*", "", text), 
         form = gsub(".*Form_Number_|\\.csv$", "", text))
  id                                             text cat form
1  1 GK_Conciencia fonologica (FSS)_Form_Number_1.csv FSS    1
2  2 G1_Conciencia fonologica (FSL)_Form_Number_3.csv FSL    3
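To make the two patterns easier to follow, here is a minimal base R sketch of the same idea applied to a plain character vector. The names `cat_code` and `unmatched` and the file name "notes.txt" are made up for illustration. Note that gsub returns character, so `form` needs an explicit conversion if an integer is wanted.

```r
text <- c("GK_Conciencia fonologica (FSS)_Form_Number_1.csv",
          "G1_Conciencia fonologica (FSL)_Form_Number_3.csv")

# ".*\\(" deletes everything up to and including "(";
# "\\).*" deletes ")" and everything after it.
cat_code <- gsub(".*\\(|\\).*", "", text)

# gsub() always returns character, so convert when a number is needed.
form <- as.integer(gsub(".*Form_Number_|\\.csv$", "", text))

# Caveat: when nothing matches, gsub() returns the input unchanged, not NA.
# "notes.txt" is a hypothetical file name used only to show this.
unmatched <- gsub(".*\\(|\\).*", "", "notes.txt")
```

If some file names might not follow the naming scheme, an approach based on capture groups is probably safer, because a failed match is then easier to detect.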

jrcvhitl  #2

Using str_extract (the group argument requires stringr 1.5.0 or later):

library(dplyr)
library(stringr)
data %>% 
  mutate(cat = str_extract(text, "\\(([^)]+)", group = 1),
         form = as.integer(str_extract(text, "Number_(\\d+)", group = 1)))

Output:

  id                                             text cat form
1  1 GK_Conciencia fonologica (FSS)_Form_Number_1.csv FSS    1
2  2 G1_Conciencia fonologica (FSL)_Form_Number_3.csv FSL    3

Or using extract from tidyr:

library(tidyr)
extract(data, text, into = c("cat", "form"), 
        ".*\\(([^)]+).*_Number_(\\d+)\\..*", remove = FALSE, 
        convert = TRUE)
  id                                             text cat form
1  1 GK_Conciencia fonologica (FSS)_Form_Number_1.csv FSS    1
2  2 G1_Conciencia fonologica (FSL)_Form_Number_3.csv FSL    3

Or using base R:

cbind(data, strcapture(".*\\(([^)]+)\\)_Form_Number_(\\d+)\\..*", 
                       data$text,
                       data.frame(cat = character(), form = integer())))
  id                                             text cat form
1  1 GK_Conciencia fonologica (FSS)_Form_Number_1.csv FSS    1
2  2 G1_Conciencia fonologica (FSL)_Form_Number_3.csv FSL    3
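One thing worth knowing about the strcapture approach is how it treats strings that do not fit the pattern. The sketch below assumes a second, made-up file name ("notes.txt") that lacks both the parentheses and the Form_Number part. The names `files` and `parsed` are illustrative only.

```r
files <- c("GK_Conciencia fonologica (FSS)_Form_Number_1.csv",
           "notes.txt")  # hypothetical name that does not follow the scheme

# The prototype data frame sets the column names and types of the result.
parsed <- strcapture(".*\\(([^)]+)\\)_Form_Number_(\\d+)\\..*",
                     files,
                     data.frame(cat = character(), form = integer()))

# Rows whose string did not match are filled with NA in every column,
# so they can be found with is.na().
bad_rows <- which(is.na(parsed$cat))
```

Because the result always has one row per input string, it can be bound back onto the original data frame with cbind even when some rows fail to match.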
