How to extract the strings between a set of symbols in R and store them in a vector

tmb3ates  posted on 2023-02-01 in Other
mystring <- "\n\n-Acanthosis nigricans\n-Hyperpigmentation\n-Hyperkeratosis\n-Skin fold regions\n-Neck\n-Groin\n-Axillae\n-Obesity \n-Drug-induced AN\n-Malignant AN"

I want to extract the items delimited by the \n- separators and store them as a character vector:

> mystring_extracted

 [1] "Acanthosis nigricans" "Hyperpigmentation"    "Hyperkeratosis"       "Skin fold regions"   
 [5] "Neck"                 "Groin"                "Axillae"              "Obesity"             
 [9] "Drug-induced AN"      "Malignant AN"

I tried the following, but it does not do what I need: it removes the separators and leaves everything in one string:

> gsub("\n-", "", mystring)
[1] "\nAcanthosis nigricansHyperpigmentationHyperkeratosisSkin fold regionsNeckGroinAxillaeObesity Drug-induced ANMalignant AN"

Answer #1 (bvjxkvbb)

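Use strsplit. It returns a list, which in this case has a single component that is almost the desired character vector. Use [[1]] to get that component, then [-1] to drop the junk first element (the "\n" that precedes the first separator).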

strsplit(mystring, "\n-")[[1]][-1]

giving:

 [1] "Acanthosis nigricans" "Hyperpigmentation"    "Hyperkeratosis"      
 [4] "Skin fold regions"    "Neck"                 "Groin"               
 [7] "Axillae"              "Obesity "             "Drug-induced AN"     
[10] "Malignant AN"
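
To see why both [[1]] and [-1] are needed, it can help to look at the intermediate result. The following is a minimal sketch of that inspection, not part of the original answer; the names parts and items are illustrative only.

```r
mystring <- "\n\n-Acanthosis nigricans\n-Hyperpigmentation\n-Hyperkeratosis\n-Skin fold regions\n-Neck\n-Groin\n-Axillae\n-Obesity \n-Drug-induced AN\n-Malignant AN"

# strsplit returns a list with one component per input string
parts <- strsplit(mystring, "\n-")
length(parts)      # one input string, so one list component

# The text before the first separator is kept as the first element
parts[[1]][1]      # the leftover newline

# Drop that leading element to keep only the items (hypothetical name: items)
items <- parts[[1]][-1]
length(items)
```

Because the separator is the two-character sequence newline plus hyphen, the hyphen inside "Drug-induced AN" is not treated as a split point.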

Here is a variation that first strips the junk from the start of the string, then splits, and finally calls unlist to get a character vector.

mystring |>
  trimws(whitespace = "[\n-]") |>
  strsplit("\n-") |>
  unlist()
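
Note that splitting on "\n-" leaves the trailing space in "Obesity ", whereas the expected output in the question shows "Obesity". Assuming that surrounding whitespace in each item is unwanted, one possible follow-up is to pass the split result through trimws with its default arguments. This is a sketch under that assumption rather than part of the original answer; the name mystring_extracted is borrowed from the question.

```r
mystring <- "\n\n-Acanthosis nigricans\n-Hyperpigmentation\n-Hyperkeratosis\n-Skin fold regions\n-Neck\n-Groin\n-Axillae\n-Obesity \n-Drug-induced AN\n-Malignant AN"

# Split on newline + hyphen, take the single list component,
# and drop the leading junk element
items <- strsplit(mystring, "\n-")[[1]][-1]

# Assumption: leading/trailing whitespace in each item should be removed
mystring_extracted <- trimws(items)

mystring_extracted[8]
```

trimws is vectorised, so it cleans every element of the character vector in one call and leaves interior spaces (as in "Skin fold regions") untouched.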
