How do I insert a comma into a pre-existing string in R?

2lpgd968 posted 8 months ago in Other

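My data looks similar to the example below:
| DATA |
| -- |
| PRETEND COUNTY JP |
| FAKE COUNTY,TX JP 1.1 |
| Madeup City,TX |
| Not Real County, JP 2.5 |
How can I add ",TX" to the counties that don't have it, such as the one in the first cell? I would like to end up with a dataset that looks like this:
| DATA |
| -- |
| PRETEND COUNTY,TX JP |
| FAKE COUNTY,TX JP 1.1 |
| Madeup City,TX |
| Not Real County,TX JP 2.5 |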

5jdjgkvh #1

I'm not sure of the exact requirements, but you could try a regular expression.

# build example data
df <- data.frame(
  DATA = c(
    "PRETEND COUNTY JP",
    "FAKE COUNTY,TX JP 1.1",
    "Madeup City,TX",
    "Not Real County, JP 2.5"
  ),
  stringsAsFactors = FALSE
)

# build regular expression
pattern <- stringr::regex('county', ignore_case = TRUE)

# use regular expression to make new desired column
df2 <-
  df |> 
    dplyr::mutate(
      DATA2 = 
        dplyr::case_when(
          stringr::str_detect(DATA, "TX") ~ DATA,
          TRUE ~ stringr::str_replace(DATA, pattern, "County, TX")
        )
    )

df2
#>                      DATA                       DATA2
#> 1       PRETEND COUNTY JP       PRETEND County, TX JP
#> 2   FAKE COUNTY,TX JP 1.1       FAKE COUNTY,TX JP 1.1
#> 3          Madeup City,TX              Madeup City,TX
#> 4 Not Real County, JP 2.5 Not Real County, TX, JP 2.5

Created on 2023-09-22 with reprex v2.0.2
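Note that the output above rewrites "COUNTY" as "County" in row 1 and leaves an extra comma in row 4. A minimal sketch of one possible variant follows, assuming the goal is the exact output shown in the question. It swaps in base R's `grepl()` and `sub()` for the stringr/dplyr calls so that it runs without extra packages; the vector name `x` and the result name `fixed` are made up for illustration.

```r
# Example strings taken from the answer above.
x <- c("PRETEND COUNTY JP",
       "FAKE COUNTY,TX JP 1.1",
       "Madeup City,TX",
       "Not Real County, JP 2.5")

# Strings that already contain "TX" are left untouched. Otherwise, match the
# word "county" in any case (captured as group 1) plus an optional trailing
# comma, and write group 1 back via \\1 so the original capitalisation is kept.
fixed <- ifelse(
  grepl("TX", x, fixed = TRUE),
  x,
  sub("(county),?", "\\1,TX", x, ignore.case = TRUE)
)

print(fixed)
```

Because the optional comma is consumed by the match, row 4 ends up with a single comma rather than two. Like the original answer, this sketch only handles strings containing "county"; a "City" entry without "TX" would pass through unchanged.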

zdwk9cvp #2

library(tidyverse)

# Assumes df holds the example strings in a character column named `address`.
df %>%
  mutate(address = str_replace(address, "(?i)(?<=(COUNTY|CITY)),?\\s(?=JP)", ", TX "))
                     address
1      PRETEND COUNTY, TX JP
2      FAKE COUNTY,TX JP 1.1
3             Madeup City,TX
4 Not Real County, TX JP 2.5

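How it works:

  1. (?i): case-insensitive flag
  2. (?<=(COUNTY|CITY)): positive lookbehind: match only if "COUNTY" or "CITY" appears immediately to the left of the match
  3. ,?\\s: match an optional comma followed by a whitespace character
  4. (?=JP): positive lookahead: match only if "JP" appears immediately to the right of the match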
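The pieces listed above can be checked one at a time. The following is a rough sketch in base R, assuming the PCRE engine (`perl = TRUE`) instead of stringr. The pattern is an adaptation, not the answer's exact one: the alternation is written directly inside the lookbehind, since PCRE expects each top-level lookbehind alternative to have a fixed length. The names `x`, `hit`, `matched`, and `result` are made up for illustration.

```r
# Example strings taken from the question.
x <- c("PRETEND COUNTY JP",
       "FAKE COUNTY,TX JP 1.1",
       "Madeup City,TX",
       "Not Real County, JP 2.5")

# (?i)               case-insensitive matching
# (?<=COUNTY|CITY)   lookbehind: "COUNTY" or "CITY" must sit just left of the match
# ,?\\s              the text actually consumed: optional comma plus one whitespace
# (?=JP)             lookahead: "JP" must sit just right of the match
pattern <- "(?i)(?<=COUNTY|CITY),?\\s(?=JP)"

# Locate the first match in each string (-1 means no match).
hit <- regexpr(pattern, x, perl = TRUE)

# Extract only the consumed text; the lookarounds contribute nothing to it.
matched <- regmatches(x, hit)

# Replace the consumed text with ", TX ".
result <- sub(pattern, ", TX ", x, perl = TRUE)

print(hit > 0)
print(matched)
print(result)
```

Only rows 1 and 4 match, because in rows 2 and 3 the comma after the place name is followed by "TX" rather than by whitespace and "JP". The consumed text is just the separator (a space in row 1, a comma and a space in row 4), which is why the surrounding words keep their original case.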
