I would like to separate a field using tidyr::separate, keep the separator, and use a negative lookbehind

Posted by nfzehxib on 2023-04-03 in Other


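I want to use separate with a negative lookbehind and keep the separator. My solution below does not keep the first capital letter of the last name.
There is an answer that does not use a negative lookbehind, and I cannot figure out how to adapt it to use one.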
How do I split a string with tidyr::separate in R and retain the values of the separator string?

tidyr::tibble(myname = c("HarlanNelson")) |>  
  tidyr::separate(col = myname, into = c("first", "last"), sep = "(?<!^)[[:upper:]]")
#> # A tibble: 1 × 2
#>   first  last 
#>   <chr>  <chr>
#> 1 Harlan elson

Created on 2022-10-20 by the reprex package (v2.0.1)
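The capital letter disappears because `[[:upper:]]` is part of the match, and whatever the separator pattern matches is consumed. Wrapping it in a lookahead makes the match zero-width, so nothing is removed. The following is a minimal base R sketch of that difference using `sub()` rather than `tidyr::separate()`; the variable names `consumed` and `kept` are only illustrative.

```r
x <- "HarlanNelson"

# Pattern from the question: the uppercase letter is part of the match,
# so it is replaced (consumed) together with the separator.
consumed <- sub("(?<!^)[[:upper:]]", " ", x, perl = TRUE)

# Zero-width variant: the lookahead only asserts that an uppercase letter
# follows, so the letter itself is left in place.
kept <- sub("(?<!^)(?=[[:upper:]])", " ", x, perl = TRUE)

print(consumed)  # "Harlan elson"
print(kept)      # "Harlan Nelson"
```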

tidyr::tibble(myname = c("HarlanNelson", "Another Person")) |>  
  tidyr::separate(col = myname, into = c("first", "last"), sep = c(" ", "(?<!^)[[:upper:]]"))
#> Warning in gregexpr(pattern, x, perl = TRUE): argument 'pattern' has length > 1
#> and only the first element will be used
#> Warning: Expected 2 pieces. Missing pieces filled with `NA` in 1 rows [1].
#> # A tibble: 2 × 2
#>   first        last  
#>   <chr>        <chr> 
#> 1 HarlanNelson <NA>  
#> 2 Another      Person

Created on 2022-10-20 by the reprex package (v2.0.1)

tidyr::tibble(myname = c("HarlanNelson", "Another Person", "someone else")) |>  
  tidyr::separate(col = myname, into = c("first", "last"), sep = c(" ", "(?<!^)[[:upper:]]"))
#> Warning in gregexpr(pattern, x, perl = TRUE): argument 'pattern' has length > 1
#> and only the first element will be used
#> Warning: Expected 2 pieces. Missing pieces filled with `NA` in 1 rows [1].
#> # A tibble: 3 × 2
#>   first        last  
#>   <chr>        <chr> 
#> 1 HarlanNelson <NA>  
#> 2 Another      Person
#> 3 someone      else

Created on 2022-10-20 by the reprex package (v2.0.1)
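The warnings above indicate that only the first element of a vector passed as `sep` is used, so the two separators have to be expressed as one regular expression. One option is alternation with `|`. The sketch below illustrates the idea in base R with a hypothetical helper, `split_name()`, which is not part of tidyr; whether the same combined pattern behaves identically when passed to `sep` is an assumption that is not tested here.

```r
# Hypothetical helper: split each name at the first space or at the first
# lowercase-to-uppercase boundary, whichever occurs first in the string.
split_name <- function(x) {
  pattern <- " |(?<=[[:lower:]])(?=[[:upper:]])"
  m <- regexpr(pattern, x, perl = TRUE)
  pos <- as.integer(m)             # start of the match, -1 if there is none
  len <- attr(m, "match.length")   # 1 for a space, 0 for a zero-width boundary
  found <- pos > 0
  first <- ifelse(found, substr(x, 1, pos - 1), x)
  last <- ifelse(found, substr(x, pos + len, nchar(x)), NA_character_)
  data.frame(first = first, last = last, stringsAsFactors = FALSE)
}

res <- split_name(c("HarlanNelson", "Another Person", "someone else"))
print(res)
```

Here the space is consumed when it is the separator, while the case boundary is zero-width and keeps the capital letter.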

r

Source: https://stackoverflow.com/questions/74142066/i-would-like-to-separate-a-field-using-tidyrseparate-and-keep-the-separator-and

1 Answer

mzsu5hc01#

This is what I came up with.
It is just my understanding of the answer at https://stackoverflow.com/a/51415101/4629916
by @Cameron,
applied to my question.

tidyr::tibble(myname = c("HarlanNelson", "Another Person", "someone else")) |>  
  tidyr::separate(col = myname, into = c("first", "last"), sep = "(?<=[[:lower:]])(?=[[:upper:]])", extra = 'merge', fill = 'right') |> 
  tidyr::separate(col = first, into = c("first", "last2"), sep = " ", fill = 'right', extra = 'merge') |> 
  dplyr::mutate(last = dplyr::coalesce(last, last2)) |>  
  dplyr::select(-last2)
#> # A tibble: 3 × 2
#>   first   last  
#>   <chr>   <chr> 
#> 1 Harlan  Nelson
#> 2 Another Person
#> 3 someone else

tidyr::tibble(myname = c("HarlanNelson", "Another Person", "someone else")) |>  
  tidyr::separate(col = myname, into = c("first", "last"), sep = "(?<!^)(?=[[:upper:]])", extra = 'merge', fill = 'right') |> 
  tidyr::separate(col = first, into = c("first", "last2"), sep = " ", extra = 'merge', fill = 'right') |> 
  dplyr::mutate(last = dplyr::coalesce(last, last2)) |> 
  dplyr::select(-last2)
#> # A tibble: 3 × 2
#>   first   last  
#>   <chr>   <chr> 
#> 1 Harlan  Nelson
#> 2 Another Person
#> 3 someone else
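Both pipelines end with the same table, but the two patterns do not match in the same places. A small base R check with `regexpr()` (a sketch, separate from the tidyr pipeline; `lower_upper` and `not_start` are illustrative names) shows where each pattern first matches:

```r
x <- c("HarlanNelson", "Another Person", "someone else")

# Start position of the first match for each pattern (-1 means no match).
lower_upper <- as.integer(regexpr("(?<=[[:lower:]])(?=[[:upper:]])", x, perl = TRUE))
not_start <- as.integer(regexpr("(?<!^)(?=[[:upper:]])", x, perl = TRUE))

print(lower_upper)  # 7 -1 -1
print(not_start)    # 7  9 -1
```

The negative-lookbehind version also matches just before the "P" in "Another Person", so the first split presumably leaves a trailing space in `first`, which the second split on " " then cleans up. The lowercase-to-uppercase version leaves that row untouched until the split on " ".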
