First entry from a string split in R

Posted by wmtdaxz3 on 2023-02-20 in Other


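I have a column people$food with entries such as chocolate or apple-orange-strawberry.
I want to split people$food on - and get the first entry from the split.
In Python the solution would be food.split('-')[0], but I can't find an equivalent in R.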

Source: https://stackoverflow.com/questions/33683862/first-entry-from-string-split

7 Answers


5jdjgkvh1#

If you need to extract the first (or nth) entry from each split, use:

word <- c('apple-orange-strawberry','chocolate')

sapply(strsplit(word,"-"), `[`, 1)
#[1] "apple"     "chocolate"

Or, faster and more explicit:

vapply(strsplit(word,"-"), `[`, 1, FUN.VALUE=character(1))
#[1] "apple"     "chocolate"

Both snippets cope with selecting any position in the split list, and they handle out-of-range positions by returning NA:

vapply(strsplit(word,"-"), `[`, 2, FUN.VALUE=character(1))
#[1] "orange" NA
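
Applied to the column from the question, the same idea might look like the sketch below. The `people` data frame, its `name` column, and the new `first_food` column are made up for illustration; only `people$food` comes from the question. `fixed = TRUE` is added so that `-` is treated as a literal string rather than a regular expression.

```r
# Hypothetical data frame standing in for the asker's `people`
people <- data.frame(
  name = c("Ann", "Bob"),
  food = c("chocolate", "apple-orange-strawberry"),
  stringsAsFactors = FALSE
)

# Split each entry on a literal "-" and keep the first piece of each split
people$first_food <- vapply(
  strsplit(people$food, "-", fixed = TRUE),
  `[`, 1,
  FUN.VALUE = character(1)
)

people$first_food
# [1] "chocolate" "apple"
```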


bakd9h0s2#

For example:

word <- 'apple-orange-strawberry'

strsplit(word, "-")[[1]][1]
# [1] "apple"

Or equivalently:

unlist(strsplit(word, "-"))[1]

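Essentially, strsplit() returns a list, so its elements have to be accessed either by indexing into the list (the first case) or by unlisting it (the second case).
To apply the method to an entire column: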

first.word <- function(my.string){
    unlist(strsplit(my.string, "-"))[1]
}

words <- c('apple-orange-strawberry', 'orange-juice')

sapply(words, first.word)
# apple-orange-strawberry            orange-juice 
#                 "apple"                "orange"

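
The result of sapply() over a character vector carries the input strings as names, which is often not wanted when assigning back to a column. A minimal sketch of one way to drop them, reusing the first.word helper from this answer (the variable name `plain` is just for illustration):

```r
# Same helper as in the answer above
first.word <- function(my.string) {
  unlist(strsplit(my.string, "-"))[1]
}

words <- c("apple-orange-strawberry", "orange-juice")

# USE.NAMES = FALSE stops sapply() from naming the result after the inputs
plain <- sapply(words, first.word, USE.NAMES = FALSE)
plain
# [1] "apple"  "orange"
```

Wrapping the original call in unname() should give the same unnamed vector.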

k3fezbri3#

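I would use sub() instead. Since you want the first word before the split, you can simply remove everything from the first - onward, and what is left is the first entry.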

sub("-.*", "", people$food)

For example:

x <- c("apple", "banana-raspberry-cherry", "orange-berry", "tomato-apple")
sub("-.*", "", x)
# [1] "apple"  "banana" "orange" "tomato"

Otherwise, if you want to use strsplit(), you can collect the first elements with vapply():

vapply(strsplit(x, "-", fixed = TRUE), "[", "", 1)
# [1] "apple"  "banana" "orange" "tomato"
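
One thing to watch with both approaches: the pattern is a regular expression unless told otherwise. That makes no difference for -, but it does for delimiters such as a dot. The sketch below uses a made-up vector with "." as the delimiter (not from the question) to show the two ways of handling it; the names `by_sub` and `by_split` are illustrative only.

```r
# Hypothetical input that uses "." instead of "-" as the delimiter
x <- c("apple.orange", "chocolate")

# With sub(), escape the dot so it matches a literal "." rather than any character
by_sub <- sub("\\..*", "", x)

# With strsplit(), fixed = TRUE treats the delimiter as a plain string
by_split <- vapply(strsplit(x, ".", fixed = TRUE), `[`, character(1), 1)

by_sub
# [1] "apple"     "chocolate"
by_split
# [1] "apple"     "chocolate"
```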


qvtsj1bj4#

I would suggest using head rather than [ in R.

word <- c('apple-orange-strawberry','chocolate')
sapply(strsplit(word, "-"), head, 1)
# [1] "apple"     "chocolate"


6pp0gazn5#

A dplyr/magrittr approach:

library(magrittr)
library(dplyr)

word = c('apple-orange-strawberry', 'chocolate')

strsplit(word, "-") %>% sapply(extract2, 1)
# [1] "apple"     "chocolate"


dgsult0t6#

Use str_remove() to delete everything after the pattern:

df <- data.frame(words = c('apple-orange-strawberry', 'chocolate'))

dplyr::mutate(df, short = stringr::str_remove(words, "-.*")) # mutate method

stringr::str_remove(df$words, "-.*")           # str_remove example

stringr::str_replace(df$words, "-.*", "")      # str_replace method

stringr::str_split_fixed(df$words, "-", n=2)[,1]        # str_split method similar to original question's Python code

tidyr::separate(df, words, into = c("short", NA)) # using the separate function

# Output of the mutate() call:
#                     words     short
# 1 apple-orange-strawberry     apple
# 2               chocolate chocolate


mzmfm0qo7#

stringr 1.5.0 introduced str_split_i to do this easily:

library(stringr)

str_split_i(c('apple-orange-strawberry','chocolate'), "-", 1)
# [1] "apple"     "chocolate"

The third argument is the index to extract. Another new feature is that a negative value indexes from the right:

str_split_i(c('apple-orange-strawberry','chocolate'), "-", -1)
# [1] "strawberry" "chocolate"
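
If an older stringr (or no stringr at all) is available, taking the last piece can be approximated in base R. This is only a sketch: `last_piece` is a hypothetical helper, not a stringr function, and it assumes every input string is non-empty.

```r
# Hypothetical base R helper: return the last piece of each string after
# splitting on a literal separator. Assumes no element of x is "".
last_piece <- function(x, sep = "-") {
  pieces <- strsplit(x, sep, fixed = TRUE)
  # p[length(p)] picks the final element of each split
  vapply(pieces, function(p) p[length(p)], FUN.VALUE = character(1))
}

last_piece(c("apple-orange-strawberry", "chocolate"))
# [1] "strawberry" "chocolate"
```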

