How to remove a specific symbol from an entire column in R

Posted by nhjlsmyf on 2022-12-06 in Other


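I would like to know how to remove a specific symbol from an entire column. This is what the original data looks like: (image: original data).
The only element I want to keep is the first word.
Here is my full dataset:
Here is the background information on the data: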

library("dplyr")
library("stringr")
library("tidyverse")
library("ggplot2")

# Load the .csv into R; you can do this in one of two ways.
# Option 1: read the file you downloaded from Kaggle
# spotiify_origional <- read.csv("charts.csv")
# Option 2: read it directly from GitHub
spotiify_origional <- read.csv("https://raw.githubusercontent.com/info201a-au2022/project-group-1-section-aa/main/data/charts.csv")
View(spotiify_origional)
# Filter down the data:
# drop the track id, explicit, and duration columns
spotify_modify <- spotiify_origional %>% 
  select(name, country, date, position, streams, artists, genres = artist_genres)

# returns only the data from 2022
# this is the data set you should use for the project
spotify_2022 <- spotify_modify %>% 
  filter(date >= "2022-01-01") %>% 
  arrange(date) %>% 
  group_by(date)

spotify_2022_global <- spotify_modify %>% 
  filter(date >= "2022-01-01") %>% 
  filter(country == "global") %>% 
  arrange(date) %>% 
  group_by(streams)
View(spotify_2022_global)

This is what I did:

top_15 <- spotify_2022_global[order(spotify_2022_global$streams, decreasing = TRUE), ]
top_15 <- top_15[1:15,]
top_15$streams <- as.numeric(top_15$streams)
View(top_15)  

top_15 <- top_15 %>% 
  separate(genres, c("genres"), sep = ',')
top_15$genres<-gsub("]","",as.character(top_15$genres))
View(top_15)

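And now the name looks like this:
(image: what the name looks like now)
I tried using the same gsub function to remove the remaining brackets and quotation marks, but it did not work.
What should I do at this point? Any suggestions would be a great help. Thank you!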

r

Source: https://stackoverflow.com/questions/74625659/how-can-i-remove-a-specific-symbol-from-for-an-entire-column

2 Answers


wi3ka0sx #1

You can do this with a combination of gsub, to remove the unwanted characters, and stringr::word(), which is a convenient way to extract words.

w <- "[firstWord, secondWord, thirdWord]"

stringr::word(gsub('[\\[,\']', '', w),1)
#> [1] "firstWord"

This also works for w <- "['firstWord', 'secondWord', 'thirdWord']".
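As a rough illustration of how this behaves on a whole column, here is a minimal sketch. The genres vector is a made-up stand-in for top_15$genres, not real values from charts.csv. It also shows one caveat: the pattern above never removes "]", so a row with a single genre keeps its closing bracket unless the pattern is extended.

```r
library(stringr)

# Hypothetical stand-in for top_15$genres (not real values from charts.csv)
genres <- c("['pop', 'dance pop']", "['latin', 'reggaeton']", "['rap']")

# The answer's pattern removes "[", "," and "'" but leaves "]" in place
partly_cleaned <- gsub("[\\[,']", "", genres)
first_try <- word(partly_cleaned, 1)  # the single-genre row still ends in "]"

# Variant that also strips "]" before taking the first word
cleaned <- gsub("\\[|\\]|,|'", "", genres)
first_word <- word(cleaned, 1)
print(first_word)
```

Both gsub and word() are vectorised, so the same two calls can be assigned straight back to a data frame column. Note that word() splits on spaces, so a multi-word first genre such as "dance pop" would be cut down to "dance".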

2022-12-06

dbf7pr2w #2

top_15$genres <- gsub("]|\\[|[']","",as.character(top_15$genres))

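Here the regular expression "]|\\[|[']" uses the | (OR) character to match any of several things, namely:

] a right square bracket
\\[ a left square bracket
['] a single quote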
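The question does not show the gsub call that failed, so the following is only a guess at the cause: if the pattern was a bare "[", R treats it as the start of an unclosed character class and refuses to compile it. A minimal sketch, using a made-up cell value x, of the failure and of two ways around it:

```r
x <- "['pop', 'dance pop']"  # hypothetical cell value

# A bare "[" opens a character class that is never closed, so the
# pattern is invalid; tryCatch() captures the resulting error object.
bare <- tryCatch(suppressWarnings(gsub("[", "", x)), error = function(e) e)

escaped <- gsub("\\[", "", x)               # escape the metacharacter
literal <- gsub("[", "", x, fixed = TRUE)   # or match the text literally

# All three symbols in one pass, using the pattern explained above
cleaned <- gsub("]|\\[|[']", "", x)
print(cleaned)
```

The right bracket needs no escaping when it appears outside a character class, which is why the original gsub("]", ...) call in the question worked while the left bracket needs "\\[" or fixed = TRUE.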

Tidying up the "This is what I did" code, you get:

spotify_2022_global %>% 
  ungroup() %>%  # drop the earlier group_by(streams)
  arrange(desc(streams)) %>% 
  head(15) %>%
  mutate(streams = as.numeric(streams),
         genres = gsub("]|\\[|[']", "", genres),  # remove brackets and quote marks
         genres = sub(",.*", "", genres))  # keep only the first item, before the first comma

This gives:
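The original output is not reproduced here. As a stand-in, this is a minimal base R sketch of the same clean-up on a made-up three-row data frame; the column names follow the question, but the rows are invented for illustration and are not taken from charts.csv.

```r
# Hypothetical miniature of top_15; the values are invented
top <- data.frame(
  name    = c("Song A", "Song B", "Song C"),
  streams = c("300", "200", "100"),
  genres  = c("['pop', 'dance pop']", "['latin', 'reggaeton']", "['rap']"),
  stringsAsFactors = FALSE
)

top$streams <- as.numeric(top$streams)

# Remove brackets and single quotes, then keep the text before the first comma
no_brackets <- gsub("]|\\[|[']", "", top$genres)
top$genres  <- trimws(sub(",.*", "", no_brackets))

print(top)
```

Taking everything before the first comma, rather than the first space-separated word, keeps multi-word genres intact when they come first in the list.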

2022-12-06
