Replace multiple values in one variable with their character matches using a key in R

hk8txs48  posted on 2022-12-20  in Other

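I have a data frame of products (df), identified by df$id, with a variable (df$cat) that holds one or more digits separated by commas. Each digit corresponds to a particular "category". (A product can be assigned to more than one category.)
I want to use a key (key) to change these from numbers to strings.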

df <- data.frame(id=c("id1","id2","id3","id4","id5","id6","id7","id8"),
                 cat=c("0,2,6","0","2","2,6","4,6","6","6","6"))

> head(df)
   id   cat
1 id1 0,2,6
2 id2     0
3 id3     2
4 id4   2,6
5 id5   4,6
6 id6     6

key <- data.frame(cat=c("0","2","4","6"),
                 name=c("kitchen","bathroom","dining","hall"))

So I would end up with:

df.d <- data.frame(id=c("id1","id2","id3","id4","id5","id6","id7","id8"),
                 cat=c("0,2,6","0","2","2,6","4,6","6","6","6"),
                 location=c("kitchen,bathroom,hall","kitchen","bathroom","bathroom,hall","dining,hall","hall","hall","hall"))

> df.d
   id   cat              location
1 id1 0,2,6 kitchen,bathroom,hall
2 id2     0               kitchen
3 id3     2              bathroom
4 id4   2,6         bathroom,hall
5 id5   4,6           dining,hall
6 id6     6                  hall
7 id7     6                  hall
8 id8     6                  hall

I tried using dplyr's recode, but without success.
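The attempted call is not shown, so the following is only a guess at what may have gone wrong: recode() matches whole values, so a combined string such as "0,2,6" never equals any single key and is left untouched. A minimal sketch, assuming the key was spliced in as a named vector (`lookup` and `recoded` are illustrative names, not from the question):

```r
library(dplyr)

df <- data.frame(id = c("id1","id2","id3","id4","id5","id6","id7","id8"),
                 cat = c("0,2,6","0","2","2,6","4,6","6","6","6"))
key <- data.frame(cat = c("0","2","4","6"),
                  name = c("kitchen","bathroom","dining","hall"))

## named vector: values are the category names, names are the codes
lookup <- setNames(key$name, key$cat)

## recode() compares each whole string against the codes, so only
## rows holding a single code are replaced
recoded <- recode(df$cat, !!!lookup)
recoded
```

Rows with a single code are translated, while "0,2,6", "2,6" and "4,6" come back unchanged, which suggests the strings have to be split on the comma before the lookup.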


ni65a41a  #1

One approach:

library(dplyr)

## pull() with TWO columns returns the first column (name) as a vector,
## named with the values of the second column (cat)
key_vector <- key |> pull(name, cat)

df.d <- 
    df |>
    rowwise() |>
    ## split each cat string on commas, look up every code by name,
    ## and paste the matches back together with the same separator
    mutate(location = paste(key_vector[unlist(strsplit(cat, ','))],
                            collapse = ',')) |>
    ungroup()
df.d |> head(3)
# # A tibble: 3 x 3
#   id    cat   location             
#   <chr> <chr> <chr>                
# 1 id1   0,2,6 kitchen,bathroom,hall
# 2 id2   0     kitchen              
# 3 id3   2     bathroom
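For comparison, the same split-and-lookup idea can be sketched in base R without rowwise(). This is an alternative, not part of the answer above; `lookup` is an illustrative name, and the sketch assumes df$cat is a character column (the default for data.frame() since R 4.0).

```r
df <- data.frame(id = c("id1","id2","id3","id4","id5","id6","id7","id8"),
                 cat = c("0,2,6","0","2","2,6","4,6","6","6","6"))
key <- data.frame(cat = c("0","2","4","6"),
                  name = c("kitchen","bathroom","dining","hall"))

## named vector: values are the category names, names are the codes
lookup <- setNames(key$name, key$cat)

## strsplit() is vectorised and returns one character vector of codes
## per row; vapply() maps each of them to a single comma-joined string
df$location <- vapply(
  strsplit(df$cat, ",", fixed = TRUE),
  function(codes) paste(lookup[codes], collapse = ","),
  character(1)
)
df
```

A code that is missing from key would appear as the text "NA" in the result, so it may be worth checking the codes against key$cat first.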
