R: Paste multiple column values together by index [duplicate]

Posted by 2admgd59 on 2023-01-15 in Other
    • This question already has answers here:

paste two data.table columns (4 answers)
concatenate values across columns in data.table, row by row (1 answer)
Closed 4 days ago.
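I need to create a column called "combinations" in my data table that holds the values of every column from column 4 through the last column, pasted together. I will run this line of code on several data tables whose column counts differ, so I don't always know the index of the last column. The first column to include is always column 4.
I know there are functions that work well with multiple column names, but not with multiple column indices. Does anyone know how to do this?
Example using column names rather than column indices: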

mycols<-c("apple", "orange", "banana")
data[, combinations:=paste(mycols, sep=", ")]

An example I tried using column indices, which does not work:

ncols<-ncol(data)
my_cols <- data[ , c(4:ncols)] 
data[, combinations:=paste(mycols, sep=", ")]

Sample data

id  number  day apple  orange  banana  
1   35      2   red    orange  yellow
2   12      3   red    NA      yellow
3   47      5   NA     orange  yellow

The final result I want to achieve

id  number  day apple  orange  banana  combinations
1   35      2   red    orange  yellow  red, orange, yellow
2   12      3   red    NA      yellow  red, NA, yellow
3   47      5   NA     orange  yellow  NA, orange, yellow

tzdcorbm #1

We may need do.call here:

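library(data.table)
ncols <- ncol(data)  # index of the last column, taken before the new column is added
# .SD holds columns 4..ncols; do.call passes each one to paste() as a separate argument
data[, combinations := do.call(paste, c(.SD, sep = ", ")), .SDcols = 4:ncols]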
  • Output
> data
   id number day apple orange banana        combinations
1:  1     35   2   red orange yellow red, orange, yellow
2:  2     12   3   red   <NA> yellow     red, NA, yellow
3:  3     47   5  <NA> orange yellow  NA, orange, yellow
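
To see why do.call helps here, the following is a minimal base-R sketch of the same idea, without data.table. The data frame `df` and the names `cols` and `combos` are stand-ins chosen for illustration; they are not part of the answer above. A data frame is a list of columns, so do.call hands each column to paste() as its own argument, and paste() then works element-wise, that is, row by row.

```r
# Hypothetical stand-in for the question's sample data (plain data.frame).
df <- data.frame(
  id = 1:3, number = c(35L, 12L, 47L), day = c(2L, 3L, 5L),
  apple = c("red", "red", NA),
  orange = c("orange", NA, "orange"),
  banana = c("yellow", "yellow", "yellow"),
  stringsAsFactors = FALSE
)

cols <- df[4:ncol(df)]  # columns 4..last, selected by position

# Same as paste(cols$apple, cols$orange, cols$banana, sep = ", "),
# but without having to know how many columns there are.
combos <- do.call(paste, c(cols, sep = ", "))
print(combos)
```

Note that paste() turns NA into the string "NA", which matches the result the question asks for.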

Or use unite, which can drop NA elements with na.rm = TRUE:

library(dplyr)
library(tidyr)
data %>% 
  unite(combinations, all_of(4:ncols), sep = ", ", na.rm = TRUE, remove = FALSE)
  • Output
   id number day        combinations apple orange banana
1:  1     35   2 red, orange, yellow   red orange yellow
2:  2     12   3         red, yellow   red   <NA> yellow
3:  3     47   5      orange, yellow  <NA> orange yellow
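
If tidyr is not available, roughly the same NA-dropping result can be sketched in base R. This is only an illustration under assumptions: `df`, `cols`, and `combos_no_na` are hypothetical names, and the loop handles each row separately rather than reproducing how unite works internally.

```r
# Hypothetical stand-in for the question's sample data (plain data.frame).
df <- data.frame(
  id = 1:3, number = c(35L, 12L, 47L), day = c(2L, 3L, 5L),
  apple = c("red", "red", NA),
  orange = c("orange", NA, "orange"),
  banana = c("yellow", "yellow", "yellow"),
  stringsAsFactors = FALSE
)

cols <- df[4:ncol(df)]  # columns 4..last, selected by position

# For each row: collect the values, drop the NAs, then collapse into one string.
combos_no_na <- vapply(seq_len(nrow(cols)), function(i) {
  vals <- unlist(cols[i, ], use.names = FALSE)
  paste(vals[!is.na(vals)], collapse = ", ")
}, character(1))
print(combos_no_na)
```

Because this works one row at a time, it will be slower than the vectorised do.call(paste, ...) approach on large tables.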

Data

data <- structure(list(id = 1:3, number = c(35L, 12L, 47L), day = c(2L, 
3L, 5L), apple = c("red", "red", NA), orange = c("orange", NA, 
"orange"), banana = c("yellow", "yellow", "yellow")), 
class = "data.frame", row.names = c(NA, 
-3L))
setDT(data)

72qzrwbm #2

Using dplyr with rowwise:

library(dplyr)

df %>% 
  rowwise() %>% 
  # paste each row's values from column 4 onward into a single string
  mutate(combinations = paste(c_across(4:ncol(df)), collapse = ", ")) %>% 
  data.frame()
  id number day apple orange banana        combinations
1  1     35   2   red orange yellow red, orange, yellow
2  2     12   3   red   <NA> yellow     red, NA, yellow
3  3     47   5  <NA> orange yellow  NA, orange, yellow
Data
df <- structure(list(id = 1:3, number = c(35L, 12L, 47L), day = c(2L, 
3L, 5L), apple = c("red", "red", NA), orange = c("orange", NA, 
"orange"), banana = c("yellow", "yellow", "yellow")), class = "data.frame", row.names = c(NA, 
-3L))
