Convert a list of data frames with different numbers of columns and rows into a 3-dimensional array in R

Posted by tag5nh1u on 2023-05-04 in Other


df1 <- data.frame(ID = 1:3, test = 2:4, category = 1)
df2 <- data.frame(ID = 1:2, test = 4:5, test1 = 2:3, category = 2)
Mylist <- list(df1, df2)

I tried:

array1 <- array(unlist(Mylist))

The expected result is array1 as shown below:

array1[ , ,1]
     [,1] [,2] [,3]
[1,]    1    2    1
[2,]    2    3    1
[3,]    3    4    1

array1[ , ,2]
     [,1] [,2] [,3] [,4]
[1,]    1    4    2    2
[2,]    2    5    3    2

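However, I don't know how to specify the dimensions of the array, because each matrix/data frame has a different number of columns and rows.
Thanks for your help.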

Source: https://stackoverflow.com/questions/76156050/convert-list-of-dataframes-with-different-col-and-row-numbers-into-an-3-dimensi

2 Answers


y1aodyip1#

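One strategy is to use abind.
1. Determine the maximum number of rows and columns across the list.
2. Pad each matrix/data frame with NA up to that size.
3. Finally, use abind to combine the list into a 3-dimensional array.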

library(abind)

# given list
A1 <- matrix(runif(500), 25, 20)
A2 <- matrix(runif(483), 23, 21)
A3 <- matrix(runif(600), 24, 25)
MyList <- list(A1, A2, A3)

# 1. get max rows and cols:
max_rows <- max(sapply(MyList, nrow))
max_cols <- max(sapply(MyList, ncol))

# 2. pad with NA
MyList <- lapply(MyList, function(x) {
  x <- as.matrix(x)  # no-op for matrices; lets data frames be padded too
  # start from an all-NA matrix of the target size, then copy x into its top-left corner
  padded <- matrix(NA, nrow = max_rows, ncol = max_cols)
  padded[seq_len(nrow(x)), seq_len(ncol(x))] <- x
  padded
})

# 3. get 3-dimensional array
array1 <- abind(MyList, along = 3)
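
The example above works on matrices. For the data frames in the question, a minimal sketch of the same padding idea might look like the following. It assumes every column is numeric, uses base R's array() in place of abind so that it runs without extra packages, and introduces a hypothetical helper pad_to that is not part of the original answer. Columns are aligned by position here, not by name, so test1 from df2 ends up in the same column as category from df1.

```r
# data from the question
df1 <- data.frame(ID = 1:3, test = 2:4, category = 1)
df2 <- data.frame(ID = 1:2, test = 4:5, test1 = 2:3, category = 2)
Mylist <- list(df1, df2)

# target size: the largest row and column counts in the list
max_rows <- max(sapply(Mylist, nrow))
max_cols <- max(sapply(Mylist, ncol))

# hypothetical helper: copy a data frame into the top-left corner of an NA matrix
pad_to <- function(df, nr, nc) {
  m <- as.matrix(df)
  padded <- matrix(NA_real_, nrow = nr, ncol = nc)
  padded[seq_len(nrow(m)), seq_len(ncol(m))] <- m
  padded
}

padded <- lapply(Mylist, pad_to, nr = max_rows, nc = max_cols)

# all slices now have equal size, so they can be stacked along a third dimension
array1 <- array(unlist(padded), dim = c(max_rows, max_cols, length(padded)))
dim(array1)
```

With this input the result has dimensions 3 x 4 x 2; the fourth column of the first slice and the third row of the second slice are NA.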

2023-05-04

31moq8wy2#

Arrays cannot be ragged. If a is the resulting array, then a[,,1] and a[,,2] must have the same dimensions.
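
As a rough illustration of why the attempt in the question does not give a 3-dimensional result, the sketch below counts what array(unlist(Mylist)) actually has to work with. The variable names flat, one_d, n_values, n_dims and n_cells are introduced here for illustration only.

```r
# data from the question
df1 <- data.frame(ID = 1:3, test = 2:4, category = 1)
df2 <- data.frame(ID = 1:2, test = 4:5, test1 = 2:3, category = 2)
Mylist <- list(df1, df2)

# unlist() flattens every column of every data frame into one vector
flat <- unlist(Mylist)
n_values <- length(flat)          # 3*3 values from df1 plus 2*4 from df2

# without a dim argument, array() returns a 1-dimensional array
one_d <- array(flat)
n_dims <- length(dim(one_d))

# a rectangular 3 x 4 x 2 array would need this many cells
n_cells <- prod(c(3, 4, 2))

c(values = n_values, dims = n_dims, cells = n_cells)
```

There are 17 values but 24 cells to fill, so the missing positions have to be padded explicitly (for example with NA) before the slices can share one set of dimensions.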

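1) With this caveat, create the input list L and compute the set of IDs from it. Merge each component of L with IDs and row-bind the results into a single wide-form data frame. Convert that to long form using pivot_longer, then convert it to an array using tapply. This puts the dnn variables into the dimnames rather than into the data. If you prefer to have them in the data, see (2).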

library(dplyr)
library(tidyr)

# inputs
dnn <- c("ID", "test", "category") # desired order of dimensions
L <- list(df1, df2)  # main input

# same as merge except it defaults to all = TRUE
Merge <- function(..., all = TRUE) merge(..., all = all)

IDs <- L %>%
  lapply(function(x) x["ID"]) %>%
  Reduce(Merge, .) 

a <- L %>%
  lapply(Merge, IDs) %>%
  bind_rows %>%
  pivot_longer(!all_of(setdiff(dnn, "test")), names_to = "test") %>%
  with(tapply(value, .[dnn], c))
a

giving:

, , category = 1

   test
ID  test test1
  1    2    NA
  2    3    NA
  3    4    NA

, , category = 2

   test
ID  test test1
  1    4     2
  2    5     3
  3   NA    NA
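
To look at the final tapply step in isolation, here is a small base R sketch. The data frame long is typed in by hand as an assumed stand-in for the long-form table that pivot_longer produces (with NA rows left out); it is not generated by the pipeline above.

```r
# hand-written long-form table: one row per (ID, test, category) combination
long <- data.frame(
  ID       = c(1, 2, 3, 1, 2, 1, 2),
  test     = c("test", "test", "test", "test", "test", "test1", "test1"),
  category = c(1, 1, 1, 2, 2, 2, 2),
  value    = c(2, 3, 4, 4, 5, 2, 3)
)

# tapply() builds one cell per combination of the grouping variables;
# combinations that do not occur in the data become NA
a <- with(long, tapply(value, list(ID = ID, test = test, category = category), c))

dim(a)
# the grouping variables are now dimnames, so cells can be addressed by label
a["2", "test1", "2"]
a["3", "test", "2"]
```

Here a has dimensions 3 x 2 x 2, a["2", "test1", "2"] is 3, and a["3", "test", "2"] is NA because ID 3 does not appear in category 2.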

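2) Another possibility is the following. Its output is closer to the output shown in the question; however, depending on what you want to do, (1) may be preferable. We compute IDs and then the wide-form data frame DF as in (1), and then use abind.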

library(abind)
library(dplyr)

Merge <- function(..., all = TRUE) merge(..., all = all)

L <- list(df1, df2)
IDs <- Reduce(Merge, lapply(L, function(x) x["ID"]))
DF <- bind_rows(lapply(L, Merge, IDs))
nid <- nrow(IDs)
a <- abind(split(DF, rep(1:(nrow(DF) / nid), each = nid)), along = 3)
a

giving:

, , 1

     ID test category test1
[1,]  1    2        1    NA
[2,]  2    3        1    NA
[3,]  3    4        1    NA

, , 2

     ID test category test1
[1,]  1    4        2     2
[2,]  2    5        2     3
[3,]  3   NA       NA    NA
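
If installing abind and dplyr is not an option, a similar name-aligned array can probably be built with base R alone. The following is only a sketch under the assumption that all columns are numeric and that every data frame has an ID column; the names cols and ids are introduced here and do not appear in the answer.

```r
# data from the question
df1 <- data.frame(ID = 1:3, test = 2:4, category = 1)
df2 <- data.frame(ID = 1:2, test = 4:5, test1 = 2:3, category = 2)
L <- list(df1, df2)

# all column names and all IDs that occur anywhere in the list
cols <- Reduce(union, lapply(L, names))
ids  <- sort(Reduce(union, lapply(L, `[[`, "ID")))

# start with an all-NA array: one row per ID, one column per name, one slice per data frame
a <- array(NA_real_,
           dim = c(length(ids), length(cols), length(L)),
           dimnames = list(NULL, cols, NULL))

for (k in seq_along(L)) {
  d <- L[[k]]
  # place each row at the position of its ID and each column under its own name
  a[match(d$ID, ids), names(d), k] <- as.matrix(d)
  # fill the ID column for padded rows too, as the merge does in (2)
  a[, "ID", k] <- ids
}
a
```

For the question's input this yields the same values as the output of (2): columns ID, test, category, test1, with NA wherever a data frame has no value.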

Note

The input, as shown in the question, is:

df1<-data.frame(ID=1:3,test=2:4,category=1)
df2<-data.frame(ID=1:2,test=4:5,test1=2:3,category=2)

Update

Added (1).

2023-05-04
