How do I remove .x.x.x... and .y.y.y... from the headers when I full-join multiple data frames in R?

Posted by daupos2t on 2023-01-15 in Other

File 1

| Country | Name | Race | A | B | C |
| --- | --- | --- | --- | --- | --- |
| ...... | ...... | ...... | ...... | ...... | ...... |

File 2

| Country | Name | Race | D | E | F |
| --- | --- | --- | --- | --- | --- |
| ...... | ...... | ...... | ...... | ...... | ...... |

File 3

| Country | Name | Race | G | H | I |
| --- | --- | --- | --- | --- | --- |
| ...... | ...... | ...... | ...... | ...... | ...... |

File 4

| Country | Name | Race | J | K | L |
| --- | --- | --- | --- | --- | --- |
| ...... | ...... | ...... | ...... | ...... | ...... |

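Above are some .csv files. I put their names in a variable called file.list and read them with lapply. The goal is to full-join all of the data frames into a single data frame, as in the code below.
library(dplyr)  # full_join, %>%
library(purrr)  # reduce
file.list <- c("file1.csv", "file2.csv", "file3.csv", "file4.csv")
df.list <- lapply(file.list, read.csv)
data <- df.list %>% reduce(full_join, by = c("Country", "Name", "Race"))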
data

| Country | Name | Race | A.x | B.x | C.x | D.y | E.y | F.y | G.x.x | H.x.x | I.x.x | J.y.y | K.y.y | L.y.y |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ...... | ...... | ...... | ...... | ...... | ...... | ...... | ...... | ...... | ...... | ...... | ...... | ...... | ...... | ...... |

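Although the code above does perform the full join, a .x or .y suffix gets appended to the header names.
data <- power_full_join(df.list, by = c("Country", "Name", "Race"))
How can I do the full join so that the headers keep their original names, without the trailing .x.x... and .y.y...?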

Answer 1, by 57hvy0tb

We can use plyr::join_all:

plyr::join_all(df_list, type = "full")
Output:
  Country Name Race  A  B  C  D  E  F  G  H  I
1      Rf  wef  wed  1  1  1 NA NA NA NA NA NA
2      Ew eggw   qw  2  2  2 NA NA NA  4 11  8
3      Gw  wef  wed NA NA NA  3  5  7 NA NA NA
4      Wd eggw   qw NA NA NA  4  6  8 NA NA NA
5      Qp  wef  wed NA NA NA NA NA NA  3 10  7

Data

df_list <- list(structure(list(Country = c("Rf", "Ew"), Name = c("wef", 
"eggw"), Race = c("wed", "qw"), A = 1:2, B = 1:2, C = 1:2), class = "data.frame", row.names = c(NA, 
-2L)), structure(list(Country = c("Gw", "Wd"), Name = c("wef", 
"eggw"), Race = c("wed", "qw"), D = 3:4, E = 5:6, F = 7:8), row.names = c(NA, 
-2L), class = "data.frame"), structure(list(Country = c("Qp", 
"Ew"), Name = c("wef", "eggw"), Race = c("wed", "qw"), G = 3:4, 
    H = 10:11, I = 7:8), row.names = c(NA, -2L), class = "data.frame"))

Answer 2, by nimxete2

full_join() has a suffix argument, which can be set to empty strings to achieve this.

data <- df.list %>% reduce(full_join, by = c("Country", "Name", "Race"), suffix = c("", ""))
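
For context, suffixes are only added to non-key columns whose names occur on both sides of a join; this holds for base merge() as well as for dplyr's joins. The sketch below uses base R and hypothetical toy data (the data frames `a` and `b` and the single key `Country` are assumptions, not the asker's files) to reproduce the suffixes and to list which column names clash, which may help when deciding whether those columns belong in `by`.

```r
# Hypothetical toy data: "A" is a non-key column present in both data frames
a <- data.frame(Country = c("Rf", "Ew"), A = 1:2)
b <- data.frame(Country = c("Ew", "Gw"), A = 3:4)

# Joining on Country only leaves two "A" columns, so merge() adds suffixes
joined <- merge(a, b, by = "Country", all = TRUE)
names(joined)

# List non-key column names that occur in more than one data frame
keys <- "Country"
dfs <- list(a, b)
non_key <- unlist(lapply(dfs, function(d) setdiff(names(d), keys)))
clashing <- unique(non_key[duplicated(non_key)])
clashing
```

If `clashing` comes back empty for the real data, no suffixes should appear in the first place.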

Answer 3, by cvxl0en2

A base R alternative for joining a list of data frames.

Example
df_list
[[1]]
  Country Name Race A B C
1      Rf  wef  wed 1 1 1
2      Ew eggw   qw 2 2 2

[[2]]
  Country Name Race D E F
1      Gw  wef  wed 3 5 7
2      Wd eggw   qw 4 6 8

[[3]]
  Country Name Race G  H I
1      Qp  wef  wed 3 10 7
2      Ew eggw   qw 4 11 8
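
Function
join_list <- function(x, ax = TRUE, ay = FALSE) {
  # Merge the first two data frames on their shared columns
  dff <- merge(x[[1]], x[[2]], all.x = ax, all.y = ay)
  # Fold any remaining data frames into the running result
  if (length(x) > 2) {
    for (i in 3:length(x)) {
      dff <- merge(dff, x[[i]], all.x = ax, all.y = ay)
    }
  }
  dff
}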

Usage

join_list(df_list, ax = TRUE, ay = TRUE)
  Country Name Race  A  B  C  D  E  F  G  H  I
1      Ew eggw   qw  2  2  2 NA NA NA  4 11  8
2      Gw  wef  wed NA NA NA  3  5  7 NA NA NA
3      Qp  wef  wed NA NA NA NA NA NA  3 10  7
4      Rf  wef  wed  1  1  1 NA NA NA NA NA NA
5      Wd eggw   qw NA NA NA  4  6  8 NA NA NA

Data
df_list <- list(structure(list(Country = c("Rf", "Ew"), Name = c("wef",
"eggw"), Race = c("wed", "qw"), A = 1:2, B = 1:2, C = 1:2), class = "data.frame", row.names = c(NA,
-2L)), structure(list(Country = c("Gw", "Wd"), Name = c("wef",
"eggw"), Race = c("wed", "qw"), D = 3:4, E = 5:6, F = 7:8), row.names = c(NA,
-2L), class = "data.frame"), structure(list(Country = c("Qp",
"Ew"), Name = c("wef", "eggw"), Race = c("wed", "qw"), G = 3:4,
    H = 10:11, I = 7:8), row.names = c(NA, -2L), class = "data.frame"))
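
The loop in join_list can also be written as a fold with base R's Reduce(). This is only a sketch of that idea, not part of the original answer: it rebuilds the same sample data with data.frame() and passes the key columns explicitly through a `keys` vector (a name introduced here for illustration).

```r
# Same sample data as in the answer above, rebuilt with data.frame()
df_list <- list(
  data.frame(Country = c("Rf", "Ew"), Name = c("wef", "eggw"),
             Race = c("wed", "qw"), A = 1:2, B = 1:2, C = 1:2),
  data.frame(Country = c("Gw", "Wd"), Name = c("wef", "eggw"),
             Race = c("wed", "qw"), D = 3:4, E = 5:6, F = 7:8),
  data.frame(Country = c("Qp", "Ew"), Name = c("wef", "eggw"),
             Race = c("wed", "qw"), G = 3:4, H = 10:11, I = 7:8)
)

keys <- c("Country", "Name", "Race")

# Fold the list pairwise; all = TRUE keeps unmatched rows from both sides
full <- Reduce(function(a, b) merge(a, b, by = keys, all = TRUE), df_list)
names(full)
nrow(full)
```

Passing `by` explicitly keeps the join keys fixed even if later data frames happen to share other column names.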
