Merging two tables in R when the column names differ by an A or B suffix

zazmityj  posted on 2023-04-03  in  Other

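I have two datasets that I am trying to merge. The first contains information about each test subject, keyed by a unique ID. The second contains the measurements for each test subject (one column per subject), but every subject was measured twice, so the unique IDs there read "IDA" and "IDB". I would like to find a way to merge the two tables on the unique ID, regardless of whether a column is measurement A or B.
Here is a small sample of the two datasets, followed by a table of the expected output. Any help would be greatly appreciated!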

UniqueID        Site        State       Age     Height  
Tree001           FK           OR         23    70  
Tree002           FK           OR         45    53  
Tree003           NM           OR         35    84  

UniqueID    Tree001A    Tree001B    Tree002A    Tree002B    Tree003A    Tree003B
1996        4                       2
1997        7           8           7                       3
1998        3           2           9           4           7
1999        11          9           2           12          3           13
2010        8           8           4           6           11          4
2011        10          5           6           3           8           9

UniqueID    Tree001A    Tree001B    Tree002A    Tree002B    Tree003A    Tree003B
Site        FK          FK          FK          FK          NM          NM
State       OR          OR          OR          OR          OR          OR
Age         23          23          45          45          35          35
Height      70          70          53          53          84          84
1996        4                       2
1997        7           8           7                       3
1998        3           2           9           4           7
1999        11          9           2           12          3           13
2010        8           8           4           6           11          4
2011        10          5           6           3           8           9

Answer 1 (wfveoks0)

Here is one way to do it.

df1 <- structure(list(UniqueID = structure(1:3, .Label = c("Tree001", 
"Tree002", "Tree003"), class = "factor"), Site = structure(c(1L, 
1L, 2L), .Label = c("FK", "NM"), class = "factor"), State = structure(c(1L, 
1L, 1L), .Label = "OR", class = "factor"), Age = c(23L, 45L, 
35L), Height = c(70L, 53L, 84L)), .Names = c("UniqueID", "Site", 
"State", "Age", "Height"), class = "data.frame", row.names = c(NA, 
-3L))

df2 <- structure(list(UniqueID = c(1996L, 1997L, 1998L, 1999L, 2010L, 
2011L), Tree001A = c(4L, 7L, 3L, 11L, 8L, 10L), Tree001B = c(NA, 
8L, 2L, 9L, 8L, 5L), Tree002A = c(2L, 7L, 9L, 2L, 4L, 6L), Tree002B = c(NA, 
NA, 4L, 12L, 6L, 3L), Tree003A = c(NA, 3L, 7L, 3L, 11L, 8L), 
    Tree003B = c(NA, NA, NA, 13L, 4L, 9L)), .Names = c("UniqueID", 
"Tree001A", "Tree001B", "Tree002A", "Tree002B", "Tree003A", "Tree003B"
), class = "data.frame", row.names = c(NA, -6L))

> df1
  UniqueID Site State Age Height
1  Tree001   FK    OR  23     70
2  Tree002   FK    OR  45     53
3  Tree003   NM    OR  35     84
> df2
  UniqueID Tree001A Tree001B Tree002A Tree002B Tree003A Tree003B
1     1996        4       NA        2       NA       NA       NA
2     1997        7        8        7       NA        3       NA
3     1998        3        2        9        4        7       NA
4     1999       11        9        2       12        3       13
5     2010        8        8        4        6       11        4
6     2011       10        5        6        3        8        9

# Transpose df1 (without its ID column) so that each tree becomes a column.
# t() yields a character matrix; keep the values as strings, not factors.
df3 <- as.data.frame(t(df1[, -1]), stringsAsFactors = FALSE)

colnames(df3) <- as.character(df1[, 1])

# Move the row names (Site, State, Age, Height) into a UniqueID column
df3$UniqueID <- rownames(df3)

# Reset the row names to 1..4
rownames(df3) <- NULL

# Repeat each tree's column so that every subject has an A and a B column
df3 <- df3[, c(4, 1, 1, 2, 2, 3, 3)]
colnames(df3) <- c("UniqueID", "Tree001A", "Tree001B", "Tree002A",
                   "Tree002B", "Tree003A", "Tree003B")

# Convert every column of df2 to character so it can be stacked under df3
df2[] <- lapply(df2, as.character)

# Now combine the two data frames
new <- rbind(df3, df2)
> new
   UniqueID Tree001A Tree001B Tree002A Tree002B Tree003A Tree003B
1      Site       FK       FK       FK       FK       NM       NM
2     State       OR       OR       OR       OR       OR       OR
3       Age       23       23       45       45       35       35
4    Height       70       70       53       53       84       84
5      1996        4     <NA>        2     <NA>     <NA>     <NA>
6      1997        7        8        7     <NA>        3     <NA>
7      1998        3        2        9        4        7     <NA>
8      1999       11        9        2       12        3       13
9      2010        8        8        4        6       11        4
10     2011       10        5        6        3        8        9
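
The steps above hard-code the column positions (4, 1, 1, 2, 2, 3, 3) and the new column names, so they only fit this three-tree sample. As a rough sketch of how the same idea might be generalized, the measurement columns can be matched to their subject by stripping the trailing A or B from each column name. The helper name `stack_metadata` and its `id_col` argument are made up for illustration, and the sketch assumes every measurement column is named as a subject ID followed by a single `A` or `B`.

```r
# Sample data from the question, rebuilt with data.frame() for brevity
df1 <- data.frame(
  UniqueID = c("Tree001", "Tree002", "Tree003"),
  Site     = c("FK", "FK", "NM"),
  State    = c("OR", "OR", "OR"),
  Age      = c(23L, 45L, 35L),
  Height   = c(70L, 53L, 84L),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  UniqueID = c(1996L, 1997L, 1998L, 1999L, 2010L, 2011L),
  Tree001A = c(4L, 7L, 3L, 11L, 8L, 10L),
  Tree001B = c(NA, 8L, 2L, 9L, 8L, 5L),
  Tree002A = c(2L, 7L, 9L, 2L, 4L, 6L),
  Tree002B = c(NA, NA, 4L, 12L, 6L, 3L),
  Tree003A = c(NA, 3L, 7L, 3L, 11L, 8L),
  Tree003B = c(NA, NA, NA, 13L, 4L, 9L)
)

# Hypothetical helper (not part of the original answer): stack the
# per-subject attributes from `meta` on top of the wide table `wide`.
stack_metadata <- function(meta, wide, id_col = "UniqueID") {
  meas_cols <- setdiff(names(wide), id_col)
  attrs     <- setdiff(names(meta), id_col)

  # Assumption: dropping a trailing "A" or "B" recovers the subject ID
  subject <- sub("[AB]$", "", meas_cols)
  idx <- match(subject, as.character(meta[[id_col]]))
  if (anyNA(idx)) {
    stop("no metadata for: ", paste(meas_cols[is.na(idx)], collapse = ", "))
  }

  # One row per attribute; the ID column holds the attribute names
  top <- data.frame(attrs, stringsAsFactors = FALSE)
  names(top) <- id_col
  for (j in seq_along(meas_cols)) {
    top[[meas_cols[j]]] <- vapply(
      attrs,
      function(a) as.character(meta[[a]][idx[j]]),
      character(1),
      USE.NAMES = FALSE
    )
  }

  # Measurements become character too, so both parts can share columns
  bottom <- wide[c(id_col, meas_cols)]
  bottom[] <- lapply(bottom, as.character)

  rbind(top, bottom)
}

result <- stack_metadata(df1, df2)
```

With the sample data, `result` should have the same ten rows and seven columns as `new` above. Note that stacking text attributes such as Site over numeric measurements forces every column to character, so the measurement rows need `as.numeric()` again before any arithmetic.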
