Need to join two dataframes in R and combine data

qrjkbowd posted 11 months ago in Other


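I have this problem: I have two different dataframes for two bird count points on different dates. I need to merge the dataframes, and the column names in the two dataframes do not match. My dataframes look like this:
DF1
| Especies | Conteo |
| --|--|
| (Crypturellus tataupa) | 1 |
| (Piaya cayana) | 2 |
DF2
| Species | Count |
| --|--|
| (Crypturellus tataupa) | 3 |
| (Celeus flavescens) | 1 |
I have already tried the full range of merge options, and none of them gives what I need.
I need two different outputs:

A full merge, where I end up with one total count for each species
| Species | Count |
| --|--|
| (Crypturellus tataupa) | 4 |
| (Piaya cayana) | 2 |
| (Celeus flavescens) | 1 |
A merge of the species with a separate column for each count
| Especies | Count1 | Count2 |
| --|--|--|
| (Crypturellus tataupa) | 1 | 3 |
| (Piaya cayana) | 2 | 0 |
| (Celeus flavescens) | 0 | 1 |
As you may have guessed, I don't have much experience with R. Thank you in advance.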

Source: https://stackoverflow.com/questions/77389130/need-to-join-two-dataframes-in-r-and-combine-data

3 Answers


q7solyqu #1

You can try the code below to obtain out1 and out2 respectively.

# initial merged output
out <- merge(
  df1,
  df2,
  by.x = "Especies",
  by.y = "Species",
  all = TRUE
)

# first output
out1 <- transform(
  out,
  Count = rowSums(out[-1], na.rm = TRUE)
)[-2]

# second output
out2 <- cbind(
  out[1],
  setNames(
    replace(out[-1], is.na(out[-1]), 0),
    paste0("Count", seq_along(out[-1]))
  )
)

which gives

> out1
                Especies Count
1    (Celeus flavescens)     1
2 (Crypturellus tataupa)     4
3         (Piaya cayana)     2

> out2
                Especies Count1 Count2
1    (Celeus flavescens)      0      1
2 (Crypturellus tataupa)      1      3
3         (Piaya cayana)      2      0


Data

> dput(df1)
structure(list(Especies = c("(Crypturellus tataupa)", "(Piaya cayana)"
), Conteo = 1:2), class = "data.frame", row.names = c(NA, -2L
))

> dput(df2)
structure(list(Species = c("(Crypturellus tataupa)", "(Celeus flavescens)"
), Count = c(3L, 1L)), class = "data.frame", row.names = c(NA,
-2L))
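For anyone tracing how both outputs come from the same merge, here is a minimal self-contained sketch that rebuilds the sample data shown in the `dput` output and prints the intermediate result. With `all = TRUE`, `merge` keeps species that appear in only one data frame and fills the missing count with `NA`, which is why the later steps rely on `na.rm = TRUE` and `replace`. The variable name `total` is only illustrative and is not part of the answer's code.

```r
# Sample data, matching the dput() output above
df1 <- data.frame(
  Especies = c("(Crypturellus tataupa)", "(Piaya cayana)"),
  Conteo = 1:2
)
df2 <- data.frame(
  Species = c("(Crypturellus tataupa)", "(Celeus flavescens)"),
  Count = c(3L, 1L)
)

# Full outer join on the differently named key columns;
# species missing from one side get NA in that side's count column
out <- merge(df1, df2, by.x = "Especies", by.y = "Species", all = TRUE)
print(out)

# Row-wise total over both count columns, ignoring the NA values
total <- rowSums(out[-1], na.rm = TRUE)
```

Inspecting `out` before transforming it makes it easier to see where the zeros in the second output come from.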


mum43rcc #2

As a first step, I suggest renaming the variables in one of the data frames to match the other.

df1

structure(list(Especies = c("(Crypturellus tataupa)", "(Piaya cayana)"
), Conteo = c(1, 2)), class = "data.frame", row.names = c(NA, 
-2L))


df2

structure(list(Species = c("(Crypturellus tataupa)", "(Celeus flavescens)"
), Count = c(3, 1)), class = "data.frame", row.names = c(NA, 
-2L))

library(dplyr)

df1 <- df1 %>%
  rename(Species = Especies,
         Count = Conteo)

Output 1

library(dplyr)
library(tidyr)

bind_rows(df1, df2) %>%
  summarize(Count = sum(Count),
            .by = Species)
#>                  Species Count
#> 1 (Crypturellus tataupa)     4
#> 2         (Piaya cayana)     2
#> 3    (Celeus flavescens)     1
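As far as I know, the `.by` argument is only available from dplyr 1.1.0 onward. On an older installation, a rough equivalent uses `group_by()` instead. This sketch assumes `df1` has already been renamed as described above; the name `totals` is only illustrative.

```r
library(dplyr)

# Sample data with matching column names (df1 already renamed)
df1 <- data.frame(
  Species = c("(Crypturellus tataupa)", "(Piaya cayana)"),
  Count = c(1, 2)
)
df2 <- data.frame(
  Species = c("(Crypturellus tataupa)", "(Celeus flavescens)"),
  Count = c(3, 1)
)

# Stack the rows of both data frames, then total the counts per species
totals <- bind_rows(df1, df2) %>%
  group_by(Species) %>%
  summarise(Count = sum(Count), .groups = "drop")
```

One difference to be aware of: `group_by()` returns the groups sorted by `Species`, whereas `.by` keeps them in order of first appearance, so the row order may differ from the output shown above.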


Output 2

full_join(df1, df2, join_by(Species)) %>%
  mutate(across(starts_with('Count'), ~replace_na(., 0)))
#>                  Species Count.x Count.y
#> 1 (Crypturellus tataupa)       1       3
#> 2         (Piaya cayana)       2       0
#> 3    (Celeus flavescens)       0       1
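If you want the columns to be called Count1 and Count2 as in the question, rather than Count.x and Count.y, one option is the `suffix` argument of `full_join`. A minimal sketch, again assuming `df1` has already been renamed; the name `wide` is only illustrative.

```r
library(dplyr)
library(tidyr)

# Sample data with matching column names (df1 already renamed)
df1 <- data.frame(
  Species = c("(Crypturellus tataupa)", "(Piaya cayana)"),
  Count = c(1, 2)
)
df2 <- data.frame(
  Species = c("(Crypturellus tataupa)", "(Celeus flavescens)"),
  Count = c(3, 1)
)

# suffix is appended to the clashing non-key column names,
# so the two Count columns become Count1 and Count2
wide <- full_join(df1, df2, by = "Species", suffix = c("1", "2")) %>%
  mutate(across(starts_with("Count"), ~ replace_na(.x, 0)))
```

This avoids a separate renaming step after the join.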


pcww981p #3

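I would also suggest keeping the same name for the shared column (species) in both data frames; it is a good habit that helps avoid errors.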

Outputs
First output

I suggest using full_join, renaming the count columns, and replacing the NA values with zero.

library(dplyr)

df <- df1 %>%
  rename(count1 = count) %>%
  full_join(df2 %>% rename(count2 = count), by = "species") %>%
  mutate_all(~ ifelse(is.na(.x), 0, .x))
df
  species count1 count2
1       F     10      5
2       E      8      3
3       B      3      6
4       C      7      0
5       D     10      3
6       A      0      4


Second output

From there it is easy to sum the count1 and count2 columns:

df %>% mutate(count = count1 + count2) %>% select(species, count)
  species count
1       F    15
2       E    11
3       B     9
4       C     7
5       D    13
6       A     4


Data

df1 <- data.frame(species=sample(c("A","B","C","D","E","F"),5,replace=FALSE),count=sample(seq(1,10),5,replace=TRUE))
df2 <- data.frame(species=sample(c("A","B","C","D","E","F"),5,replace=FALSE),count=sample(seq(1,10),5,replace=TRUE))
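Because `sample()` draws at random, the tables printed in this answer will differ from run to run. If you want the example data to be reproducible, one option is to fix the seed first. A minimal sketch: the seed value 42 and the helper name `species_pool` are arbitrary choices for illustration, and this will not reproduce the exact numbers shown above.

```r
# Fix the random number generator state so the data are reproducible
set.seed(42)  # arbitrary seed, chosen only for illustration

species_pool <- c("A", "B", "C", "D", "E", "F")

# Five distinct species per data frame, each with a random count from 1 to 10
df1 <- data.frame(
  species = sample(species_pool, 5, replace = FALSE),
  count = sample(1:10, 5, replace = TRUE)
)
df2 <- data.frame(
  species = sample(species_pool, 5, replace = FALSE),
  count = sample(1:10, 5, replace = TRUE)
)
```

Each data frame leaves out one of the six species, so the join will usually produce at least one NA to replace.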

