长格式多个宽格式 Dataframe

imzjd6km 于 2022-12-27 发布在其他

关注(0)|答案(2)|浏览(135)

在this answer之后，我想知道是否有一种方法可以将宽格式的数据.帧（time1、time2、time1and2）转换为单个长格式的数据.帧，以实现下面的Desired_output？

time1 = read.table(text = "
class id   order ac bc
1     1    s-c   1  2
", header = TRUE)

time2 =read.table(text = "
class id   order ac bc
1     1    s-c   3  4
", header = TRUE)

time1and2 = read.table(text = "
class id   order ex1S ex2S ex1C ex2C  k1   k2   t1  t2
1     1    s-c   8    5    6     1    400  600  30  50
", header = TRUE)

Desired_output = read.table(text = "
class id   order time DV score   k   t  ave_ex
1     1    s-c   1    ac 1      400  30 (8+5)/2 =6.5
1     1    s-c   1    bc 2      400  30 (8+5)/2 =6.5
1     1    s-c   2    ac 3      600  50 (6+1)/2 =3.5
1     1    s-c   2    bc 4      600  50 (6+1)/2 =3.5 
", header = TRUE)

其中ave_ex = ex的平均值。

来源：https://stackoverflow.com/questions/74917144/long-formatting-multiple-wide-format-data-frames

2条答案

按热度按时间

mrzz3bfm1#

首先，将它们设置为长格式：

time1 <- pivot_longer(time1, c(ac, bc), names_to = "DV", values_to = "score") %>% mutate(time = 1) 
time2 <- pivot_longer(time2, c(ac, bc), names_to = "DV", values_to = "score") %>% mutate(time = 2) 
time1and2 <- pivot_longer(time1and2, c(k1:t2), names_to = c(".value", "time"), names_sep = "(\\d)") %>% 
  mutate(time = 1:2)

然后，团结起来：

bind_rows(time1, time2) %>% left_join(time1and2) %>% 
  mutate(ave = case_when(
    time == 1 ~ mean(c(ex1S, ex2S)),
    time == 2 ~ mean(c(ex1C, ex2C))
  )) %>%
  select(-c(ex1S:ex2C))

#Joining, by = c("class", "id", "order", "time")
## A tibble: 4 × 9
#  class    id order DV    score  time     k     t   ave
#  <int> <int> <chr> <chr> <int> <dbl> <int> <int> <dbl>
#1     1     1 s-c   ac        1     1   400    30   6.5
#2     1     1 s-c   bc        2     1   400    30   6.5
#3     1     1 s-c   ac        3     2   600    50   3.5
#4     1     1 s-c   bc        4     2   600    50   3.5

赞(0）回复(0）举报 2022-12-27

yhived7q2#

- 步骤1：**将ac和bc列堆叠到time1和time2中，并按行绑定它们。

library(tidyverse)

df1 <- list(time1, time2) %>%
  map_dfr(~ pivot_longer(.x, ac:bc, names_to = "DV", values_to = "score"), .id = "time")

#   time  class    id order DV    score
#   <chr> <int> <int> <chr> <chr> <int>
# 1 1         1     1 s-c   ac        1
# 2 1         1     1 s-c   bc        2
# 3 2         1     1 s-c   ac        3
# 4 2         1     1 s-c   bc        4

- 步骤2：**在time1and2中，首先需要将ex1S, ex2S, ex1C, ex2C重命名为ex11, ex21, ex12, ex22（这里我使用chartr()来实现），以便可以堆叠列对
(ex11, ex12)至ex1
(ex21, ex22)至ex2
(k1, k2)至k
(t1, t2)至t

同时。

df2 <- time1and2 %>%
  rename_with(~ chartr("SC", "12", .x), starts_with("ex")) %>%
  pivot_longer(
    ex11:t2,
    names_to = c(".value", "time"),
    names_pattern = "(.+)(.)"
  )

# time1and2 (Before)
#   class id order ex1S ex2S ex1C ex2C  k1  k2 t1 t2
# 1     1  1   s-c    8    5    6    1 400 600 30 50
#
# df2 (After)
#   class    id order time    ex1   ex2     k     t
# 1     1     1 s-c   1         8     5   400    30
# 2     1     1 s-c   2         6     1   600    50

- 第3步：**按time、class、id和order将它们合并，并计算ex1和ex2的平均值。

left_join(df1, df2) %>%
  mutate(ave_ex = (ex1 + ex2) / 2)

# # A tibble: 4 × 11
#   time  class    id order DV    score   ex1   ex2     k     t ave_ex
#   <chr> <int> <int> <chr> <chr> <int> <int> <int> <int> <int>  <dbl>
# 1 1         1     1 s-c   ac        1     8     5   400    30    6.5
# 2 1         1     1 s-c   bc        2     8     5   400    30    6.5
# 3 2         1     1 s-c   ac        3     6     1   600    50    3.5
# 4 2         1     1 s-c   bc        4     6     1   600    50    3.5

赞(0）回复(0）举报 2022-12-27

我来回答

长格式多个宽格式 Dataframe

2条答案

相关问题

热门标签

最新问答