长格式多个宽格式 Dataframe

imzjd6km  于 2022-12-27  发布在  其他
关注(0)|答案(2)|浏览(135)

this answer之后,我想知道是否有一种方法可以将宽格式的数据.帧(time1time2time1and2)转换为单个长格式的数据.帧,以实现下面的Desired_output

time1 = read.table(text = "
class id   order ac bc
1     1    s-c   1  2
", header = TRUE)

time2 =read.table(text = "
class id   order ac bc
1     1    s-c   3  4
", header = TRUE)

time1and2 = read.table(text = "
class id   order ex1S ex2S ex1C ex2C  k1   k2   t1  t2
1     1    s-c   8    5    6     1    400  600  30  50
", header = TRUE)
Desired_output = read.table(text = "
class id   order time DV score   k   t  ave_ex
1     1    s-c   1    ac 1      400  30 (8+5)/2 =6.5
1     1    s-c   1    bc 2      400  30 (8+5)/2 =6.5
1     1    s-c   2    ac 3      600  50 (6+1)/2 =3.5
1     1    s-c   2    bc 4      600  50 (6+1)/2 =3.5 
", header = TRUE)

其中ave_ex = ex的平均值。

mrzz3bfm

mrzz3bfm1#

首先,将它们设置为长格式:

time1 <- pivot_longer(time1, c(ac, bc), names_to = "DV", values_to = "score") %>% mutate(time = 1) 
time2 <- pivot_longer(time2, c(ac, bc), names_to = "DV", values_to = "score") %>% mutate(time = 2) 
time1and2 <- pivot_longer(time1and2, c(k1:t2), names_to = c(".value", "time"), names_sep = "(\\d)") %>% 
  mutate(time = 1:2)

然后,团结起来:

bind_rows(time1, time2) %>% left_join(time1and2) %>% 
  mutate(ave = case_when(
    time == 1 ~ mean(c(ex1S, ex2S)),
    time == 2 ~ mean(c(ex1C, ex2C))
  )) %>%
  select(-c(ex1S:ex2C))

#Joining, by = c("class", "id", "order", "time")
## A tibble: 4 × 9
#  class    id order DV    score  time     k     t   ave
#  <int> <int> <chr> <chr> <int> <dbl> <int> <int> <dbl>
#1     1     1 s-c   ac        1     1   400    30   6.5
#2     1     1 s-c   bc        2     1   400    30   6.5
#3     1     1 s-c   ac        3     2   600    50   3.5
#4     1     1 s-c   bc        4     2   600    50   3.5
yhived7q

yhived7q2#

    • 步骤1:**将acbc列堆叠到time1time2中,并按行绑定它们。
library(tidyverse)

df1 <- list(time1, time2) %>%
  map_dfr(~ pivot_longer(.x, ac:bc, names_to = "DV", values_to = "score"), .id = "time")

#   time  class    id order DV    score
#   <chr> <int> <int> <chr> <chr> <int>
# 1 1         1     1 s-c   ac        1
# 2 1         1     1 s-c   bc        2
# 3 2         1     1 s-c   ac        3
# 4 2         1     1 s-c   bc        4
    • 步骤2:**在time1and2中,首先需要将ex1S, ex2S, ex1C, ex2C重命名为ex11, ex21, ex12, ex22(这里我使用chartr()来实现),以便可以堆叠列对
  • (ex11, ex12)ex1
  • (ex21, ex22)ex2
  • (k1, k2)k
  • (t1, t2)t

同时。

df2 <- time1and2 %>%
  rename_with(~ chartr("SC", "12", .x), starts_with("ex")) %>%
  pivot_longer(
    ex11:t2,
    names_to = c(".value", "time"),
    names_pattern = "(.+)(.)"
  )

# time1and2 (Before)
#   class id order ex1S ex2S ex1C ex2C  k1  k2 t1 t2
# 1     1  1   s-c    8    5    6    1 400 600 30 50
#
# df2 (After)
#   class    id order time    ex1   ex2     k     t
# 1     1     1 s-c   1         8     5   400    30
# 2     1     1 s-c   2         6     1   600    50
    • 第3步:**按timeclassidorder将它们合并,并计算ex1ex2的平均值。
left_join(df1, df2) %>%
  mutate(ave_ex = (ex1 + ex2) / 2)

# # A tibble: 4 × 11
#   time  class    id order DV    score   ex1   ex2     k     t ave_ex
#   <chr> <int> <int> <chr> <chr> <int> <int> <int> <int> <int>  <dbl>
# 1 1         1     1 s-c   ac        1     8     5   400    30    6.5
# 2 1         1     1 s-c   bc        2     8     5   400    30    6.5
# 3 2         1     1 s-c   ac        3     6     1   600    50    3.5
# 4 2         1     1 s-c   bc        4     6     1   600    50    3.5

相关问题