R: How do I pivot a single row into a multi-variable table?

yrdbyhpb  posted on 2023-04-03 in Other

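I've never been good with pivot tables, so I'm not sure my terminology is accurate, but here goes:
Some other StackOverflow users helped me write code that calculates a large number of risk ratios for many items (in this case, different foods). The code prints these ratios, labelled by food, in one long row, like this: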

# A tibble: 1 × 6
  thurs_brkf_banana thurs_brkf_orange thurs_lun_turkey thurs_lun_cheese thurs_din_spaghetti thurs_din_bread
              <dbl>             <dbl>            <dbl>            <dbl>               <dbl>           <dbl>
1               1.2               3.7              8.4              2.5                 4.5             1.7

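As you can see, this format isn't useful for further analysis, because the day (thurs), the meal (brkf, lun, din), and the food type are all packed into the same variable name. I'd prefer to split them into separate columns, with the risk ratio in a column of its own, like this: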

# A tibble: 6 × 4
  day   meal  food         rr
  <chr> <chr> <chr>     <dbl>
1 thurs brkf  banana      1.2
2 thurs brkf  orange      3.7
3 thurs lun   turkey      8.4
4 thurs lun   cheese      2.5
5 thurs din   spaghetti   4.5
6 thurs din   bread       1.7

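Any help coding this would be appreciated! Until now I've been typing these data sets into Excel by hand, and I don't know how to convert from one layout to the other in R.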

pw136qt2  #1

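Append _rr to the end of each column name and use pivot_longer. The special name ".value" in names_to makes that last piece (rr) the name of the output column that holds the values.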

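library(dplyr)
library(tidyr)
library(stringr)
df1 %>% 
  # append "_rr" so every column name has a fourth "_"-separated piece
  rename_with(~ str_c(.x, "_rr")) %>%
  # ".value" turns that fourth piece ("rr") into the name of the value column
  pivot_longer(cols = everything(), 
   names_to = c("day", "meal", "food", ".value"), names_sep = "_")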
Output:
# A tibble: 6 × 4
  day   meal  food         rr
  <chr> <chr> <chr>     <dbl>
1 thurs brkf  banana      1.2
2 thurs brkf  orange      3.7
3 thurs lun   turkey      8.4
4 thurs lun   cheese      2.5
5 thurs din   spaghetti   4.5
6 thurs din   bread       1.7
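
A possible shortcut, sketched here rather than taken from the answer above: because every column holds the same kind of value, the rename step can be skipped by naming the value column directly with values_to. The data frame below is a plain data.frame copy of df1, built only so the snippet runs on its own.

```r
library(tidyr)

# Same one-row data as df1 in the Data section, rebuilt as a plain data.frame
df1 <- data.frame(
  thurs_brkf_banana = 1.2, thurs_brkf_orange = 3.7,
  thurs_lun_turkey = 8.4, thurs_lun_cheese = 2.5,
  thurs_din_spaghetti = 4.5, thurs_din_bread = 1.7
)

# Split each column name on "_" into three new columns and
# collect the cell values in a single column named rr
long <- pivot_longer(
  df1,
  cols = everything(),
  names_to = c("day", "meal", "food"),
  names_sep = "_",
  values_to = "rr"
)
long
```

This relies on every column name having exactly three "_"-separated pieces, as in the question's data.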

Another option is to do a standard pivot_longer and then split the resulting name column:

# separate_wider_delim() requires tidyr 1.3.0 or later
pivot_longer(df1, cols = everything(), values_to = 'rr') %>%
  separate_wider_delim(name, delim = "_",
    names = c("day", "meal", "food"))
# A tibble: 6 × 4
  day   meal  food         rr
  <chr> <chr> <chr>     <dbl>
1 thurs brkf  banana      1.2
2 thurs brkf  orange      3.7
3 thurs lun   turkey      8.4
4 thurs lun   cheese      2.5
5 thurs din   spaghetti   4.5
6 thurs din   bread       1.7
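
One caveat that may matter for real data: if a food name itself contains an underscore, the name splits into more than three pieces, and separate_wider_delim() raises an error by default. A minimal sketch of one way to handle that, using a hypothetical column thurs_din_garlic_bread that is not in the original data, is to pass too_many = "merge" so the extra pieces are folded into the last column:

```r
library(tidyr)

# Hypothetical data: "garlic_bread" adds a fourth "_"-separated piece
df2 <- data.frame(thurs_din_spaghetti = 4.5, thurs_din_garlic_bread = 1.7)

# Standard pivot first: one row per original column, names in `name`
long2 <- pivot_longer(df2, cols = everything(), values_to = "rr")

# Then split `name`, folding any extra pieces into the last column (food)
long2 <- separate_wider_delim(
  long2, name, delim = "_",
  names = c("day", "meal", "food"),
  too_many = "merge"
)
long2
```

Here food should come out as "spaghetti" and "garlic_bread". This only helps when the extra underscores sit in the last component; an underscore inside the day or meal code would still be split in the wrong place.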

Data

df1 <- structure(list(thurs_brkf_banana = 1.2, thurs_brkf_orange = 3.7, 
    thurs_lun_turkey = 8.4, thurs_lun_cheese = 2.5, thurs_din_spaghetti = 4.5, 
    thurs_din_bread = 1.7), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L))
