R: How to ungroup a data frame to get a pivot table that shows a grand total

kyxcudwk  posted on 2023-04-27  in  Other

I have a dataset that looks like this:

print(as_tibble(working_df), n = 3)

  ride_id          bike_type   started_at          ended_at            start_sta_name                    start_sta_id
  <chr>            <chr>       <dttm>              <dttm>              <chr>                             <chr>       
1 23697816035F9A8F docked_bike 2022-03-05 19:08:00 2022-03-29 15:43:00 Sheffield Ave & Fullerton Ave     TA1306000016
2 DC510E6F98003A94 docked_bike 2022-07-04 18:37:00 2022-07-27 00:32:00 Dusable Lake Shore Dr & Monroe St 13300       
3 578BA30BA1348F18 docked_bike 2022-01-01 01:00:00 2022-01-21 08:51:00 Millennium Park                   13008       
  end_sta_name                    end_sta_id                          start_lat start_lng end_lat end_lng user_type ride_length
  <chr>                           <chr>                                   <dbl>     <dbl>   <dbl>   <dbl> <chr>           <dbl>
1 Base - 2132 W Hubbard Warehouse Hubbard Bike-checking (LBS-WH-TEST)      41.9     -87.7    41.9   -87.7 Casual          34355
2 Green St & Randolph St*         chargingstx3                             41.9     -87.6    41.9   -87.6 Casual          32035
3 Fairfield Ave & Roosevelt Rd    KA1504000102                             41.9     -87.6    41.9   -87.7 Casual          29271
  day_of_week
  <ord>      
1 Saturday   
2 Monday     
3 Saturday

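I want to create a pivot table that gives the average ride length for each user group as well as the average ride length across all users (the "grand total"). I built this pivot table in SQL with the following code: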

SELECT 
    COALESCE(user_type, 'combined') AS user_type,
    ROUND(AVG(TIMESTAMPDIFF(MINUTE, started_at, ended_at)), 2) AS avg_ride_length_min
FROM 
    bikes.work
GROUP BY
    user_type WITH ROLLUP;

What is the best way to do this in R? I tried the code below, but it doesn't work:

working_df %>% 
  select(user_type, ride_length) %>% 
  group_by(user_type) %>% 
  summarize(avg_ride_length = mean(ride_length)) %>% 
  ungroup() %>% 
  mutate(mean(ride_length))

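My research suggested that rpivotTable::rpivotTable() would work, so I installed the htmlwidgets, knitr, and rpivotTable packages. I tested the function on the Titanic dataset and it worked fine, but when I passed my own data frame to it, the RStudio Viewer pane turned gray. Perhaps my data frame is too large for it to handle. Is there an alternative package? Thank you very much for any suggestions.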

pcww981p (answer #1)

Here are the two different solutions from my comments:

library(dplyr)

# Reproducible toy data: user_type is recycled to A, Z, A, Z, ...
set.seed(42)
working_df <- data.frame(user_type = c("A", "Z"),
           ride_length = rpois(10, 10))

# Solution 1: summarize per group, summarize everything as "combined",
# then stack the two results
bind_rows(
  working_df %>% group_by(user_type) %>%
    summarize(avg_ride_length_min = mean(ride_length)),
  working_df %>% group_by(user_type = "combined") %>%
    summarize(avg_ride_length_min = mean(ride_length)))

# Solution 2: append a copy of the data relabeled "combined",
# then summarize once
bind_rows(working_df, working_df %>% mutate(user_type = "combined")) %>%
  # this is to put "combined" last
  mutate(user_type = factor(user_type) %>%
           forcats::fct_relevel("combined", after = Inf)) %>%
  # .by introduced in dplyr 1.1.0
  summarize(avg_ride_length_min = mean(ride_length), .by = "user_type")

  user_type avg_ride_length_min
1         A                13.2
2         Z                 9.4
3  combined                11.3
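
As for why the attempt in the question fails: after summarize(), the result most likely contains only user_type and avg_ride_length, so the final mutate(mean(ride_length)) has no ride_length column left to work with. Averaging the per-group means instead would also be misleading, because it ignores group sizes, whereas the WITH ROLLUP row averages over all rides. The sketch below illustrates both points; the toy data frame and its values are made up for illustration and are not the asker's data.

```r
library(dplyr)

# Hypothetical toy data with unequal group sizes (not the asker's data)
toy <- data.frame(
  user_type   = c("Casual", "Casual", "Casual", "Member"),
  ride_length = c(10, 20, 30, 100)
)

per_group <- toy %>%
  group_by(user_type) %>%
  summarize(avg_ride_length = mean(ride_length))

# After summarize(), ride_length is gone; only the grouping column
# and the new summary column remain
names(per_group)

# Mean of the group means (unweighted): (20 + 100) / 2 = 60
mean(per_group$avg_ride_length)

# Overall mean of all rides, which is what the rollup row reports: 160 / 4 = 40
mean(toy$ride_length)
```

This is why both solutions above compute the "combined" row from the raw rides rather than from the already summarized table.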
