R: Get row columns by group for geom_col in ggplot

Posted by vlurs2pr on 2023-02-10 in Other


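I am trying to calculate row percentages by demographics for different score levels. In my data, that would be the % of white people (or % of black people, or % of males, or % of those with education level 2, etc.) who score 0 (or 1, 2, or 3). I then want to use those percentages to build a "big picture" plot.
So in my sample data below, 8.33% of those with race == 1 (i.e. white) have a score of 0, 25% have a score of 1, 25% have a score of 2, and 41.67% have a score of 3.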
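As a quick check of these numbers, here is a minimal base R sketch. It copies only the race and score values from the sample data further down; `row_pct` is a name made up for this illustration, and it shows just one of several ways to get row percentages.

```r
# race and score values copied from the sample data in the question
race  <- rep(c(1, 1, 2, 2, 3, 3), times = 6)
score <- c(1, 2, 2, 1, 2, 3, 3, 2, 0, 0, 1, 2, 1, 3, 0, 0, 3, 3,
           3, 3, 3, 3, 3, 3, 2, 1, 2, 3, 1, 3, 3, 0, 1, 2, 2, 0)

# margin = 1 divides each cell by its row total, so every race row sums to 100
row_pct <- round(100 * prop.table(table(race, score), margin = 1), 2)

row_pct["1", ]  # race == 1: 8.33, 25, 25, 41.67 for scores 0, 1, 2, 3
```

The 12 respondents with race == 1 have 1, 3, 3 and 5 people at scores 0 to 3, which gives 1/12, 3/12, 3/12 and 5/12.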
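The end goal is then a bar chart of some kind, with the 4 levels of 'score' running across the x-axis and the various demographic comparisons running down the y-axis. Something that looks like this visual, but with the levels of 'score' across the top instead of education levels:

[image not available]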
I already have some code to generate the actual figure, which I have used in other examples with external/already-calculated percentages:

ggplot(data, aes(x = percent, y = category, fill = group)) +
  geom_col(orientation = "y", width = .9) +
  facet_grid(group~score_var, 
             scales = "free_y", space = "free_y") +
  labs(title = "Demographic breakdown of 'Score'") +
  theme_bw()

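I am struggling to figure out the best way to calculate these row percentages, presumably using group_by() and summarize(), and then to store or arrange them in a form that can be plotted.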

d <- structure(list(race = c(1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 
1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 
3, 3), gender = c(0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 
0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1
), education = c(1, 3, 3, 2, 1, 3, 2, 3, 4, 4, 2, 3, 3, 2, 3, 
4, 1, 3, 1, 3, 3, 2, 1, 3, 2, 3, 4, 4, 2, 3, 3, 2, 3, 4, 1, 3
), score = c(1, 2, 2, 1, 2, 3, 3, 2, 0, 0, 1, 2, 1, 3, 0, 0, 
3, 3, 3, 3, 3, 3, 3, 3, 2, 1, 2, 3, 1, 3, 3, 0, 1, 2, 2, 0)), row.names = c(NA, 
-36L), class = c("tbl_df", "tbl", "data.frame"))
# The readr "spec" and "problems" attributes from the original dput() output are
# dropped here: the external pointer in "problems" cannot be parsed by R.

Source: https://stackoverflow.com/questions/75367230/get-row-columns-by-group-for-geom-col-in-ggplot

2 Answers


flseospp #1

This may help you get started:

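library(dplyr)
library(ggplot2)

# `d` is the sample data from the question.
# Row percentages: share of each race group that falls at each score level.
prop <- d %>% 
    mutate(race = factor(race, levels = c(1, 2, 3), labels = c("White", "Black", "Others"))) %>% 
    group_by(race) %>% 
    mutate(race_n  = n()) %>%            # group size, used as the denominator
    group_by(race, score) %>% 
    summarise(percent = round(100*n()/race_n[1], 1), .groups = "drop")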

prop %>% 
    ggplot(aes(x = percent, y = score, fill = race)) +
    geom_col(orientation = "y", width = .9) +
    geom_text(aes(label = percent), hjust = 1)+
    facet_grid(~race) +
    labs(title = "Demographic breakdown of 'Score'") +
    theme_bw()
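The same row percentages can also be obtained by counting first and then dividing by the group total. The sketch below is not the answerer's code; `d_small` and `prop_alt` are names made up for this illustration, and `d_small` holds only the race and score values copied from the question's sample data.

```r
library(dplyr)

# race and score values copied from the question's sample data
d_small <- data.frame(
  race  = rep(c(1, 1, 2, 2, 3, 3), times = 6),
  score = c(1, 2, 2, 1, 2, 3, 3, 2, 0, 0, 1, 2, 1, 3, 0, 0, 3, 3,
            3, 3, 3, 3, 3, 3, 2, 1, 2, 3, 1, 3, 3, 0, 1, 2, 2, 0)
)

prop_alt <- d_small %>%
  count(race, score) %>%                              # one row per race x score, count in n
  group_by(race) %>%
  mutate(percent = round(100 * n / sum(n), 1)) %>%    # divide by the race total
  ungroup()

# Sanity check: each race's percentages should add up to about 100
prop_alt %>% group_by(race) %>% summarise(total = sum(percent))
```

Because `percent` is rounded to one decimal, the totals can differ from 100 by a small rounding error.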

Edit

Putting all the characteristics together gives a bigger graph:

library(tidyr)  # pivot_longer() comes from tidyr

# `d` is the sample data from the question
df <- d %>% mutate(
        gender = factor(2-gender), 
        race = factor(race), 
        education = factor(education)) %>%
    pivot_longer(!score, names_to = "character", values_to = "levels")

df %>% group_by(character, levels) %>% 
    mutate(group_n  = n()) %>% 
    group_by(character, levels, score) %>% 
    summarise(percent = round(100*n()/group_n[1], 1)) %>% 
    ggplot(aes(x = percent, y = score, fill = character)) +
    geom_col(orientation = "y", width = .9) +
    geom_text(aes(label = percent), hjust = 1)+
    facet_grid(character ~ levels) +
    labs(title = "Demographic breakdown of 'Score'") +
    theme_bw()

Please note: I have recoded gender (2 - gender, so 0 becomes 2 and 1 stays 1).
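To see what the recode and the reshaping step do, here is a small sketch on two made-up respondents. `toy` and `toy_long` are hypothetical names used only for this illustration; the values are not taken from the question's data.

```r
library(dplyr)
library(tidyr)

# Two made-up respondents, for illustration only
toy <- data.frame(race = c(1, 2), gender = c(0, 1),
                  education = c(3, 4), score = c(2, 0))

toy_long <- toy %>%
  mutate(gender = factor(2 - gender),       # 0 -> "2", 1 -> "1"
         race = factor(race),
         education = factor(education)) %>%
  # every column except score becomes a (character, levels) pair
  pivot_longer(!score, names_to = "character", values_to = "levels")

toy_long  # 6 rows: 2 respondents x 3 demographic variables
```

Each respondent now appears once per demographic variable, which is what lets a single group_by(character, levels) compute the percentages for race, gender and education in one pass.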

2023-02-10

db2dz4w8 #2

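Taking inspiration from @王志强's excellent first pass, I finally came up with a solution. I still need to change the order of the labels (put the education levels in order and move the race variable to the top of the plot), but this is basically what I had envisioned.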

library(dplyr)
library(tidyr)    # pivot_longer()
library(ggplot2)

d_test <- d %>% mutate(
        gender = factor(2-gender), 
        race = factor(race), 
        education = factor(education)) %>%
    pivot_longer(!score, names_to = "group", values_to = "levels")

d_test <- d_test %>% group_by(group, levels) %>% 
    mutate(group_n  = n()) %>% 
    group_by(group, levels, score) %>% 
    summarise(percent = round(100*n()/group_n[1], 1))

d_test <- d_test %>% 
  mutate(var = case_when(group == "gender" & levels == 1 ~ "female",
                         group == "gender" & levels == 2 ~ "male",
                         group == "race" & levels == 1 ~ "white",
                         group == "race" & levels == 2 ~ "black",
                         group == "race" & levels == 3 ~ "hispanic",
                         group == "education" & levels == 1 ~ "dropout HS",
                         group == "education" & levels == 2 ~ "grad HS",
                         group == "education" & levels == 3 ~ "some coll",
                         group == "education" & levels == 4 ~ "grad coll"))

ggplot(d_test, aes(x = percent, y = var, fill = group)) +
  geom_col(orientation = "y", width = .9) +
  facet_grid(group ~ score,
               scales = "free_y", space = "free_y") +
  labs(title = "Demographic breakdown of 'Score'",
         y = "",
         x = "Percent") +
  theme_minimal() +
  theme(legend.position = "none",
        strip.text.y = element_blank())
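For the remaining reordering, one possible approach is to turn `group` and `var` into factors with explicit level orders before plotting. This is only a sketch: `labels_df` is a hypothetical stand-in for a few rows of `d_test`, and the orders in `group_order` and `var_order` are assumptions about the desired layout.

```r
# Hypothetical stand-in for a few rows of d_test
labels_df <- data.frame(
  group = c("education", "education", "gender", "race"),
  var   = c("grad HS", "dropout HS", "female", "white")
)

# Assumed top-to-bottom order of the facet rows and of the y-axis labels
group_order <- c("race", "gender", "education")
var_order   <- c("white", "black", "hispanic", "female", "male",
                 "dropout HS", "grad HS", "some coll", "grad coll")

# facet_grid() lays out row facets in factor-level order, first level on top
labels_df$group <- factor(labels_df$group, levels = group_order)

# A discrete y axis in ggplot2 puts the first level at the bottom, so reverse
labels_df$var <- factor(labels_df$var, levels = rev(var_order))

levels(labels_df$group)
```

Applying the same two factor() calls to `d_test` before the ggplot() call should put race in the top facet row and list the education levels in order from top to bottom.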

2023-02-10
