R: How do I create a 100% horizontal stacked bar chart in ggplot2 that shows counts?

Posted by nhjlsmyf on 2023-02-14 in Other

I am working with survey data on several factors of workplace culture. It is currently a long-format tibble named work_culture_data, which looks like this:

> print(work_culture_data, n = 21)
# A tibble: 140 × 3
   Response_ID Factor                    Level    
         <int> <fct>                     <fct>    
 1           6 Level_support_colleagues  low      
 2           6 Level_support_community   low      
 3           6 Level_career_prospects    low      
 4           6 Level_career_satisfaction high     
 5           6 Level_career_impact       low      
 6           6 Level_collaboration       high     
 7           6 Level_assessment_fairness high     
 8           7 Level_support_colleagues  high     
 9           7 Level_support_community   high     
10           7 Level_career_prospects    very high
11           7 Level_career_satisfaction high     
12           7 Level_career_impact       high     
13           7 Level_collaboration       high     
14           7 Level_assessment_fairness high     
15           8 Level_support_colleagues  high     
16           8 Level_support_community   low      
17           8 Level_career_prospects    very low 
18           8 Level_career_satisfaction high     
19           8 Level_career_impact       high     
20           8 Level_collaboration       low      
21           8 Level_assessment_fairness low      
# … with 119 more rows
# ℹ Use `print(n = ...)` to see more rows

It can be recreated from the following dput() output:

structure(list(Response_ID = c(6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 
11L, 11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 13L, 
13L, 13L, 13L, 13L, 13L, 13L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 
15L, 15L, 15L, 15L, 15L, 15L, 15L, 16L, 16L, 16L, 16L, 16L, 16L, 
16L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 18L, 18L, 18L, 18L, 18L, 
18L, 18L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 20L, 20L, 20L, 20L, 
20L, 20L, 20L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 22L, 22L, 22L, 
22L, 22L, 22L, 22L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 24L, 24L, 
24L, 24L, 24L, 24L, 24L, 25L, 25L, 25L, 25L, 25L, 25L, 25L), 
    Factor = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 
    3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 
    4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 
    5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 
    6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 
    7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 
    1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 
    2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 
    3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 
    4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), levels = c("Level_support_colleagues", 
    "Level_support_community", "Level_career_prospects", "Level_career_satisfaction", 
    "Level_career_impact", "Level_collaboration", "Level_assessment_fairness"
    ), class = "factor"), Level = structure(c(2L, 2L, 2L, 3L, 
    2L, 3L, 3L, 3L, 3L, 4L, 3L, 3L, 3L, 3L, 3L, 2L, 1L, 3L, 3L, 
    2L, 2L, 4L, 3L, 2L, 3L, 3L, 3L, 2L, 4L, 3L, 3L, 4L, 3L, 4L, 
    3L, 2L, 2L, 3L, 3L, 2L, 2L, 2L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 
    3L, 2L, 3L, 3L, 2L, 4L, 2L, 2L, 4L, 2L, 4L, 4L, 1L, 3L, 3L, 
    3L, 3L, 3L, 4L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 2L, 1L, 3L, 3L, 
    1L, 3L, 2L, 4L, 3L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 3L, 3L, 3L, 
    4L, 2L, 2L, 2L, 4L, 3L, 2L, 3L, 3L, 4L, 3L, 3L, 3L, 2L, 3L, 
    3L, 3L, 2L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 3L, 
    3L, 2L, 4L, 4L, 3L, 4L, 3L, 4L, 2L, 3L, 2L, 2L, 3L, 1L, 2L, 
    2L), levels = c("very low", "low", "high", "very high"), class = "factor")), row.names = c(NA, 
-140L), class = c("tbl_df", "tbl", "data.frame"))

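The actual dataset has more than 2,000 rows representing more than 400 responses. The work_culture_data shown here is a subset of 20 survey responses (20 unique Response_ID values), each rating (the Level factor variable, from "very low" to "very high") seven factors (the Factor factor variable). For example, respondent 6 rated their Level_career_prospects as low.
Based on work_culture_data, I would like to use ggplot2 to create a 100% stacked bar chart with the following features: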

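  1. Factors are renamed in the final chart, e.g. from Level_career_prospects to "Career Prospects". This will be the vertical axis.
  2. The stacked bars are horizontal, and I can specify the colours of their components.
  3. There are seven stacked bars in total, one for each Factor.
  4. Each stacked bar is made up of the *proportion* of respondents who chose each Level, ordered from "very low" to "very high" (four levels in total). Each segment of a stacked bar represents one Level, and each stacked bar adds up to 100%.
  5. The horizontal axis has three labelled breaks: 0%, 50% and 100%, from left to right.
  6. The stacked bars are *ordered* from top to bottom, from the bar with the highest proportion of "very low" to the one with the lowest.
  7. Ideally, each segment of the stacked bars would show its response *count*.

I tried to start building the plot with this: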
work_culture_fig <- ggplot(work_culture_data, aes(y = Factor, x = Level)) + 
    geom_col()

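However, it gave me this output, which confuses me:

I don't know where to go from here and I'm very confused. Should I widen the tibble first?
What am I doing wrong? How can I achieve points 1 to 7 above in the final figure?
Thank you.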

Answer #1 (i5desfxk)

I'm not sure whether I understood you correctly.

library(tidyverse)  # loads ggplot2, dplyr, stringr and forcats

# Start by removing the "Level_" prefix from each factor name
t <- work_culture_data$Factor
t <- gsub("Level_([a-z])", " \\U\\1", t, perl = TRUE)
t <- gsub("^([a-z])", "\\U\\1", t, perl = TRUE)
t <- str_to_title(str_trim(gsub("_", " ", t)))
work_culture_data$Factor <- t

# Change the level order (the first level is stacked furthest from the axis origin)
l <- fct_relevel(work_culture_data$Level, c("very high", "high", "low", "very low"))
work_culture_data$Level <- l

# Plot: one bar per Factor, stacked by Level, showing raw counts
work_culture_data %>%
  ggplot(aes(x = Factor, fill = Level)) +
  geom_bar() + labs(fill = "") + ylab("") +
  theme(axis.text.y = element_blank(),
        axis.ticks.y = element_blank())

I changed the code so that it shows percentages. I also corrected the axes and made it clearer that the values are proportions.

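# Plot: prop holds the raw count per Factor/Level;
# position = "fill" rescales each bar so its segments sum to 100%
work_culture_data %>%
  group_by(Factor, Level) %>%
  summarize(prop = n(), .groups = "drop") %>%
  ggplot(aes(y = Factor, x = prop, fill = Level)) +
  geom_col(position = "fill") + labs(fill = "") + ylab("") +
  scale_x_continuous(labels = scales::percent)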
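
To make explicit what position = "fill" does to the counts, here is a minimal sketch that computes the within-Factor shares by hand. The toy_data frame and the share column are made up for illustration and are not the asker's data.

```r
library(dplyr)

# Hypothetical toy data in the same long format as work_culture_data
toy_data <- data.frame(
  Factor = c("A", "A", "A", "A", "B", "B", "B", "B"),
  Level  = c("low", "low", "high", "very high",
             "very low", "low", "high", "high")
)

# Count responses per Factor/Level, then divide by the Factor total.
# This is the rescaling that position = "fill" applies when drawing the bars.
toy_props <- toy_data %>%
  count(Factor, Level, name = "n") %>%
  group_by(Factor) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

print(toy_props)
```

Within each Factor the share column sums to 1, which is why every bar reaches 100% regardless of how many responses it contains.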
To change the colours manually, add this to the plot:
scale_fill_manual(values = sample(colors(),
                                  length(unique(work_culture_data$Level))))

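This is just a fancy way of sampling a different set of colours on each run, one per unique level of the fill variable. You can instead specify values = c("red", "#1CD317", colors()[444], "deeppink"), which shows the different ways of giving a colour: a name, a hex code, an index into the vector of all named colours, or my favourite colour, deep pink!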
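
If the colours should stay the same on every run, one option is a named vector keyed by level, so each Level always gets the same colour. The sketch below uses arbitrary placeholder hex codes and a made-up toy_counts frame; neither comes from the question.

```r
library(ggplot2)

# Arbitrary example colours, keyed by level name (placeholders, pick your own)
level_colours <- c(
  "very low"  = "#B2182B",
  "low"       = "#EF8A62",
  "high"      = "#67A9CF",
  "very high" = "#2166AC"
)

# Hypothetical pre-counted toy data
toy_counts <- data.frame(
  Factor = c("A", "A", "B", "B"),
  Level  = c("very low", "high", "low", "very high"),
  n      = c(1, 3, 2, 2)
)

p <- ggplot(toy_counts, aes(y = Factor, x = n, fill = Level)) +
  geom_col(position = "fill") +
  scale_fill_manual(values = level_colours)
```

Because the vector is named, the mapping does not depend on the order of the factor levels.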

Answer #2 (oxiaedzo)

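I changed the order of the factor outside ggplot. You can keep the original variable by adding a new column instead of overwriting the existing one. I also switched from four manual colours to a palette. R has very nice colouring options, and you can familiarise yourself with them over time. If you prefer manual colours, use the line from the previous answer.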

library(tidyverse)  # loads ggplot2, dplyr, stringr and forcats

# Start by removing the "Level_" prefix from each factor name
t <- work_culture_data$Factor
t <- gsub("Level_([a-z])", " \\U\\1", t, perl = TRUE)
t <- gsub("^([a-z])", "\\U\\1", t, perl = TRUE)
t <- str_to_title(str_trim(gsub("_", " ", t)))
work_culture_data$Factor <- t
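
As mentioned above, the original column can be kept by writing the cleaned label to a new column. A minimal sketch of that idea, assuming stringr helpers instead of gsub(); toy_data and the Factor_label column name are hypothetical.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data with two of the original factor names
toy_data <- data.frame(
  Factor = c("Level_career_prospects", "Level_support_colleagues")
)

# Leave Factor untouched and store the display label in a new column
toy_data <- toy_data %>%
  mutate(
    Factor_label = Factor %>%
      str_remove("^Level_") %>%      # drop the prefix
      str_replace_all("_", " ") %>%  # underscores to spaces
      str_to_title()                 # "career prospects" -> "Career Prospects"
  )

print(toy_data)
```

The plot can then map y = Factor_label while any later analysis still has the original Factor values.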

# Change the level order (the first level is stacked furthest from the axis origin)
l <- fct_relevel(work_culture_data$Level, c("very high", "high", "low", "very low"))
work_culture_data$Level <- l

# Sort Factor by the number of "very low" responses (ascending, so the
# factor with the most "very low" responses is drawn at the top)
arng_by <- work_culture_data %>%
  filter(Level == "very low") %>%
  group_by(Factor, Level) %>%
  summarize(prop = n(), .groups = "drop") %>%
  arrange(prop) %>%
  pull(Factor)

f <- fct_relevel(work_culture_data$Factor, arng_by)
work_culture_data$Factor <- f
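
One caveat with sorting this way: a Factor that has no "very low" responses at all never appears in arng_by, and fct_relevel() only moves the listed levels to the front, so such a Factor would end up last in the level order and be drawn at the top. A possible alternative, sketched here in base R on made-up toy_data, is to compute the "very low" share for every Factor, including zeros, and order by that.

```r
# Hypothetical toy data: B has no "very low" responses
toy_data <- data.frame(
  Factor = c("A", "A", "B", "B", "C", "C", "C", "C"),
  Level  = c("very low", "low", "high", "high",
             "very low", "very low", "very low", "high")
)

# Share of "very low" responses per Factor (0 for B)
very_low_share <- tapply(toy_data$Level == "very low", toy_data$Factor, mean)

# Ascending order: ggplot2 draws the first level at the bottom of a discrete
# y axis, so the Factor with the largest share ends up at the top
ordered_levels <- names(very_low_share)[order(very_low_share)]
toy_data$Factor <- factor(toy_data$Factor, levels = ordered_levels)

levels(toy_data$Factor)
```

Using the share rather than the raw count also matches the requirement in the question, which asks for ordering by proportion.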
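# Plot: prop holds the raw count per Factor/Level;
# position = "fill" rescales each bar so its segments sum to 100%
work_culture_data %>%
  group_by(Factor, Level) %>%
  summarize(prop = n(), .groups = "drop") %>%
  ggplot(aes(y = Factor, x = prop, fill = Level)) +
  geom_col(position = "fill") + labs(fill = "") + ylab("") +
  scale_x_continuous(labels = scales::percent) +
  scale_fill_brewer(palette = 11)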
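
Two of the requested features, the three axis breaks at 0%, 50% and 100% and the count printed in each segment, are not covered by the plot above. The following is a hedged sketch of how they might be added, using geom_text() with position_fill(vjust = 0.5) to centre the label in each segment. The toy_counts frame and its numbers are invented for illustration; with the real data the counts would come from work_culture_data.

```r
library(ggplot2)

# Hypothetical pre-counted toy data (two factors, four levels each)
toy_counts <- data.frame(
  Factor = rep(c("Career Prospects", "Collaboration"), each = 4),
  Level  = factor(
    rep(c("very low", "low", "high", "very high"), times = 2),
    levels = c("very high", "high", "low", "very low")
  ),
  n      = c(3, 5, 9, 3, 1, 6, 8, 5)
)

p <- ggplot(toy_counts, aes(y = Factor, x = n, fill = Level)) +
  geom_col(position = "fill") +
  # Print the raw count in the middle of each segment
  geom_text(aes(label = n), position = position_fill(vjust = 0.5)) +
  # Only three labelled breaks on the proportion axis
  scale_x_continuous(breaks = c(0, 0.5, 1), labels = scales::percent) +
  labs(x = NULL, y = NULL, fill = NULL)

print(p)
```

The text layer inherits fill = Level from the plot-level aes(), which is what lets position_fill() place each label on the matching segment.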
