R: Creating a grouped Cleveland plot

3pmvbmvn  posted on 2023-04-09 in Other

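I want to make a Cleveland-style plot that compares the probability of males and females exhibiting different behaviors at different ages. I have a dataset similar to this: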

Data <- data.frame(
  skill = c("Writes Name", "Reads 10 Words"),
  age15male = c(5, 10),
  age30male = c(6, 11),
  age45male = c(7, 12),
  age60male = c(8, 13),
  age75male = c(9, 14),
  age90male = c(10, 15),
  age15female = c(4, 9),
  age30female = c(5, 10),
  age45female = c(6, 11),
  age60female = c(7, 12),
  age75female = c(8, 13),
  age90female = c(9, 14)
)

I can make the bar chart I want, but only for one sex at a time, using the following code:

library(ggplot2)

# One color per probability band, keyed by the band label
colors <- c(
  "15% - 30%" = "#d8ebf2",
  "30% - 45%" = "#96c6d9",
  "45% - 60%" = "#50a0bf",
  "60% - 75%" = "#025373",
  "75% - 90%" = "#023859"
)

ggplot(Data) +
  # male: one segment per probability band
  geom_segment(aes(x = skill, xend = skill, y = age15male, yend = age30male, color = "15% - 30%"), linewidth = 8) +
  geom_segment(aes(x = skill, xend = skill, y = age30male, yend = age45male, color = "30% - 45%"), linewidth = 8) +
  geom_segment(aes(x = skill, xend = skill, y = age45male, yend = age60male, color = "45% - 60%"), linewidth = 8) +
  geom_segment(aes(x = skill, xend = skill, y = age60male, yend = age75male, color = "60% - 75%"), linewidth = 8) +
  geom_segment(aes(x = skill, xend = skill, y = age75male, yend = age90male, color = "75% - 90%"), linewidth = 8) +
  coord_flip() +
  scale_color_manual(values = colors) +
  theme_gray(base_size = 14) +
  theme(axis.title.y = element_text(margin = margin(t = 0, r = 10, b = 0, l = 0))) +
  labs(x = "Milestone", y = "Age (years)", color = "Probability")

However, I would like to show the values for both sexes at once, side by side for each skill, as in this example:

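My first approach was to draw all segments for both males and females, but of course they overlap instead of forming separate lines. My next approach was to offset each geom_segment so that the male one sits slightly above and the female one slightly below: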

geom_segment(aes(x = skill + 0.5, xend = skill + 0.5, y = age15male, yend = age30male, color = "15% - 30%"), linewidth = 8)
geom_segment(aes(x = skill - 0.5, xend = skill - 0.5, y = age15female, yend = age30female, color = "15% - 30%"), linewidth = 8)

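But I get the error **"Discrete value supplied to continuous scale"**. I think that if I could place the male segments higher and the female segments lower, I could show the data for both sexes. However, I would still need a separate color gradient and legend for each sex, and I am not sure how to handle that. Any help, examples, or resources would be greatly appreciated. I apologize if there is a similar thread; I could not find one myself.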

mzmfm0qo1#

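I can't reproduce your error. When I run your code, I instead get
non-numeric argument to binary operator
because a number cannot be added to the character column skill.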
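As a minimal base-R sketch of why the offset fails and what a shiftable numeric position looks like (the vector below simply copies the two skill names from the question, and the 0.2 offset is an arbitrary value chosen for illustration):

```r
skill <- c("Writes Name", "Reads 10 Words")

# Adding a number to a character vector raises an error; capture its message
res <- tryCatch(skill + 0.5, error = function(e) conditionMessage(e))
print(res)

# Going through a factor gives one numeric position per skill,
# here in order of appearance: 1 for "Writes Name", 2 for "Reads 10 Words"
skill_num <- as.numeric(factor(skill, levels = unique(skill)))

# Numeric positions can be shifted up or down freely
above <- skill_num + 0.2
below <- skill_num - 0.2
print(rbind(above, below))
```

Once the positions are numeric, the axis labels have to be put back by hand, for example with scale_y_continuous(breaks = ..., labels = ...).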
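That said, IMHO your general idea of shifting the segments is right. I would go for geom_rect instead, though, as it makes setting the width of the bars easier than relying on linewidth. Doing so requires some manual work, namely converting your character values to numeric positions. To that end, as a first step I reshape your data to a tidy format. Finally, I add a second fill scale via the ggnewscale package: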

library(tidyverse)

# Reshape to long format: one row per skill, sex and age threshold
dat_long <- Data |>
  tidyr::pivot_longer(-skill, values_to = "age") |>
  tidyr::separate_wider_regex(name, patterns = c("^age", prob = "\\d+", sex = ".*$")) |>
  dplyr::mutate(
    # Each bar piece runs from this threshold to the next one of the same skill and sex
    xend = lead(age),
    prob = paste0(prob, "% - ", lead(prob, default = "100"), "%"),
    .by = c(skill, sex)
  ) |>
  # The last threshold in each group has no following value, so it starts no piece
  dplyr::filter(!is.na(xend)) |>
  dplyr::mutate(
    # Numeric y positions in order of appearance, so the axis labels below match
    skill_num = as.numeric(factor(skill, levels = unique(skill))),
    # female = -1, male = 1: male bars sit above the skill position, female bars below
    sex_num = scales::rescale(as.numeric(factor(sex)), to = c(-1, 1)),
    ymin = skill_num + .05 * sex_num,
    ymax = skill_num + .35 * sex_num
  )

# Second palette for females, using the same band labels as `colors`
colors_female <- c("#FEE5D9", "#FCAE91", "#FB6A4A", "#DE2D26", "#A50F15")
names(colors_female) <- names(colors)

ggplot(dat_long) +
  geom_rect(data = ~ subset(.x, sex == "male"), aes(ymin = ymin, ymax = ymax, xmin = age, xmax = xend, fill = factor(prob))) +
  scale_fill_manual(values = colors, name = "male") +
  # Start a new fill scale so the female layer gets its own palette and legend
  ggnewscale::new_scale_fill() +
  geom_rect(data = ~ subset(.x, sex != "male"), aes(ymin = ymin, ymax = ymax, xmin = age, xmax = xend, fill = factor(prob))) +
  scale_fill_manual(values = colors_female, name = "female") +
  scale_y_continuous(breaks = seq_along(unique(Data$skill)), labels = unique(Data$skill)) +
  theme_gray(base_size = 14) +
  theme(axis.title.y = element_text(margin = margin(t = 0, r = 10, b = 0, l = 0))) +
  labs(y = "Milestone", x = "Age (years)")
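
To see what the reshaping step produces, here is a small sketch on a cut-down version of the data. The data frame wide, its single skill, and its two age thresholds per sex are made up for illustration; only the pivot_longer and separate_wider_regex calls mirror the code above. It assumes tidyr 1.3.0 or later and the scales package are installed.

```r
library(tidyr)

# Hypothetical cut-down data: one skill, two age thresholds per sex
wide <- data.frame(
  skill = "Writes Name",
  age15male = 5, age30male = 6,
  age15female = 4, age30female = 5
)

long <- wide |>
  pivot_longer(-skill, values_to = "age") |>
  # "^age" is matched but dropped; the digits become prob, the rest becomes sex
  separate_wider_regex(name, patterns = c("^age", prob = "\\d+", sex = ".*$"))

# Map sex to -1 (female) or 1 (male): the sign that shifts bars below or above
long$sex_num <- scales::rescale(as.numeric(factor(long$sex)), to = c(-1, 1))

print(long)
```

Each wide column name such as age15male ends up as one row with separate prob and sex values, which is what lets a single geom_rect layer per sex draw every band.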
