Using geom_pointrange and geom_errorbar when there are multiple groups

Posted by z6psavjg on 2023-04-09 in Other

Suppose I have the following simulated data:

df <- structure(list(
  country = c("Italy", "Italy", "Italy", "Italy", "Italy", "Italy", "Italy", "Italy", "Italy",
              "Austria", "Austria", "Austria", "Austria", "Austria", "Austria", "Austria", "Austria", "Austria",
              "Germany", "Germany", "Germany", "Germany", "Germany", "Germany", "Germany", "Germany", "Germany"),
  date = c(1000, 1200, 1300, 1400, 1500, 1600, 1700, 1750, 1800,
           1000, 1200, 1300, 1400, 1500, 1600, 1700, 1750, 1800,
           1000, 1200, 1300, 1400, 1500, 1600, 1700, 1750, 1800),
  X = c(1, 3, 3, 3, 3, 2, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1),
  fit.0.025quant = c(0.828491623257632,
                     2.44382034612151, 2.62478810453802, 2.6337062129497, 2.53478842319723,
                     1.65013735592875, 0.752635464354321, 0.662634760040262, 0.63361902939496,
                     0.631636426187967, 0.64989986512676, 0.650190687845941, 0.65019680371168,
                     0.650196959453982, 0.650196803680857, 0.650190687868955, 0.64989986513554,
                     0.631636426203261, 0.631636426187967, 0.64989986512676, 0.650190687845941,
                     0.65019680371168, 0.650196959453982, 0.650196803680857, 0.650190687868955,
                     0.64989986513554, 0.631636426203261),
  fit.mean = c(1.19020910017421,
               2.83008460257499, 2.98078335406727, 2.98765164881975, 2.90462557038579,
               1.99996297933676, 1.09510738941705, 1.01024668622426, 1.00139011636246,
               1.00000323935823, 1.0000015167136, 1.00000147625773, 1.00000147475972,
               1.00000147470401, 1.00000147476084, 1.00000147625788, 1.00000151671303,
               1.00000323935854, 1.00000323935823, 1.0000015167136, 1.00000147625773,
               1.00000147475972, 1.00000147470401, 1.00000147476084, 1.00000147625788,
               1.00000151671303, 1.00000323935854),
  fit.0.975quant = c(1.61097812296572,
                     3.17395283857941, 3.326527054585, 3.33442175855909, 3.24708594348506,
                     2.3497438234016, 1.46469784405916, 1.36349195359207, 1.37038568954042,
                     1.36836869801455, 1.35010013487928, 1.34980931215949, 1.34980319628983,
                     1.34980304054122, 1.34980319632293, 1.34980931213498, 1.35010013486874,
                     1.36836869799899, 1.36836869801455, 1.35010013487928, 1.34980931215949,
                     1.34980319628983, 1.34980304054122, 1.34980319632293, 1.34980931213498,
                     1.35010013486874, 1.36836869799899)),
  class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -27L))

I have two related questions.
I can plot df as:

library(ggplot2)

# One point range per row of df: the point is fit.mean, the bar spans the two quantiles
ggplot(df, 
       aes(x = X, y = fit.mean)) +
  geom_pointrange(aes(ymin = fit.0.025quant, ymax = fit.0.975quant),
                  color = "red") + 
  xlab("Observed values") + 
  ylab("Fitted values") + 
  scale_y_continuous(breaks = seq(-1, 7)) + 
  scale_x_continuous(breaks = seq(1, 7, 1))

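This produces a plot with one point range per row of df.

Question 1: At X = 1 and X = 3, the plot shows several means (the large dots). I want only one mean at each of these values, the middle one, i.e. the average of all fit.mean values at X = 1 (and likewise at X = 3).
A similar problem occurs with error bars: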

ggplot(df, 
       aes(x = X, y = fit.mean)) +
  geom_errorbar(aes(ymin = fit.0.025quant, ymax = fit.0.975quant), 
                color = "red") + 
  xlab("Observed values") + 
  ylab("Fitted values") + 
  scale_y_continuous(breaks=seq(-1,7)) + 
  scale_x_continuous(breaks=seq(1,7,1))

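This produces a plot with one error bar per row of df.

Question 2: At X = 1 and X = 3, I want only one upper and one lower bar (the red horizontal lines) to be shown for each.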

Answer #1 (bxjv4tth)

In my opinion, the simplest approach is to average fit.mean, and likewise the two quantile columns, within each value of X, as follows:

library(dplyr)

# Replace each of the three fit.* columns with its mean within each X group
df2 <- df %>%
  group_by(X) %>%
  mutate(across(fit.0.025quant:fit.0.975quant, ~ mean(.x)))

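Second approach: to plot the minimum of fit.0.025quant and the maximum of fit.0.975quant within each group instead, you only need to change the arguments to mutate.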

But you can also do something like mutate(mean_all = mean(fit.mean)): this keeps fit.mean and creates a new variable, mean_all, which you then use in your ggplot2 aesthetics.

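# Keep fit.mean and store the per-X average in a new column, mean;
# widen the interval to the smallest lower and largest upper bound per X
df2 <- df %>%
  group_by(X) %>%
  mutate(mean = mean(fit.mean),
         fit.0.025quant = min(fit.0.025quant),
         fit.0.975quant = max(fit.0.975quant))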
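
As a variation on the mutate() approach, summarise() collapses each group to a single row, so nothing is drawn on top of itself. The sketch below is only an illustration under assumptions: toy and toy_summary are hypothetical names and the values are made up rather than taken from df; the column names mirror those in the question.

```r
library(dplyr)

# Hypothetical toy data with the same column layout as df (values are invented)
toy <- data.frame(
  X              = c(1, 1, 3, 3, 2),
  fit.0.025quant = c(0.6, 0.8, 2.4, 2.6, 1.6),
  fit.mean       = c(1.0, 1.2, 2.8, 3.0, 2.0),
  fit.0.975quant = c(1.4, 1.6, 3.2, 3.4, 2.4)
)

# One row per X: average the means, take the widest interval
toy_summary <- toy %>%
  group_by(X) %>%
  summarise(
    fit.mean       = mean(fit.mean),
    fit.0.025quant = min(fit.0.025quant),
    fit.0.975quant = max(fit.0.975quant),
    .groups = "drop"
  )

print(toy_summary)
```

With mutate() the data keep all their rows and the identical values within a group are simply overplotted; with summarise() the other columns (country, date) are dropped, which may or may not be what you want.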

With df2, the plots come out as intended:

ggplot(df2, 
       aes(x = X, y = mean)) +  # the per-X average created above, not the per-row fit.mean
  geom_pointrange(aes(ymin = fit.0.025quant, ymax = fit.0.975quant),
                  color = "red") + 
  xlab("Observed values") + 
  ylab("Fitted values") + 
  scale_y_continuous(breaks=seq(-1,7)) + 
  scale_x_continuous(breaks=seq(1,7,1))

ggplot(df2, 
       aes(x = X, y = fit.mean)) +
  geom_errorbar(aes(ymin = fit.0.025quant, ymax = fit.0.975quant), 
                color = "red") + 
  xlab("Observed values") + 
  ylab("Fitted values") + 
  scale_y_continuous(breaks=seq(-1,7)) + 
  scale_x_continuous(breaks=seq(1,7,1))
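
If both the single mean and the capped bars are wanted in one figure, the two layers can be combined. This is a rough sketch, not part of the original answer: toy_summary and its columns mean, lower and upper are hypothetical, with one row per X and invented values, and width = 0.2 is an arbitrary choice for the cap width.

```r
library(ggplot2)

# Hypothetical, already-aggregated data: one row per X value (values are invented)
toy_summary <- data.frame(
  X     = c(1, 2, 3),
  mean  = c(1.1, 2.0, 2.9),
  lower = c(0.6, 1.6, 2.4),
  upper = c(1.6, 2.4, 3.4)
)

p <- ggplot(toy_summary, aes(x = X, y = mean)) +
  # capped interval from lower to upper
  geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.2, color = "red") +
  # one point per X at the averaged mean
  geom_point(color = "red", size = 3) +
  xlab("Observed values") +
  ylab("Fitted values") +
  scale_x_continuous(breaks = seq(1, 7, 1))

print(p)
```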

Created on 2023-04-04 with reprex v2.0.2
