Suppose I have the following simulated data:
df <- structure(list(country = c("Italy", "Italy", "Italy", "Italy",
"Italy", "Italy", "Italy", "Italy", "Italy", "Austria", "Austria",
"Austria", "Austria", "Austria", "Austria", "Austria", "Austria",
"Austria", "Germany", "Germany", "Germany", "Germany", "Germany",
"Germany", "Germany", "Germany", "Germany"), date = c(1000, 1200,
1300, 1400, 1500, 1600, 1700, 1750, 1800, 1000, 1200, 1300, 1400,
1500, 1600, 1700, 1750, 1800, 1000, 1200, 1300, 1400, 1500, 1600,
1700, 1750, 1800), X = c(1, 3, 3, 3, 3, 2, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), fit.0.025quant = c(0.828491623257632,
2.44382034612151, 2.62478810453802, 2.6337062129497, 2.53478842319723,
1.65013735592875, 0.752635464354321, 0.662634760040262, 0.63361902939496,
0.631636426187967, 0.64989986512676, 0.650190687845941, 0.65019680371168,
0.650196959453982, 0.650196803680857, 0.650190687868955, 0.64989986513554,
0.631636426203261, 0.631636426187967, 0.64989986512676, 0.650190687845941,
0.65019680371168, 0.650196959453982, 0.650196803680857, 0.650190687868955,
0.64989986513554, 0.631636426203261), fit.mean = c(1.19020910017421,
2.83008460257499, 2.98078335406727, 2.98765164881975, 2.90462557038579,
1.99996297933676, 1.09510738941705, 1.01024668622426, 1.00139011636246,
1.00000323935823, 1.0000015167136, 1.00000147625773, 1.00000147475972,
1.00000147470401, 1.00000147476084, 1.00000147625788, 1.00000151671303,
1.00000323935854, 1.00000323935823, 1.0000015167136, 1.00000147625773,
1.00000147475972, 1.00000147470401, 1.00000147476084, 1.00000147625788,
1.00000151671303, 1.00000323935854), fit.0.975quant = c(1.61097812296572,
3.17395283857941, 3.326527054585, 3.33442175855909, 3.24708594348506,
2.3497438234016, 1.46469784405916, 1.36349195359207, 1.37038568954042,
1.36836869801455, 1.35010013487928, 1.34980931215949, 1.34980319628983,
1.34980304054122, 1.34980319632293, 1.34980931213498, 1.35010013486874,
1.36836869799899, 1.36836869801455, 1.35010013487928, 1.34980931215949,
1.34980319628983, 1.34980304054122, 1.34980319632293, 1.34980931213498,
1.35010013486874, 1.36836869799899)),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -27L))
I have two related questions.
I can plot df as:
library(ggplot2)

ggplot(df,
       aes(x = X, y = fit.mean)) +
  geom_pointrange(aes(ymin = fit.0.025quant, ymax = fit.0.975quant),
                  color = "red") +
  xlab("Observed values") +
  ylab("Fitted values") +
  scale_y_continuous(breaks = seq(-1, 7)) +
  scale_x_continuous(breaks = seq(1, 7, 1))
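This produces:
Question 1: For X = 1 and X = 3, the plot shows several means (the large points). I want only one mean per X value, the one in the middle. For X = 1, that is the average of all fit.mean values where X is 1.
A similar problem occurs with error bars: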
ggplot(df,
       aes(x = X, y = fit.mean)) +
  geom_errorbar(aes(ymin = fit.0.025quant, ymax = fit.0.975quant),
                color = "red") +
  xlab("Observed values") +
  ylab("Fitted values") +
  scale_y_continuous(breaks = seq(-1, 7)) +
  scale_x_continuous(breaks = seq(1, 7, 1))
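This produces:
Question 2: For X = 1 and X = 3, I want only the outermost upper and lower bars (the red horizontal lines) to be shown for each.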
1 Answer
bxjv4tth #1
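In my opinion, the easiest way is to compute the mean of all fit.mean values for each X, and to do the same for the other columns.
Second way: to plot the minimum of the lower quantile (fit.0.025quant) and the maximum of the upper quantile (fit.0.975quant), you only need to change the mutate arguments. But you could also do something like
mutate(mean_all=mean(fit.mean))
Here you keep fit.mean and create a new variable, mean_all, which you then use in the ggplot2 aesthetics. Using the resulting data frame, df2, your plots look fine.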
Created on 2023-04-04 with reprex v2.0.2
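The answer's code for building df2 did not survive in this copy, so what follows is only a minimal sketch of one way the idea could look, not the answerer's original code. It assumes dplyr is available and uses summarise() rather than mutate(), so that each X ends up with exactly one row. The names df_small, lower, and upper are made up for illustration, and df_small is a tiny stand-in with hypothetical values; with the real data you would pass df instead.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the question's df (same column names, made-up values)
df_small <- tibble(
  X              = c(1, 1, 3, 3, 2),
  fit.0.025quant = c(0.6, 0.8, 2.4, 2.6, 1.6),
  fit.mean       = c(1.0, 1.2, 2.8, 3.0, 2.0),
  fit.0.975quant = c(1.4, 1.6, 3.2, 3.4, 2.4)
)

# One row per observed X: average the fitted means (Question 1) and keep
# only the lowest lower bound and the highest upper bound (Question 2)
df2 <- df_small %>%
  group_by(X) %>%
  summarise(
    mean_all = mean(fit.mean),
    lower    = min(fit.0.025quant),
    upper    = max(fit.0.975quant),
    .groups  = "drop"
  )

# Question 1: a single point and range per X
p_range <- ggplot(df2, aes(x = X, y = mean_all)) +
  geom_pointrange(aes(ymin = lower, ymax = upper), color = "red") +
  xlab("Observed values") +
  ylab("Fitted values")

# Question 2: a single error bar per X
p_bar <- ggplot(df2, aes(x = X)) +
  geom_errorbar(aes(ymin = lower, ymax = upper), color = "red") +
  xlab("Observed values") +
  ylab("Fitted values")
```

Printing p_range or p_bar draws the plot. If you would rather show the average of the interval bounds than the extremes, replace min() and max() with mean() in the summarise() call.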