Boxplot in R with additional lines for the 10th and 90th percentiles

wd2eg0qa, posted on 2023-03-27 in Other

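Unfortunately I am not very experienced with R, but I need to solve a problem that seems hard to me and is probably easy for anyone who knows how to work with boxplots in R.
I need to add extra horizontal lines or points for the 10th and 90th percentiles to a grouped boxplot. Apart from that, the boxplot should keep its usual features: minimum, maximum, 25th percentile, median, 75th percentile, and outliers.
I have tried several solutions, but none of them fits my case. One promising attempt was to write a function like the one below. However, it puts a mean in the middle where I need the median, and I still need the 10th and 90th percentiles shown in addition to the regular box. It is also important that the boxes are grouped by the variable Col (see the example code below).
I would be very grateful for any suggestions on how to solve this!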

dataset_stack <- structure(list(
  Col = c("Blue", "Blue", "Blue", "Blue", "Blue", "Blue", "Blue", "Blue",
          "Blue", "Blue", "Blue", "Blue", "Blue",
          "Green", "Green", "Green", "Green", "Green", "Green", "Green",
          "Green", "Green", "Green", "Green", "Green", "Green", "Green",
          "Green",
          "Red", "Red", "Red", "Red", "Red", "Red", "Red", "Red",
          "Red", "Red", "Red", "Red", "Red", "Red", "Red"),
  TTC = c(0.9, 0.7, 0, 0.1, 0.1, 0.4, 0.9, 0.8, 0.1, 0, 0.7, 0.2, 0.7,
          0.2, 0, 0.8, 0.7, 0.8, 0.9, 0.3, 0.9, 0.8, 0.3, 1, 0.6, 0.4, 0.3,
          0.3,
          0.3, 0.2, 0.2, 0.7, 0.9, 0.9, 0.6, 0.4, 0.1, 0.4, 0.8, 0,
          0.7, 0.4, 0.7)
), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -43L))

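library(ggplot2)

# Returns the five values geom = "boxplot" expects:
# whiskers at the 10th/90th percentiles, box at the 25th/75th percentiles.
bp.vals <- function(x, probs = c(0.1, 0.25, 0.75, 0.9)) {
  r <- quantile(x, probs = probs, na.rm = TRUE)
  # The middle value here is the geometric mean, not the median I need
  r <- c(r[1:2], exp(mean(log(x))), r[3:4])
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}

# Sample usage of the function with the dataset_stack data frame
ggplot(dataset_stack, aes(x = factor(Col), y = TTC)) +
  stat_summary(fun.data = bp.vals, geom = "boxplot")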

ogq8wdun1#

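You can keep the regular geom_boxplot() and add stat_summary() layers, passing a function through the fun argument, to mark specific quantile() values and the median as colored points. If your data contain outliers, they are shown in orange: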

# fun replaces the deprecated fun.y argument (ggplot2 >= 3.3.0);
# the \(x) lambda shorthand needs R >= 4.1
ggplot(dataset_stack, aes(x = factor(Col), y = TTC)) +
  geom_boxplot(outlier.color = "orange3", outlier.size = 4) +
  stat_summary(fun = median, geom = "point", shape = 16, size = 4, color = "darkred") +
  stat_summary(geom = "point", fun = \(x) quantile(x, 0.1, na.rm = TRUE), shape = 16, size = 4, color = "red") +
  stat_summary(geom = "point", fun = \(x) quantile(x, 0.9, na.rm = TRUE), shape = 16, size = 4, color = "blue") +
  theme_bw()
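
Since the question also mentions horizontal lines, here is a minimal sketch of one way to draw them instead of points. It assumes ggplot2 3.3.0 or later (for the fun, fun.min and fun.max arguments) and relies on the fact that a crossbar whose lower and upper limits both equal its middle value collapses into a flat line. The helper name pct_line is made up for this illustration, and the data frame is the question's data rebuilt in compact form so the block runs on its own.

```r
library(ggplot2)

# The question's data, rebuilt compactly (13 Blue, 15 Green, 15 Red rows)
dataset_stack <- data.frame(
  Col = rep(c("Blue", "Green", "Red"), times = c(13, 15, 15)),
  TTC = c(0.9, 0.7, 0, 0.1, 0.1, 0.4, 0.9, 0.8, 0.1, 0, 0.7, 0.2, 0.7,
          0.2, 0, 0.8, 0.7, 0.8, 0.9, 0.3, 0.9, 0.8, 0.3, 1, 0.6, 0.4, 0.3, 0.3,
          0.3, 0.2, 0.2, 0.7, 0.9, 0.9, 0.6, 0.4, 0.1, 0.4, 0.8, 0, 0.7, 0.4, 0.7)
)

# Hypothetical helper: a layer that draws one horizontal line per group
# at the p-th quantile. ymin, y and ymax are all set to the same value,
# so the crossbar has zero height and shows up as a line.
pct_line <- function(p, colour) {
  q <- function(x) unname(quantile(x, probs = p, na.rm = TRUE))
  stat_summary(fun = q, fun.min = q, fun.max = q,
               geom = "crossbar", width = 0.5, colour = colour)
}

p <- ggplot(dataset_stack, aes(x = factor(Col), y = TTC)) +
  geom_boxplot(outlier.colour = "orange3", outlier.size = 4) +
  pct_line(0.1, "red") +   # 10th percentile
  pct_line(0.9, "blue") +  # 90th percentile
  theme_bw()

print(p)
```

The width argument controls how far the line extends across each box, so it can be matched to the box width or kept shorter to set the percentile lines apart from the median line.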

If you only want to show, for example, the mean in black, the median in dark red, and the min and max values in grey, you can use stat_summary() in the same way:

ggplot(dataset_stack, aes(x = factor(Col), y = TTC)) +
  geom_boxplot(outlier.color = "orange3", outlier.size = 4) +
  stat_summary(fun = mean, geom = "point", shape = 16, size = 4, color = "black") +
  stat_summary(fun = median, geom = "point", shape = 16, size = 4, color = "darkred") +
  stat_summary(fun = min, geom = "point", shape = 16, size = 4, color = "grey") +
  stat_summary(fun = max, geom = "point", shape = 16, size = 4, color = "grey") +
  theme_bw()

Putting it all together:

ggplot(dataset_stack, aes(x = factor(Col), y = TTC)) +
  geom_boxplot(outlier.color = "orange3", outlier.size = 4) +
  stat_summary(fun = mean, geom = "point", shape = 16, size = 4, color = "black") +
  stat_summary(fun = median, geom = "point", shape = 16, size = 4, color = "darkred") +
  stat_summary(fun = min, geom = "point", shape = 16, size = 4, color = "grey") +
  stat_summary(fun = max, geom = "point", shape = 16, size = 4, color = "grey") +
  stat_summary(geom = "point", fun = \(x) quantile(x, 0.1, na.rm = TRUE), shape = 16, size = 4, color = "red") +
  stat_summary(geom = "point", fun = \(x) quantile(x, 0.9, na.rm = TRUE), shape = 16, size = 4, color = "blue") +
  theme_bw()
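
To check where the percentile points should land, the same values can be computed per group in base R. This is only a quick sanity check, assuming the default quantile() algorithm (type 7), which is also what the stat_summary() calls above use because they call quantile() without a type argument. The name pct_by_group is made up for this illustration, and the data frame is the question's data rebuilt in compact form.

```r
# The question's data, rebuilt compactly (13 Blue, 15 Green, 15 Red rows)
dataset_stack <- data.frame(
  Col = rep(c("Blue", "Green", "Red"), times = c(13, 15, 15)),
  TTC = c(0.9, 0.7, 0, 0.1, 0.1, 0.4, 0.9, 0.8, 0.1, 0, 0.7, 0.2, 0.7,
          0.2, 0, 0.8, 0.7, 0.8, 0.9, 0.3, 0.9, 0.8, 0.3, 1, 0.6, 0.4, 0.3, 0.3,
          0.3, 0.2, 0.2, 0.7, 0.9, 0.9, 0.6, 0.4, 0.1, 0.4, 0.8, 0, 0.7, 0.4, 0.7)
)

# Split TTC by group, compute the 10th, 50th and 90th percentiles for each
# group, and transpose so that each row is one group
probs <- c(0.1, 0.5, 0.9)
pct_by_group <- t(sapply(split(dataset_stack$TTC, dataset_stack$Col),
                         quantile, probs = probs, na.rm = TRUE))

print(pct_by_group)
```

The rows of the resulting matrix are the groups (Blue, Green, Red) and the columns are the requested percentiles, so each value can be compared against the position of the corresponding point in the plot.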
