R: Modifying a scatter plot created with ggplot

svmlkihl  posted on 2022-12-27 in Other

I have the following dataset:

CombinedScores <- c("zero", "5", "10", "15", "20", "25", "30", "35", "40",
                    "45", "50", "60", "G", "LG", "GF", "FER",
                    "VAR", "LOPE", "DAR", "CCOR", "LTR", "Ideal")
FalsePositiveRate <- c(0, 0.04, 0.07, 0.1, 0.18, 0.26, 0.4, 0.5, 0.6, 0.71, 0.8, 0.96,
                       0.1, 0.26, 0.07, 0.49, 0.07, 0.26, 0.05, 0.28, 0.03, 1)
TruePositiveRate <- c(0, 0.4, 0.4, 0.53, 0.8, 0.8, 0.92, 1, 1, 1, 1, 1,
                      0.53, 0.8, 0.47, 0.93, 0.4, 0.8, 0.6, 0.8, 0.8, 1)
MetricOrAlternate <- c("Metric", "Alternate", "Alternate", "Alternate", "Alternate", "Alternate",
                       "Alternate", "Alternate", "Alternate", "Alternate", "Alternate", "Alternate",
                       "Metric", "Metric", "Metric", "Metric", "Metric",
                       "Metric", "Metric", "Metric", "Metric", "Metric")

COMBINEDTABLE<-data.frame(CombinedScores, FalsePositiveRate, TruePositiveRate, MetricOrAlternate)

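I am trying to create a plot with the false positive rate on the x-axis and the true positive rate on the y-axis, both ranging from 0 to 1. I would also like the points color-coded by whether they are "Metric" or "Alternate", with a clear, good-looking legend. In addition, I would like to draw a yellow line connecting the points (0, 0.5) and (0.5, 1), and to label only those scores that lie in the upper-left corner, above this line.
I wrote the following code: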

ggplot(COMBINEDTABLE, aes(FalsePositiveRate, TruePositiveRate)) + 
  geom_point() + coord_cartesian(xlim=c(0,1), ylim=c(0, 1))+
  geom_text_repel(label=CombinedScores, nudge_y = 0.02, nudge_x = 0.02, min.segment.length = 5)

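But for some reason I cannot add color coding that marks each point as Metric or Alternate, cannot add the line connecting (0, 0.5) and (0.5, 1), and likewise cannot restrict the score labels to only the points that fall above the line.
Thanks in advance to anyone who can help.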

mwkjh3gx #1

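Coloring by MetricOrAlternate means adding color inside the aes part of the ggplot call. The line is drawn with a call to geom_abline, and to check whether a point lies above or below the line we can use an ifelse statement. I also took the liberty of adding coord_fixed to this plot.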

combined <- structure(list(CombinedScores = c("zero", "5", "10", "15", "20", 
                       "25", "30", "35", "40", "45", "50", "60", "G", "LG", "GF", "FER", 
                       "VAR", "LOPE", "DAR", "CCOR", "LTR", "Ideal"), FalsePositiveRate = c(0, 
                       0.04, 0.07, 0.1, 0.18, 0.26, 0.4, 0.5, 0.6, 0.71, 0.8, 0.96, 
                       0.1, 0.26, 0.07, 0.49, 0.07, 0.26, 0.05, 0.28, 0.03, 1), TruePositiveRate = c(0, 
                       0.4, 0.4, 0.53, 0.8, 0.8, 0.92, 1, 1, 1, 1, 1, 0.53, 0.8, 0.47, 
                       0.93, 0.4, 0.8, 0.6, 0.8, 0.8, 1), MetricOrAlternate = c("Metric", 
                       "Alternate", "Alternate", "Alternate", "Alternate", "Alternate", 
                       "Alternate", "Alternate", "Alternate", "Alternate", "Alternate", 
                       "Alternate", "Metric", "Metric", "Metric", "Metric", "Metric", 
                       "Metric", "Metric", "Metric", "Metric", "Metric")), 
                       class = "data.frame", row.names = c(NA, -22L))

library(ggplot2)
library(ggrepel)

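ggplot(combined, aes(x = FalsePositiveRate, y = TruePositiveRate, color = MetricOrAlternate)) + 
  # Line through (0, 0.5) and (0.5, 1): slope 1, intercept 0.5
  geom_abline(slope = 1, intercept = .5, lwd = 2.5, color = "grey") +
  geom_point(size = 4, alpha = .8) + 
  # A plot has a single coordinate system, so the limits go into coord_fixed();
  # adding coord_cartesian() and then coord_fixed() would drop the limits
  coord_fixed(xlim = c(0, 1), ylim = c(0, 1)) +
  # Map the label inside aes() so the columns are looked up in `combined`
  geom_text_repel(aes(label = ifelse(TruePositiveRate > .5 + FalsePositiveRate,
                                     yes = CombinedScores, no = "")), 
                  box.padding = 0.5)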
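
The labelling rule above can also be computed once as a column, which makes it easy to check which scores end up labelled. The following is only a sketch of one possible variant, not part of the original answer: the column name above, the object name p, the legend title "Score type" and the point colors are illustrative assumptions, and the linewidth argument assumes ggplot2 3.4.0 or later. It draws the line as a yellow segment between (0, 0.5) and (0.5, 1), as the question asked, instead of the infinite grey line.

```r
library(ggplot2)
library(ggrepel)

# Same data as in the question, rebuilt so the sketch is self-contained
combined <- data.frame(
  CombinedScores = c("zero", "5", "10", "15", "20", "25", "30", "35", "40",
                     "45", "50", "60", "G", "LG", "GF", "FER",
                     "VAR", "LOPE", "DAR", "CCOR", "LTR", "Ideal"),
  FalsePositiveRate = c(0, 0.04, 0.07, 0.1, 0.18, 0.26, 0.4, 0.5, 0.6, 0.71,
                        0.8, 0.96, 0.1, 0.26, 0.07, 0.49, 0.07, 0.26, 0.05,
                        0.28, 0.03, 1),
  TruePositiveRate = c(0, 0.4, 0.4, 0.53, 0.8, 0.8, 0.92, 1, 1, 1, 1, 1,
                       0.53, 0.8, 0.47, 0.93, 0.4, 0.8, 0.6, 0.8, 0.8, 1),
  MetricOrAlternate = rep(c("Metric", "Alternate", "Metric"),
                          times = c(1, 11, 10)),
  stringsAsFactors = FALSE
)

# TRUE for points strictly above the line y = 0.5 + x
# (`above` is a hypothetical helper column, not in the original data)
combined$above <- combined$TruePositiveRate > 0.5 + combined$FalsePositiveRate

# Scores that will be labelled: 20, 25, 30, LG, LOPE, DAR, CCOR, LTR
print(combined$CombinedScores[combined$above])

p <- ggplot(combined, aes(x = FalsePositiveRate, y = TruePositiveRate,
                          color = MetricOrAlternate)) +
  # Segment between the two points named in the question
  annotate("segment", x = 0, y = 0.5, xend = 0.5, yend = 1,
           colour = "yellow", linewidth = 1.5) +
  geom_point(size = 4, alpha = 0.8) +
  # Label only the rows flagged above; keep the text glyph out of the legend
  geom_text_repel(data = subset(combined, above),
                  aes(label = CombinedScores),
                  box.padding = 0.5, show.legend = FALSE) +
  # Illustrative legend title and colors
  scale_color_manual(name = "Score type",
                     values = c(Metric = "steelblue", Alternate = "darkorange")) +
  coord_fixed(xlim = c(0, 1), ylim = c(0, 1)) +
  labs(x = "False positive rate", y = "True positive rate")

print(p)
```

Passing a filtered data frame to geom_text_repel means the unlabelled points are not seen by the repel algorithm at all, so labels may overlap them; the ifelse approach with empty strings keeps every point as an obstacle, which is usually the safer default.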

If you plan to do more work like this, consider taking a look at the pROC package.
