R ggplot2: stat_function() does not shade the exact area under the curve

gkn4icbw  posted on 2023-01-18 in Other

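I am trying to shade the area under the lower tail of a t distribution, as in this example. However, for some degrees of freedom the shading does not cover the area I want. The vertical line marks the lower critical t value, so I expect the area geom to shade right up to that line. I have tried 4, 9, 99, and 999 degrees of freedom, but only 4 and 999 df work; see the attached images: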

Here is the code.

library(ggplot2)

alpha <- 0.1
n <- 5  # sample size; the degrees of freedom are n - 1

l.critical <- qt(alpha, df = n - 1)
u.critical <- -l.critical

# function to shade lower tail
funcShaded <- function(x) {
  y <- dt(x,df=n-1)
  y[x>l.critical]<-NA
  return(y)
}

ggplot(data.frame(x = c(l.critical-3,u.critical+3)), aes(x = x)) +
  stat_function(fun = dt,
                args = list(df=n-1),linewidth=1)+
  scale_x_continuous(name = "t values")+
  stat_function(fun=funcShaded, geom="area", fill="#84CA72", alpha=1,
                outline.type="full",color="black")+
  theme(axis.text.y = element_blank(),
        axis.ticks.y = element_blank())+
  labs(y="")+
  geom_vline(xintercept=l.critical)

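I suspect the problem is in the line y[x>l.critical]<-NA, where I replace the y values above the lower critical value (that is, everything to the right of the lower tail) with NA. The x values generated by stat_function may not include my lower critical value. In that case the largest x whose y is not replaced is smaller than the critical value, so the shading stops short of the line. If that is the cause, is there a way to force my lower critical value to be among the generated x values?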

s8vozzvw #1

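stat_function has a parameter n that sets the number of points at which the function is evaluated along the x axis (the default is 101). Set it to a larger number, for example 1000, and the gap becomes too small to see. For example, with 99 degrees of freedom (n <- 100 in the question's code), the default plot is produced by: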

ggplot(data.frame(x = c(l.critical - 3, u.critical + 3)), aes(x)) +
  stat_function(fun = funcShaded, geom = "area", fill = "#84CA72") +
  stat_function(fun = dt, args = list(df = n - 1), linewidth = 1) +
  geom_vline(xintercept = l.critical) +
  scale_x_continuous(name = "t values") +
  theme(axis.text.y  = element_blank(),
        axis.ticks.y = element_blank(),
        axis.title.y = element_blank())

But if we add n = 1000 to each stat_function call, the shaded area lines up with the vertical line:

ggplot(data.frame(x = c(l.critical - 3, u.critical + 3)), aes(x)) +
  stat_function(fun = funcShaded, geom = "area", fill = "#84CA72", n = 1000) +
  stat_function(fun = dt, args = list(df = n - 1), linewidth = 1, n = 1000) +
  geom_vline(xintercept = l.critical) +
  scale_x_continuous(name = "t values") +
  theme(axis.text.y  = element_blank(),
        axis.ticks.y = element_blank(),
        axis.title.y = element_blank())
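
The size of the gap can be checked without plotting. The following is a minimal sketch, assuming that stat_function evaluates the function on an evenly spaced grid of n points across the x range, which is what seq(..., length.out = n) produces. The helper gap_to_critical and the variable dof are hypothetical names introduced only for this illustration.

```r
alpha <- 0.1
dof <- 99  # degrees of freedom used in this answer's example

l.critical <- qt(alpha, df = dof)
u.critical <- -l.critical
x_range <- c(l.critical - 3, u.critical + 3)

# Distance between the lower critical value and the last grid point
# that funcShaded would keep (the largest grid point <= l.critical).
# Assumption: the grid is evenly spaced over x_range.
gap_to_critical <- function(n_points) {
  grid <- seq(x_range[1], x_range[2], length.out = n_points)
  last_shaded <- max(grid[grid <= l.critical])
  l.critical - last_shaded
}

gap_default <- gap_to_critical(101)   # default number of points
gap_fine    <- gap_to_critical(1000)  # finer grid suggested above

print(c(default = gap_default, fine = gap_fine))
```

Under this assumption the gap can never exceed one grid step, so raising n shrinks it without making it exactly zero unless the critical value happens to fall on a grid point.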

t3psigkw #2

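To solve your problem, you can restrict the range over which the function is evaluated with the xlim argument of stat_function. This also lets you get rid of your funcShaded helper: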

alpha <- 0.1
n <- 99

l.critical <- qt(alpha, df = n - 1)
u.critical <- -l.critical

library(ggplot2)

ggplot(data.frame(x = c(l.critical - 3, u.critical + 3)), aes(x = x)) +
  stat_function(
    fun = dt,
    args = list(df = n - 1), linewidth = 1
  ) +
  scale_x_continuous(name = "t values") +
  stat_function(
    fun = dt, geom = "area", fill = "#84CA72", alpha = 1,
    outline.type = "full", color = "black", xlim = c(l.critical - 3, l.critical),
    args = list(df = n - 1)
  ) +
  theme(
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank()
  ) +
  labs(y = "") +
  geom_vline(xintercept = l.critical)
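
One way to confirm that the shaded layer now ends at the critical value is to inspect the data ggplot2 computed for that layer. This is a minimal sketch using layer_data() from ggplot2 on a reduced version of the plot above; shaded_x and right_edge are hypothetical names introduced for the check, and the sketch assumes the shaded layer is the second layer of the plot.

```r
library(ggplot2)

alpha <- 0.1
n <- 99  # same value as in the code above

l.critical <- qt(alpha, df = n - 1)
u.critical <- -l.critical

# Reduced plot: the full curve (layer 1) and the shaded tail (layer 2)
p <- ggplot(data.frame(x = c(l.critical - 3, u.critical + 3)), aes(x = x)) +
  stat_function(fun = dt, args = list(df = n - 1)) +
  stat_function(
    fun = dt, geom = "area", fill = "#84CA72",
    xlim = c(l.critical - 3, l.critical),
    args = list(df = n - 1)
  )

# x values computed for the second (shaded) layer
shaded_x <- layer_data(p, 2)$x
right_edge <- max(shaded_x)

# TRUE if the shaded area ends at the lower critical value
print(isTRUE(all.equal(right_edge, l.critical)))
```

Because xlim fixes both ends of the evaluation range, the right edge of the shading does not depend on how many points are used.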
