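I want to compute the area under a curve defined by a set of experimental values. I wrote a function that approximates the AUC using Simpson's rule, as I saw in this post. However, the function only works when it receives a vector of odd length. How can I modify the code to add the area of the last trapezoid when the input vector has an even length?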
AUC <- function(x, h=1){
  # AUC function computes the Area Under the Curve of a time series using
  # Simpson's Rule (numerical method).
  # https://link.springer.com/chapter/10.1007/978-1-4612-4974-0_26
  # Arguments
  # x: (vector) time series values
  # h: (int) temporal resolution of the time series. default h=1
  # Note: only works when length(x) is odd (an even number of intervals).
  n = length(x)-1
  # Midpoint (0-based) of each pair of intervals: 1, 3, 5, ...
  xValues = seq(from=1, to=n, by=2)
  sum <- list()
  for(i in 1:length(xValues)){
    n_sub <- xValues[[i]]-1
    n <- xValues[[i]]
    n_add <- xValues[[i]]+1
    # Shift by 1 because R vectors are 1-based
    v1 <- x[[n_sub+1]]
    v2 <- x[[n+1]]
    v3 <- x[[n_add+1]]
    # Simpson's 1/3 rule on three consecutive points
    s <- (h/3)*(v1+4*v2+v3)
    sum <- append(sum, s)
  }
  sum <- unlist(sum)
  auc <- sum(sum)
  return(auc)
}
Here is a sample of the data:
smoothed = c(0.3,0.317,0.379,0.452,0.519,0.573,0.61,0.629,0.628,0.613,0.587,0.556,0.521,
0.485,0.448,0.411,0.363,0.317,0.273,0.227,0.185,0.148,0.12,0.103,0.093,0.086,
0.082,0.079,0.076,0.071,0.066,0.059,0.053,0.051,0.052,0.057,0.067,0.081,0.103,
0.129,0.165,0.209,0.252,0.292,0.328,0.363,0.398,0.431,0.459,0.479,0.491,0.494,
0.488,0.475,0.457,0.43,0.397,0.357,0.316,0.285,0.254,0.227,0.206,0.189,0.181,
0.171,0.157,0.151,0.162,0.192,0.239)
3 Answers
bbmckpt7 #1
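One recommended way to handle an even number of points without giving up accuracy is to combine Simpson's 1/3 rule with Simpson's 3/8 rule. The 3/8 rule spans three intervals (four points), so it can cover the leftover interval that the 1/3 rule cannot. This approach appears in at least one engineering textbook on numerical methods.
In practice, however, you can write code that checks the length of the data and adds a single trapezoid at the end, as suggested in the last comment of the post you linked to. I don't know that this is necessarily as accurate as combining Simpson's 1/3 and 3/8 rules, but it may be reasonable for many applications.
I would double-check my code edits below, but this is the basic idea.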
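The answer's original code is not reproduced here, so the following is only a minimal sketch of the idea described above, not the answerer's implementation. It assumes equally spaced samples with spacing `h`; the name `auc_simpson_trap` is hypothetical. Simpson's 1/3 rule is applied to the largest odd-length prefix, and one trapezoid covers the final interval when the length is even.

```r
# Sketch: composite Simpson's 1/3 rule, plus one trapezoid if length(x) is even.
# `auc_simpson_trap` is a hypothetical name, not code from the original answer.
auc_simpson_trap <- function(x, h = 1) {
  n <- length(x)
  stopifnot(n >= 2)

  # Simpson's 1/3 rule needs an odd number of points; use the first m of them.
  m <- if (n %% 2 == 0) n - 1 else n

  area <- 0
  if (m >= 3) {
    # Weights 1, 4, 2, 4, ..., 2, 4, 1 over the first m points.
    w <- c(1, rep(c(4, 2), length.out = m - 2), 1)
    area <- (h / 3) * sum(w * x[1:m])
  }

  if (n %% 2 == 0) {
    # Last interval is not covered by Simpson's rule: add one trapezoid.
    area <- area + h * (x[n - 1] + x[n]) / 2
  }
  area
}
```

For an odd-length input such as the `smoothed` vector in the question, this reduces to the plain 1/3 rule; the trapezoid only comes into play for even lengths.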
xpszyzbs #2
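You may have a good reason to prefer Simpson's rule, but if you just want a quick and efficient estimate of the AUC, the trapezoidal rule is much easier to implement and does not require an even number of intervals.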
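As a rough illustration of how little code the trapezoidal rule needs, here is a minimal sketch that assumes equally spaced samples with spacing `h`. The function name `auc_trapezoid` is hypothetical and is not taken from the original answer.

```r
# Sketch: composite trapezoidal rule for equally spaced samples.
# `auc_trapezoid` is a hypothetical name used for illustration only.
auc_trapezoid <- function(x, h = 1) {
  n <- length(x)
  stopifnot(n >= 2)
  # Average each pair of neighbouring values and multiply by the spacing.
  h * sum((x[-1] + x[-n]) / 2)
}
```

It works for any vector with at least two values, regardless of whether the length is odd or even.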
nimxete2 #3
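Here I show example code that uses Simpson's 1/3 and 3/8 rules to numerically integrate data. The usual caveats about possible coding errors or compatibility issues apply.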
The output at the end compares this algorithm's numerical estimate with the trapezoidal rule and with R's `integrate` function.
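The answer's code is not reproduced here, so what follows is a sketch of one common way to combine the two rules, under the assumption of equally spaced samples. All function names (`simpson13`, `simpson38`, `auc_simpson_mixed`) are hypothetical. For an even number of points, the 3/8 rule is applied to the last four points and the 1/3 rule to everything before them.

```r
# Sketch only: hypothetical helper names, equally spaced samples assumed.

# Composite Simpson's 1/3 rule; length(y) must be odd and >= 3.
simpson13 <- function(y, h) {
  m <- length(y)
  w <- c(1, rep(c(4, 2), length.out = m - 2), 1)
  (h / 3) * sum(w * y)
}

# Simpson's 3/8 rule on exactly four points (three intervals).
simpson38 <- function(y, h) {
  (3 * h / 8) * (y[1] + 3 * y[2] + 3 * y[3] + y[4])
}

auc_simpson_mixed <- function(x, h = 1) {
  n <- length(x)
  stopifnot(n >= 2)
  if (n == 2) return(h * (x[1] + x[2]) / 2)   # only a trapezoid is possible
  if (n %% 2 == 1) return(simpson13(x, h))    # odd length: plain 1/3 rule
  area <- simpson38(x[(n - 3):n], h)          # last three intervals
  if (n > 4) area <- area + simpson13(x[1:(n - 3)], h)
  area
}

# Compare against the trapezoidal rule and integrate() on sin over [0, pi].
grid <- seq(0, pi, length.out = 10)
y <- sin(grid)
h <- grid[2] - grid[1]
print(c(simpson   = auc_simpson_mixed(y, h),
        trapezoid = h * sum((y[-1] + y[-length(y)]) / 2),
        integrate = integrate(sin, 0, pi)$value))
```

Note that `integrate` works on a function rather than on sampled data, so it serves only as a reference value for a known test function.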
The next piece of code shows a performance comparison. It can easily be modified for different test functions and use cases.
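The comparison code itself is missing from this copy of the answer. As a stand-in, the sketch below measures the absolute error of the trapezoidal rule and of a combined 1/3 and 3/8 Simpson rule against `integrate`, for several even sample sizes. The test function (`sin` on `[0, pi]`), the sample sizes, and the helper `simpson_weights` are all assumptions chosen for illustration.

```r
# Sketch only: `simpson_weights` is a hypothetical helper.
# Returns per-point weights (in units of h) for equally spaced samples:
# 1/3 rule for odd n; for even n, 3/8 rule on the last four points.
simpson_weights <- function(n) {
  stopifnot(n >= 3)
  if (n %% 2 == 1) {
    return(c(1, rep(c(4, 2), length.out = n - 2), 1) / 3)
  }
  w <- numeric(n)
  w[(n - 3):n] <- c(1, 3, 3, 1) * 3 / 8
  if (n > 4) {
    m <- n - 3  # odd number of points left for the 1/3 rule
    w[1:m] <- w[1:m] + c(1, rep(c(4, 2), length.out = m - 2), 1) / 3
  }
  w
}

f <- sin   # assumed test function
a <- 0
b <- pi
ref <- integrate(f, a, b)$value  # reference value from adaptive quadrature

errors <- sapply(c(6, 12, 24, 48), function(n) {
  grid <- seq(a, b, length.out = n)
  h <- grid[2] - grid[1]
  y <- f(grid)
  trap <- h * sum((y[-1] + y[-n]) / 2)
  simp <- h * sum(simpson_weights(n) * y)
  c(n = n, trapezoid = abs(trap - ref), simpson = abs(simp - ref))
})
print(t(errors))
```

Swapping in a different `f`, interval, or set of sample sizes only requires changing the three assignments and the vector passed to `sapply`.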
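The difference in accuracy tends to vary with the sample size and the test function used; this example is not meant to imply that the difference is always this pronounced.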