使用R中Simpson法则的曲线下面积

wpx232ag  于 2023-01-03  发布在  其他
关注(0)|答案(3)|浏览(308)

我想计算由一组实验值定义的曲线下面积。我创建了一个函数,使用辛普森法则计算AUC的近似值,正如我在this post中看到的。然而,该函数仅在接收奇数长度的向量时有效。当输入向量具有偶数长度时,我如何修改代码以添加最后一个梯形的面积?

AUC <- function(x, h=1){
  # AUC function computes the Area Under the Curve of a time serie using
  # the Simpson's Rule (numerical method).
  # https://link.springer.com/chapter/10.1007/978-1-4612-4974-0_26
  
  # Arguments
  # x: (vector) time serie values
  # h: (int) temporal resolution of the time serie. default h=1
  
  n = length(x)-1
  
  xValues = seq(from=1, to=n, by=2)
  
  sum <- list()
  for(i in 1:length(xValues)){
    n_sub <- xValues[[i]]-1
    n <- xValues[[i]]
    n_add <- xValues[[i]]+1
    
    v1 <- x[[n_sub+1]]
    v2 <- x[[n+1]]
    v3 <- x[[n_add+1]]
    
    s <- (h/3)*(v1+4*v2+v3)
    sum <- append(sum, s)
  }
  sum <- unlist(sum)
  
  auc <- sum(sum)
  
  return(auc)
  
}

下面是一个数据示例:

smoothed = c(0.3,0.317,0.379,0.452,0.519,0.573,0.61,0.629,0.628,0.613,0.587,0.556,0.521,
             0.485,0.448,0.411,0.363,0.317,0.273,0.227,0.185,0.148,0.12,0.103,0.093,0.086,
             0.082,0.079,0.076,0.071,0.066,0.059,0.053,0.051,0.052,0.057,0.067,0.081,0.103,
             0.129,0.165,0.209,0.252,0.292,0.328,0.363,0.398,0.431,0.459,0.479,0.491,0.494,
             0.488,0.475,0.457,0.43,0.397,0.357,0.316,0.285,0.254,0.227,0.206,0.189,0.181,
             0.171,0.157,0.151,0.162,0.192,0.239)
bbmckpt7

bbmckpt71#

处理偶数个点而仍然达到精度的一种推荐方法是将辛普森1/3法则和辛普森3/8法则结合起来,后者可以处理偶数个点,这种方法可以在(至少一本或多本)关于数值方法的工程教科书中找到。
然而,实际上,你可以写一段代码来检查数据长度,并在末尾添加一个梯形,就像你链接到的帖子的最后一条评论中建议的那样。我不认为这一定像辛普森的1/3和3/8规则一样精确,但对许多应用程序来说可能是合理的。
我会仔细检查下面的代码编辑,但这是基本思想。

AUC <- function(x, h=1){
  # AUC function computes the Area Under the Curve of a time serie using
  # the Simpson's Rule (numerical method).
  # https://link.springer.com/chapter/10.1007/978-1-4612-4974-0_26
  
  # Arguments
  # x: (vector) time serie values
  # h: (int) temporal resolution of the time serie. default h=1
  
  #jh edit: check for even data length
  #and chop off last data point if even
  nn = length(x)
  if(length(x) %% 2 == 0){
  xlast = x[length(x)]
  x = x[-length(x)]
  }

  n = length(x)-1
  
  xValues = seq(from=1, to=n, by=2)
  
  sum <- list()
  for(i in 1:length(xValues)){
    n_sub <- xValues[[i]]-1
    n <- xValues[[i]]
    n_add <- xValues[[i]]+1
    
    v1 <- x[[n_sub+1]]
    v2 <- x[[n+1]]
    v3 <- x[[n_add+1]]
    
    s <- (h/3)*(v1+4*v2+v3)
    sum <- append(sum, s)
  }
  sum <- unlist(sum)
  
  auc <- sum(sum)
  ##jh edit: add trapezoid for last two data points to result
  if(nn %% 2 == 0){
  auc <- auc + (x[length(x)] + xlast)/2 * h
  }
  
  
  return(auc)
  
}

sm = smoothed[-length(smoothed)]
length(sm)
[1] 70
#even data as an example
AUC(sm)
[1] 20.17633
#original odd data
AUC(smoothed)
[1] 20.389
xpszyzbs

xpszyzbs2#

您可能有充分的理由选择使用辛普森规则,但如果您只是想快速有效地估计AUC,梯形规则更容易实现,并且不需要偶数个断点:

AUC <- function(x, h = 1) sum((x[-1] + x[-length(x)]) / 2 * h)

AUC(smoothed)
#> [1] 20.3945
nimxete2

nimxete23#

在这里,我展示了一个示例代码,它使用Simpson的1/3和3/8规则对数据进行数值积分。通常,关于编码错误或兼容性问题的可能性的警告适用。
最后的输出比较了该算法的数值估计与使用R的“积分”函数的梯形法则。

#Algorithm adapted from:
#Numerical Methods for Engineers, Seventh Edition, 
#By Chapra and Canale, page 623
#Modified to accept data instead of functional values
#Modified by: Jeffrey Harkness, M.S.

##Begin Simpson's rule function code

simp13 <- function(dat, h = 1){
ans = 2*h*(dat[1] + 4*dat[2] + dat[3])/6
return(ans)}

simp13m <- function(dat, h = 1){
summ <- dat[1]
n <- length(dat)
nseq <- seq(2,(n-2),2)
for(i in nseq){
summ <- summ + 4*dat[i] + 2*dat[i+1]}
summ <- summ + 4*dat[n-1] + dat[n]
result <- (h*summ)/3
return(result)}

simp38 <- function(dat, h = 1){
ans <-  3*h*(dat[1] + 3*sum(dat[2:3]) + dat[4])/8
return(ans)}

simpson = function(dat, h = 1){
hin = h
len = length(dat)
comp <- len %% 2
##number of segments
if(len == 2){
ans = sum(dat)/2*h} ##n = 2 is the trapezoidal rule
if(len == 3){
ans = simp13(dat, h = hin)}
if(len == 4){
ans = simp38(dat,h = hin)}
if(len == 6){
ans <- simp38(dat[1:4],h = hin) + simp13(dat[4:len],h = hin)}
if(len > 6 & comp == 0){
ans = simp38(dat[1:4],h = hin) + simp13m(dat[4:len],h = hin)}
if(len >= 5 & comp == 1){
ans = simp13m(dat,h = hin)}
return(ans)}

##End Simpson's rule function code

下一段代码显示了性能比较。这段代码可以很容易地针对不同的测试函数和用例进行修改。
精密度差异往往随样本量和所用检验函数而变化;这个例子并不意味着差别总是如此明显。

#other algorithm for comparison purposes, from Allan Cameron above
oa <- function(x, h = 1) sum((x[-1] + x[-length(x)]) / 2 * h)
 
#Testing and algorithm comparison code
simans = NULL; oaans = NULL; simerr = NULL; oaerr = NULL; mp = NULL
for( j in 1:10){
n = j
#f = function(x) cos(x) + 2   ##Test functions
f = function(x) 0.2 + 25*x - 200*x^2 + 675*x^3 - 900*x^4 + 400*x^5
a = 0;b = 10
h = (b-a)/n
datain = seq(a,b,by = h)
 
preans = integrate(f,a,b)$value #precise numerical estimate of test function
simans[j] = simpson(f(datain), h = h)
oaans[j] = oa(f(datain), h = h)
(simerr[j] = abs(simans[j] - preans)/preans * 100)
(oaerr[j] = abs(oaans[j] - preans)/preans * 100)
 
mp[j] = simerr[j] < oaerr[j]
}
    (outframe = data.frame("simpsons percent diff" = simerr,"trapezoidal percent diff" = oaerr, "more precise?" = mp, check.names = F))
   simpsons percent diff trapezoidal percent diff more precise?
1           214.73489738               214.734897         FALSE
2            15.07958148                64.993410          TRUE
3             6.70203621                29.816799          TRUE
4             0.94247384                16.955208          TRUE
5             0.54830021                10.905620          TRUE
6             0.18616767                 7.593825          TRUE
7             0.12051767                 5.588209          TRUE
8             0.05890462                 4.282980          TRUE
9             0.04087107                 3.386525          TRUE
10            0.02412733                 2.744500          TRUE

相关问题