具有随机目标函数和等式约束的R中的最优化

kx1ctssn  于 2023-04-27  发布在  其他
关注(0)|答案(1)|浏览(108)

我正在寻找一个R软件包/函数,它可以帮助我找到4个参数的最优值。具体来说,我正在尝试对一个已知的经验分布进行建模(美国家庭收入分布,Y)作为两个对数正态随机变量之和(即,Y=E+L,其中ln(E)~N(μE,σE2)和ln(L)~N(μL,σL2)。因此,我有4个参数要优化。对于目标函数,我希望使用Kolmogorov-Smirnov统计量来测量使用这4个参数的模拟分布与我存储为向量的实际分布之间的差异。L的期望值与Y的总期望值的比例为0〈θ〈1。
到目前为止,我已经尝试了nloptr包和alabama包,但都没有成功。下面是我的目标函数和约束函数,首先是目标函数:

compare_distributions <- function(x, ref_dist, ...) {
  mu_E <- x[1]
  sigma_E <- x[2]
  mu_L <- x[3]
  sigma_L <- x[4]
  E <- rlnorm(n = 100000, meanlog = mu_E, sdlog = sigma_E)
  L <- rlnorm(n = 100000, meanlog = mu_L, sdlog = sigma_L)
  Y <- E + L
  return(unname(ks.test(Y, ref_dist)$statistic))
}

在这里你可以看到,我生成了模拟随机分布的100,000次绘制,然后使用ks.test函数计算统计量,将其与参考分布的相似性进行比较。然后,约束函数:

opt_constraint <- function(x, theta, ...) {
  mu_E <- x[1]
  sigma_E <- x[2]
  mu_L <- x[3]
  sigma_L <- x[4]
  return(mu_L + sigma_L^2/2 - mu_E - sigma_E^2/2 - log(theta / (1- theta)))
}

这里,当满足关于对数正态分布的均值的比率的优化约束时,返回的表达式等于0。
当我将这些表达式输入亚拉巴马时,我收到了一条隐晦的错误消息(注意household_income是包含已知家庭收入分布的向量):

alabama(par = c(0, 1, 0, 1),
       fn = compare_distributions,
       heq = opt_constraint,
       ref_dist = household_income,
       theta = 0.5)

产量:

Error in alabama(par = c(0, 1, 0, 1), fn = compare_distributions, heq = opt_constraint,  : 
  promise already under evaluation: recursive default argument reference or earlier problems?

我不知道我在这里做了什么递归,并环顾这个论坛和其他人的帮助,这个错误消息没有发现任何有用的东西。
我在想,也许nloptralabama不适合我试图优化的目标函数,因为它具有随机性。有没有任何软件包可以实现我正在寻找的目标?

1zmg4dgp

1zmg4dgp1#

我已经能够用下面的代码最小化你的函数:

library(DEoptim)
library(RSolnp)

objective_Function <- function(param, ref_dist, theta)
{
  mu_E <- param[1]
  sigma_E <- param[2]
  mu_L <- param[3]
  sigma_L <- param[4]
  set.seed(1234)
  E <- rlnorm(n = 100000, meanlog = mu_E, sdlog = sigma_E)
  L <- rlnorm(n = 100000, meanlog = mu_L, sdlog = sigma_L)
  Y <- E + L
  term1 <- unname(ks.test(Y, ref_dist)$statistic)
  term2 <- mu_L + sigma_L ^ 2 / 2 - mu_E - sigma_E ^ 2 / 2 - log(theta / (1 - theta))
  term2 <- exp(term2) - 1
  val <- term1 + 100 * term2 ^ 2
  return(val)    
}

opt_constraint <- function(x) 
{
  theta <- 0.5
  mu_E <- x[1]
  sigma_E <- max(x[2], 0)
  mu_L <- x[3]
  sigma_L <- max(x[4], 0)
  val <- mu_L + sigma_L ^ 2 / 2 - mu_E - sigma_E ^ 2 / 2 - log(theta / (1 - theta))
  return(val)
}

compare_distributions <- function(x) 
{
  ref_dist <- household_income
  mu_E <- x[1]
  sigma_E <- max(x[2], 0)
  mu_L <- x[3]
  sigma_L <- max(x[4], 0)
  set.seed(1234)
  E <- rlnorm(n = 100000, meanlog = mu_E, sdlog = sigma_E)
  L <- rlnorm(n = 100000, meanlog = mu_L, sdlog = sigma_L)
  Y <- E + L
  return(unname(ks.test(Y, ref_dist)$statistic))
}

set.seed(12435)
household_income <- rlnorm(100000, 3, 1)
list_Obj_Res <- list()

obj_DEoptim <- DEoptim(fn = objective_Function, lower = c(-4, 0, -4, 0), upper = rep(7, 4),
                       ref_dist = household_income, theta = 0.5,
                       control = list(itermax = 200, parallelType = 1))

list_Obj_Res[[1]] <- list(par = obj_DEoptim$optim$bestmem,
                          value = obj_DEoptim$optim$bestval)

for(i in 2 : 10)
{
  print(i)
  
  if(i %% 2 == 1)
  {
    obj_Optim <- optim(fn = objective_Function, par = list_Obj_Res[[i - 1]]$par,
                       ref_dist = household_income, theta = 0.5)
    
    list_Obj_Res[[i]] <- list(par = obj_Optim$par, value =  obj_Optim$value)
    
  }else 
  {
    obj_RSolnp <- solnp(pars = list_Obj_Res[[i - 1]]$par, 
                        fun = compare_distributions, 
                        eqfun = opt_constraint, eqB = 0)
    
    list_Obj_Res[[i]] <- list(par = obj_RSolnp$pars, value = obj_RSolnp$values)
    
  }
  
  print(list_Obj_Res[[i]])
}

opt_constraint(x = tail(list_Obj_Res, 1)[[1]]$par)
        par3 
1.110223e-16 

> compare_distributions(x = tail(list_Obj_Res, 1)[[1]]$par)
[1] 0.0047

> tail(list_Obj_Res, 1)[[1]]$par
    par1     par2     par3     par4 
2.016786 1.285449 1.947603 1.338188

相关问题