Using rmultinom to estimate the distribution of gifts at a holiday party across a list of estimates

cu6pst1q  posted 12 months ago in Other

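There is a holiday party with 6 people. Each of the 6 has an estimate of the number of gifts they brought (giftsBrought) and an estimate of their chance of receiving a gift from another person (gifts_received_pct). There is no limit on the number of gifts one person can receive from another. People can only give gifts to, and receive gifts from, members of their own team, and they cannot give a gift they brought to themselves.
My real problem has 1000 different iterations/estimates of the parameters giftsBrought and gifts_received_pct, and I want to estimate the total number of gifts each person receives in every one of the 1000 estimates. For the purposes of this exercise I will make all the estimates identical, but to be clear, in reality each of my data frames has different parameter estimates, which is why I can't just call rmultinom(1000, ...).
First, build the dummy data.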

name <- c('Aaron', 'Susie', 'Sam', 'Emma', 'Jennifer', 'Steve')
giftsBrought <- c(5, 3, 4, 2, 3, 6)
team <- c('Sales', 'Sales', 'Sales', 'IT', 'IT', 'IT')
gifts_received_pct <- c(.2, .3, .1, .2, .2, .1) # rmultinom does not require normalization
giftsDF <- data.frame(name, team, giftsBrought, gifts_received_pct, stringsAsFactors = FALSE)

giftsEstimationList <- list()
for(i in 1:1000){
  giftsEstimationList[[i]] <- giftsDF
}

Next, here is how I compute the gifts received for one of the data frames:

giftsReceivedDF <- lapply(1:nrow(giftsDF), function(i){
  probs <- giftsDF
  probs$gifts_received_pct[probs$team != giftsDF$team[i] | probs$name == giftsDF$name[i]] <- 0 # set other_team_pct and own_pct to 0
  rmultinom(1, giftsDF$giftsBrought[i], probs$gifts_received_pct)
})

Reduce(`+`, giftsReceivedDF)


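I believe this is correct: looking closely at giftsReceivedDF, nobody appears to receive their own gifts, and no gifts go to the other team.
What I'm stuck on is how to run this across all 1000 data frames in giftsEstimationList in a timely manner. I initially tried to brute-force everything with a bunch of for loops, but I don't believe that is the most efficient approach, and time matters quite a bit here.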

nfzehxib  #1

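Here is a function that simulates the gift exchange. It is vectorized over all teams and iterations, so it only needs to loop over the maximum number of participants in an exchange group (3 in the example data).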

library(matrixStats) # for `rowCumsums` and `colCumsums`
library(data.table)

f <- function(giftsEstimationList) {
  # combine the list into a single table
  dt <- rbindlist(giftsEstimationList, TRUE, FALSE, "iter")
  # get the size of each exchange group (a team within an iteration)
  n <- dt[,.N, .(iter, team)][[3]]
  maxcol <- max(n) # the maximum sized exchange group
  # a matrix of relative destination probabilities for each participant's
  # gifts (row 1, column 3 = relative probability that a gift from Aaron
  # goes to Sam in iteration 1; row 11, column 1 = relative probability that
  # a gift from Jennifer goes to Emma in iteration 2)
  m <- matrix( 
    unlist(
      dt[
        ,.(.(c(gifts_received_pct*(1 - diag(maxcol)),
               numeric(.N*(maxcol - .N))))),
        .(iter, team)
      ][[3]]
    ),
    nrow(dt), maxcol, 1
  )
  # get the probability to use in `rbinom`
  # (see https://en.wikipedia.org/wiki/Multinomial_distribution#Sampling_using_repeated_conditional_binomial_samples)
  m[,2:maxcol] <- m[,2:maxcol]/rowCumsums(m)[,2:maxcol]
  giftsRemaining <- dt$giftsBrought
  # distribute the gifts
  for (j in maxcol:2) {
    m[,j] <- rbinom(nrow(dt), giftsRemaining, m[,j])
    giftsRemaining <- giftsRemaining - m[,j]
  }
  
  m[,1] <- giftsRemaining
  # aggregate the gifts received for each participant
  dt[
    ,giftsReceived := diff(rbind(0, colCumsums(m)[cumsum(n),]))[
      sequence(n, 1:nrow(dt), length(n))
    ]
  ]
}
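
The conditional-binomial trick that `f` relies on may be easier to follow in isolation. The following is a minimal base-R sketch for a single giver, not the code used inside `f`: the helper name `rmultinom_cond_binom` is hypothetical, and it walks the categories front to back, whereas `f` works from the last column backwards across all rows at once. Each category receives a binomial share of whatever is still undistributed, with probability equal to its weight divided by the weight of all categories not yet processed.

```r
# Hypothetical helper: one multinomial draw built from conditional binomials.
# `size` is the number of gifts to hand out; `prob` holds non-negative
# relative weights that need not sum to 1.
rmultinom_cond_binom <- function(size, prob) {
  stopifnot(size >= 0, all(prob >= 0), sum(prob) > 0)
  # tail_weight[j] = prob[j] + prob[j + 1] + ... + prob[length(prob)]
  tail_weight <- rev(cumsum(rev(prob)))
  counts <- integer(length(prob))
  remaining <- size
  for (j in seq_along(prob)) {
    # conditional probability of category j given the categories still open
    p <- if (tail_weight[j] > 0) prob[j] / tail_weight[j] else 0
    counts[j] <- rbinom(1, remaining, p)
    remaining <- remaining - counts[j]
  }
  counts
}

# Aaron's 5 gifts: his own weight and the other team's weights are set to 0,
# so only Susie (0.3) and Sam (0.1) can receive anything
set.seed(1)
rmultinom_cond_binom(5, c(0, 0.3, 0.1, 0, 0, 0))
```

Because every step is a plain `rbinom` call, the same recursion can be run on whole vectors of sizes and probabilities at once, which is what lets `f` handle every giver in every iteration with only a short loop.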

Demonstration:

set.seed(474180891)
system.time(dt <- f(giftsEstimationList))
#>    user  system elapsed 
#>    0.06    0.00    0.06
dim(dt)
#> [1] 6000    6
# show the first 12 rows of the result
dt[1:12]
#>     iter     name  team giftsBrought gifts_received_pct giftsReceived
#>  1:    1    Aaron Sales            5                0.2             3
#>  2:    1    Susie Sales            3                0.3             7
#>  3:    1      Sam Sales            4                0.1             2
#>  4:    1     Emma    IT            2                0.2             4
#>  5:    1 Jennifer    IT            3                0.2             7
#>  6:    1    Steve    IT            6                0.1             0
#>  7:    2    Aaron Sales            5                0.2             4
#>  8:    2    Susie Sales            3                0.3             6
#>  9:    2      Sam Sales            4                0.1             2
#> 10:    2     Emma    IT            2                0.2             7
#> 11:    2 Jennifer    IT            3                0.2             3
#> 12:    2    Steve    IT            6                0.1             1
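
One way to sanity-check results like these is to compare them with a slow but straightforward baseline that applies the question's `rmultinom` logic to each data frame in turn. The sketch below uses base R only. The helper name `simulate_one` and the three-element list are assumptions made for illustration, and the sketch assumes every participant has at least one eligible teammate with a positive weight (otherwise `rmultinom` raises an error).

```r
# Hypothetical baseline: simulate gifts received for a single data frame
simulate_one <- function(df) {
  received <- integer(nrow(df))
  for (i in seq_len(nrow(df))) {
    w <- df$gifts_received_pct
    # giver i cannot give to another team or to themselves
    w[df$team != df$team[i] | seq_len(nrow(df)) == i] <- 0
    received <- received + rmultinom(1, df$giftsBrought[i], w)[, 1]
  }
  received
}

# the example data from the question
giftsDF <- data.frame(
  name = c('Aaron', 'Susie', 'Sam', 'Emma', 'Jennifer', 'Steve'),
  team = c('Sales', 'Sales', 'Sales', 'IT', 'IT', 'IT'),
  giftsBrought = c(5, 3, 4, 2, 3, 6),
  gifts_received_pct = c(.2, .3, .1, .2, .2, .1),
  stringsAsFactors = FALSE
)

set.seed(1)
# a short list for illustration; the real list would hold 1000 data frames
baseline <- lapply(list(giftsDF, giftsDF, giftsDF), simulate_one)

# within each team, gifts received must equal gifts brought
conserved <- vapply(baseline, function(r) {
  all(tapply(r, giftsDF$team, sum) == tapply(giftsDF$giftsBrought, giftsDF$team, sum))
}, logical(1))
all(conserved)
```

The same conservation check applies to the `giftsReceived` column returned by `f`: in the first iteration shown above, Sales brought 5 + 3 + 4 = 12 gifts and received 3 + 7 + 2 = 12.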
