For loop in Databricks

bbmckpt7 posted on 2021-07-14 in Spark

I am trying to use spark_apply in Databricks on Azure to run a function containing a for loop in parallel.
My function is:

distribution <- function(sims){
  for (p in 1:100){
    increment_value <- list()
    profiles <- list()
    samples <- list()
    sample_num <- list()
    for (i in 1:length(samp_seq)){
      w <- sample(sims, size=batch)
      z <- sum(w)
      name3 <- as.character(z)
      samples[[name3]] <- data.frame(value = z)
    }
  }
}

When I pass the function to spark_apply as follows:

sdf_len(sc,1) %>%
  spark_apply(distribution)

I get the following error:

Error : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 305.0 failed 4 times, most recent failure: Lost task 0.3 in stage 305.0 (TID 297, 10.139.64.6, executor 0): java.lang.Exception: sparklyr worker rscript failure with status 255, check worker logs for details.
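
Not a confirmed fix, but a few things in the function as posted could plausibly make the worker R script exit: spark_apply passes each partition to the function as a data.frame rather than a vector, samp_seq and batch are not defined inside the function and may not exist on the workers, and a function that ends in a for loop returns NULL instead of a data frame. A minimal sketch under those assumptions is below. The names n_outer, n_inner and batch in the params list are hypothetical stand-ins for the loop bounds and batch size in the question, handed to the function through the context argument of spark_apply. The worker logs mentioned in the error should show which of these, if any, is the actual cause.

```r
# Sketch only: n_outer, n_inner and batch are hypothetical stand-ins for
# the 1:100 loop, length(samp_seq) and batch in the question.
distribution <- function(df, context) {
  # Each partition arrives as a plain data.frame, so pull the first
  # column out as a vector before sampling from it.
  sims <- df[[1]]

  results <- vector("list", context$n_outer * context$n_inner)
  k <- 1
  for (p in seq_len(context$n_outer)) {
    for (i in seq_len(context$n_inner)) {
      # Same sample() call as in the question (no replacement),
      # so batch must not exceed the number of rows in the partition.
      w <- sample(sims, size = context$batch)
      results[[k]] <- data.frame(p = p, i = i, value = sum(w))
      k <- k + 1
    }
  }

  # Return a single data.frame; a bare for loop evaluates to NULL.
  do.call(rbind, results)
}

# Local check on an ordinary data.frame, no Spark needed.
params <- list(n_outer = 3, n_inner = 4, batch = 5)
local_out <- distribution(data.frame(id = 1:10), params)

# On the cluster this might be invoked as (assumes an open connection `sc`):
#   sdf_len(sc, 10) %>% spark_apply(distribution, context = params)
```

Running the function on a local data.frame first, as in the last lines, is a cheap way to catch plain R errors before they surface as an opaque status 255 from the worker.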
