I am trying to create a Spark connection with the following parameters:
library(sparklyr)
conf <- spark_config()
conf$`sparklyr.cores.local` <- 6
conf$`sparklyr.shell.driver-memory` <- "16G"
conf$`spark.executor.cores` <- 2
conf$`spark.executor.memory` <- "2G"
conf$`sparklyr.verbose` <- TRUE
conf$`sparklyr.log.console` <- TRUE
conf$`spark.executor.instances` <- 4
conf$`spark.dynamicAllocation.enabled` <- FALSE
sc <- spark_connect(master = "local", config = conf, log = "console", version = "3.0.0")
It does connect, and spark_session_config(sc) correctly reports:
$spark.executor.instances
[1] "4"
$spark.executor.cores
[1] "2"
$spark.driver.memory
[1] "16G"
$spark.master
[1] "local[16]"
$spark.sql.shuffle.partitions
[1] "16"
$spark.sql.legacy.utcTimestampFunc.enabled
[1] "true"
$spark.dynamicAllocation.enabled
[1] "false"
$spark.driver.port
[1] "65404"
$spark.submit.deployMode
[1] "client"
$spark.executor.id
[1] "driver"
$spark.jars
[1] "file:/C:/Users/B2623385/Documents/R/win-library/3.6/sparklyr/java/sparklyr-3.0-2.12.jar"
$spark.submit.pyFiles
[1] ""
$spark.app.id
[1] "local-1600432415127"
$spark.env.SPARK_LOCAL_IP
[1] "127.0.0.1"
$spark.sql.catalogImplementation
[1] "hive"
$spark.executor.memory
[1] "2G"
$spark.spark.port.maxRetries
[1] "128"
$spark.app.name
[1] "sparklyr"
$spark.home
[1] "C:\\Users\\B2623385\\AppData\\Local\\spark\\spark-3.0.0-bin-hadoop2.7"
$spark.driver.host
[1] "127.0.0.1"
However, when I go to http://127.0.0.1:4040/executors/, it shows only the driver executor running:
I have already tried switching Spark versions and also connecting with a bare-bones configuration (see the sketch below), but I keep hitting the same problem. What am I missing?
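For reference, the bare-bones attempt was essentially just the default local connection with no custom config, something like this (a sketch, using the same installed Spark version):

library(sparklyr)

# Default local connection, no custom spark_config(), to rule out my settings
sc <- spark_connect(master = "local", version = "3.0.0")

# Inspect what the session actually picked up
spark_session_config(sc)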
My ultimate goal is to copy_to() a data.frame into the Spark connection. When I try that, R just keeps running, while http://127.0.0.1:4040/executors/ appears to show no activity.
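For completeness, the copy_to() call looks roughly like this (mtcars is only a stand-in here for my actual data.frame):

library(sparklyr)

# Copy a local data.frame into the Spark connection as a Spark table
tbl <- copy_to(sc, mtcars, name = "mtcars_spark", overwrite = TRUE)

# This is where R keeps running while the executors page shows no activity
head(tbl)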