Spark 2.4.4 fails when the number of cores is declared in the SparkSession

Posted by z18hc3ub on 2021-05-27 in Spark

I'm running into a very strange problem with the resources of my Spark application. My spark-defaults.conf file looks like this:

spark.executor.memory            9486M
spark.executor.cores             4

In my application code, I declare a number of executor cores greater than the default:

from pyspark.sql import SparkSession

# app_name is defined elsewhere in main.py (not shown)
spark = SparkSession \
    .builder \
    .enableHiveSupport() \
    .config("spark.executor.memory", "8g") \
    .config("spark.executor.cores", "6") \
    .appName(app_name) \
    .getOrCreate()
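
For context, the Spark configuration documentation states that properties set directly on the SparkConf take the highest precedence, then flags passed to spark-submit, then the options in spark-defaults.conf. The sketch below only mimics that merge order in plain Python (no Spark needed) for the two settings above; `parse_defaults` and `effective_conf` are hypothetical helper names, not Spark APIs, and the parser handles only whitespace-separated `key value` lines. The same documentation also notes that some deploy-related properties may not take effect when set programmatically at runtime, so the merged result is what the code requests, not necessarily what the cluster grants.

```python
def parse_defaults(text):
    """Parse spark-defaults.conf-style lines ("key   value") into a dict."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            conf[parts[0]] = parts[1].strip()
    return conf


def effective_conf(defaults, submit_flags=None, in_code=None):
    """Merge settings lowest precedence first, so later sources win."""
    merged = dict(defaults)
    merged.update(submit_flags or {})  # spark-submit flags override defaults
    merged.update(in_code or {})       # values set in code override both
    return merged


defaults = parse_defaults("""
spark.executor.memory            9486M
spark.executor.cores             4
""")
in_code = {"spark.executor.memory": "8g", "spark.executor.cores": "6"}
result = effective_conf(defaults, in_code=in_code)
print(result)
```

Under this ordering, the values set in code (8g and 6 cores) are the ones the application asks for, which matches the observation that the failure depends on the core count passed to `.config(...)`.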

The spark-submit command looks like this:

spark-submit --master yarn --deploy-mode cluster main.py

My application fails with the following error:

LogType:stdout
Log Upload Time:Sun Aug 16 13:02:27 +0000 2020
LogLength:22066
Log Contents:
ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/mnt1/yarn/usercache/hadoop/appcache/application_1592991612264_0475/container_1592991612264_0475_01_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1159, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt1/yarn/usercache/hadoop/appcache/application_1592991612264_0475/container_1592991612264_0475_01_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 985, in send_command
    response = connection.send_command(command)
  File "/mnt1/yarn/usercache/hadoop/appcache/application_1592991612264_0475/container_1592991612264_0475_01_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1164, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
Traceback (most recent call last):
  File "main.py", line 50, in <module>
    .appName(app_name)\
  File "/mnt1/yarn/usercache/hadoop/appcache/application_1592991612264_0475/container_1592991612264_0475_01_000001/pyspark.zip/pyspark/sql/session.py", line 173, in getOrCreate
  File "/mnt1/yarn/usercache/hadoop/appcache/application_1592991612264_0475/container_1592991612264_0475_01_000001/pyspark.zip/pyspark/context.py", line 375, in getOrCreate
  File "/mnt1/yarn/usercache/hadoop/appcache/application_1592991612264_0475/container_1592991612264_0475_01_000001/pyspark.zip/pyspark/context.py", line 136, in __init__
  File "/mnt1/yarn/usercache/hadoop/appcache/application_1592991612264_0475/container_1592991612264_0475_01_000001/pyspark.zip/pyspark/context.py", line 198, in _do_init
  File "/mnt1/yarn/usercache/hadoop/appcache/application_1592991612264_0475/container_1592991612264_0475_01_000001/pyspark.zip/pyspark/context.py", line 314, in _initialize_context
  File "/mnt1/yarn/usercache/hadoop/appcache/application_1592991612264_0475/container_1592991612264_0475_01_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1525, in __call__
  File "/mnt1/yarn/usercache/hadoop/appcache/application_1592991612264_0475/container_1592991612264_0475_01_000001/py4j-0.10.7-src.zip/py4j/protocol.py", line 336, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:45417)
Traceback (most recent call last):
  File "/mnt1/yarn/usercache/hadoop/appcache/application_1592991612264_0475/container_1592991612264_0475_01_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 929, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt1/yarn/usercache/hadoop/appcache/application_1592991612264_0475/container_1592991612264_0475_01_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1067, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:45417)
Traceback (most recent call last):
  File "/mnt1/yarn/usercache/hadoop/appcache/application_1592991612264_0475/container_1592991612264_0475_01_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 929, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

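The log file is large, but this is the relevant part. Interestingly, when I reduce the number of cores to 4 or fewer, it works. Has anyone faced an issue like this? Please help.
My Spark version: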

      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.4.4
      /_/
