Unable to initialize main class org.apache.spark.deploy.SparkSubmit when trying to run pyspark

vybvopom · posted 2021-05-24 in Spark

I have a conda install with Python 3.7:

$python3 --version
Python 3.7.6

pyspark was installed via `pip3 install` (conda has no native package for it).

$conda list | grep pyspark
pyspark                   2.4.5                    pypi_0    pypi

Here is what pip3 tells me:

$pip3 install pyspark
Requirement already satisfied: pyspark in ./miniconda3/lib/python3.7/site-packages (2.4.5)
Requirement already satisfied: py4j==0.10.7 in ./miniconda3/lib/python3.7/site-packages (from pyspark) (0.10.7)
`jdk 11` is installed:

$java -version
openjdk version "11.0.2" 2019-01-15
OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)
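(The Java version is worth noting here: as far as the official docs state, Spark 2.4.x supports Java 8, with Java 11 support only arriving in Spark 3.0, so a JDK 11 runtime is a plausible reason for the JVM dying at startup below. As an illustrative aside, a small helper can extract the major version from the `java -version` banner, since the legacy `1.8.x` scheme and the newer `11.x` scheme differ; the function name is just a placeholder.)

```python
def java_major(version_string):
    """Return the Java major version from a `java -version` banner string.

    Handles both the legacy numbering scheme ("1.8.0_252" -> 8)
    and the current scheme ("11.0.2" -> 11).
    """
    parts = version_string.split(".")
    # Legacy JDKs (<= 8) report "1.<major>.<minor>"; newer ones lead with the major.
    return int(parts[1]) if parts[0] == "1" else int(parts[0])

print(java_major("11.0.2"))     # 11
print(java_major("1.8.0_252"))  # 8
```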

Trying to `import pyspark` does not go well, however. Here is a small test program:

from pyspark.sql import SparkSession
import os, sys

def setupSpark():
    os.environ["PYSPARK_SUBMIT_ARGS"] = "pyspark-shell"
    spark = SparkSession.builder.appName("myapp").master("local").getOrCreate()
    return spark

sp = setupSpark()
df = sp.createDataFrame({'a':[1,2,3],'b':[4,5,6]})
df.show()
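(A side note on the test program itself: as far as I can tell, `createDataFrame` in PySpark 2.4 accepts an RDD, a list of rows/tuples, or a pandas DataFrame, but not a plain dict of columns, so the call above would likely raise a TypeError even once the JVM problem is fixed. A sketch of transposing the column dict into rows, using only the data from the program above:)

```python
data = {'a': [1, 2, 3], 'b': [4, 5, 6]}

# Transpose the column dict into row tuples plus a schema (column names).
rows = list(zip(*data.values()))   # [(1, 4), (2, 5), (3, 6)]
columns = list(data.keys())        # ['a', 'b']

print(rows, columns)
# Once the Spark session actually starts, this form should be accepted:
# df = sp.createDataFrame(rows, columns)
```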

The result is:

Error: Unable to initialize main class org.apache.spark.deploy.SparkSubmit, caused by: java.lang.NoClassDefFoundError: org/apache/log4j/spi/Filter

Full details:

$python3 sparktest.py
Error: Unable to initialize main class org.apache.spark.deploy.SparkSubmit
Caused by: java.lang.NoClassDefFoundError: org/apache/log4j/spi/Filter
Traceback (most recent call last):
  File "sparktest.py", line 9, in <module>
    sp = setupSpark()
  File "sparktest.py", line 6, in setupSpark
    spark = SparkSession.builder.appName("myapp").master("local").getOrCreate()
  File "/Users/steve/miniconda3/lib/python3.7/site-packages/pyspark/sql/session.py", line 173, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/Users/steve/miniconda3/lib/python3.7/site-packages/pyspark/context.py", line 367, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/Users/steve/miniconda3/lib/python3.7/site-packages/pyspark/context.py", line 133, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/Users/steve/miniconda3/lib/python3.7/site-packages/pyspark/context.py", line 316, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/Users/steve/miniconda3/lib/python3.7/site-packages/pyspark/java_gateway.py", line 46, in launch_gateway
    return _launch_gateway(conf)
  File "/Users/steve/miniconda3/lib/python3.7/site-packages/pyspark/java_gateway.py", line 108, in _launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number
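(For context on that final exception: `launch_gateway` spawns `spark-submit` as a child process and waits for it to report back the Py4J port; when the JVM dies at startup, as the `NoClassDefFoundError` above shows, nothing is ever reported, and PySpark raises the generic "exited before sending its port number" error. A toy illustration of the pattern, using a stand-in child process instead of the real `spark-submit` — the real implementation passes the port through a temp file rather than stdout:)

```python
import subprocess
import sys

def read_port_from_child(cmd):
    """Spawn a child process and read a port number from its first stdout line.

    Loosely mimics how PySpark's launch_gateway waits on the JVM: if the
    child exits without reporting a port, raise the same generic error.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    line = proc.stdout.readline().strip()
    proc.wait()
    if not line:
        raise Exception("Java gateway process exited before sending its port number")
    return int(line)

# Stand-in for a healthy JVM that prints its port on startup:
port = read_port_from_child([sys.executable, "-c", "print(25333)"])
print(port)  # 25333
```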

Any advice or pointers toward a working conda setup would be much appreciated.
