Problem installing and running pyspark

tv6aics1 posted 2021-05-27 in Spark

I apologize for asking a question I've seen asked here before, but none of the answers I've found seem to solve it. I followed the installation documentation to run pyspark on my local machine. Once that was done, I tried to run:


# Start pyspark as described in the docs
import pyspark

# SparkSession is the Spark 2+ entry point
spark = pyspark.sql.SparkSession.builder.appName('test').getOrCreate()

spark.range(10).collect()

but I keep getting the following error:

/Users/usr123/opt/anaconda3/lib/python3.7/site-packages/pyspark/bin/spark-class: line 71: /usr/bin/java/bin/java: Not a directory
Traceback (most recent call last):
  File "test.py", line 5, in <module>
    spark = pyspark.sql.SparkSession.builder.appName('test').getOrCreate()
  File "/Users/usr123/opt/anaconda3/lib/python3.7/site-packages/pyspark/sql/session.py", line 173, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/Users/usr123/opt/anaconda3/lib/python3.7/site-packages/pyspark/context.py", line 349, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/Users/usr123/opt/anaconda3/lib/python3.7/site-packages/pyspark/context.py", line 115, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/Users/usr123/opt/anaconda3/lib/python3.7/site-packages/pyspark/context.py", line 298, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/Users/usr123/opt/anaconda3/lib/python3.7/site-packages/pyspark/java_gateway.py", line 94, in launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number

Has anyone found a good way to fix this? Am I missing something obvious?

k3bvogb1 1#

We ran into a similar problem; for us, downgrading Python to 3.6 resolved it. In our case it looked like an incompatibility with the conda environment. Which version of Spark are you trying to run? It can make a substantial difference whether it is 2.1, 2.3, or 2.4.
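One other thing worth checking, based on the first line of your traceback: spark-class launches "$JAVA_HOME/bin/java", so the failing path /usr/bin/java/bin/java suggests JAVA_HOME is set to the java binary itself rather than to a JDK home directory. Below is a minimal sketch of verifying and repairing that from Python before creating the session; the /usr/libexec/java_home call is a macOS-specific assumption about your setup.

import os
import subprocess

# JAVA_HOME must point at the JDK home directory, not the java binary:
# spark-class runs "$JAVA_HOME/bin/java", which is why the traceback shows
# the impossible path /usr/bin/java/bin/java.
java_home = os.environ.get("JAVA_HOME")

if not java_home or not os.path.isdir(os.path.join(java_home, "bin")):
    # macOS-specific assumption: /usr/libexec/java_home prints the active
    # JDK home; substitute the right path on other platforms.
    java_home = subprocess.check_output(
        ["/usr/libexec/java_home"], universal_newlines=True).strip()
    os.environ["JAVA_HOME"] = java_home

import pyspark

spark = pyspark.sql.SparkSession.builder.appName('test').getOrCreate()
print(spark.range(10).collect())

Setting JAVA_HOME in your shell profile (or in the conda environment's activation script) works just as well and avoids doing it in code.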
