I managed to set up Spark locally on macOS 10.15.7 for one of my PyCharm projects (call it projectA). However, I cannot start Spark in another PyCharm project (projectB), even though I set that project up with the same interpreter as projectA.
In the projectB environment I can apparently create a Spark session: when I go to http://localhost:4040/ I can see that a Spark session has started. However, as soon as I start executing commands I get a message like the following:
Exception: Python in worker has different version 2.7 than that in driver 3.7, PySpark cannot run with different minor versions. Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.
When I call pyspark in projectB, I get the error message below, even though running the same command from projectA and from the macOS Terminal starts Spark just fine:
macbook:projectB byc$ pyspark
Could not find valid SPARK_HOME while searching ['/Users/byc/PycharmProjects', '/Library/Frameworks/Python.framework/Versions/3.7/bin']
Did you install PySpark via a package manager such as pip or Conda? If so,
PySpark was not found in your Python environment. It is possible your
Python environment does not properly bind with your package manager.
Please check your default 'python' and if you set PYSPARK_PYTHON and/or
PYSPARK_DRIVER_PYTHON environment variables, and see if you can import
PySpark, for example, 'python -c 'import pyspark'.
If you cannot import, you can install by using the Python executable directly,
for example, 'python -m pip install pyspark [--user]'. Otherwise, you can also
explicitly set the Python executable, that has PySpark installed, to
PYSPARK_PYTHON or PYSPARK_DRIVER_PYTHON environment variables, for example,
'PYSPARK_PYTHON=python3 pyspark'.
/Library/Frameworks/Python.framework/Versions/3.7/bin/pyspark: line 24: /bin/load-spark-env.sh: No such file or directory
/Library/Frameworks/Python.framework/Versions/3.7/bin/pyspark: line 68: /bin/spark-submit: No such file or directory
/Library/Frameworks/Python.framework/Versions/3.7/bin/pyspark: line 68: exec: /bin/spark-submit: cannot execute: No such file or directory
After browsing various posts here, I added the following environment variables:
PYTHONUNBUFFERED=1
PYSPARK_PYTHON=/Download/spark-3.0.1-bin-hadoop2.7
PYSPARK_DRIVER_PYTHON=/Download/spark-3.0.1-bin-hadoop2.7
SPARK_HOME=/usr/local/Cellar/apache-spark/3.0.1/libexec
PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.9-src.zip:$PYTHONPATH
I then closed projectB in PyCharm, reopened it, and ran the command again. Still no luck.
I'm sure I'm missing something obvious, but I just can't figure out what it is! Any pointers would be much appreciated!
2 Answers
Answer 1
Try installing PySpark with pip. If you are using it locally, I would also suggest setting the following:
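A minimal sketch of the pip route, assuming projectB's interpreter is the Python 3.7 from the question and pinning PySpark to match the spark-3.0.1 download mentioned there (both assumptions, not part of the original answer):

```shell
# Install PySpark into the interpreter projectB uses
# (version pin assumed to match the spark-3.0.1 download in the question).
python3 -m pip install pyspark==3.0.1

# Verify that the same interpreter can actually import it,
# as the error message above suggests checking.
python3 -c 'import pyspark; print(pyspark.__version__)'
```

If the import check fails here, the "Could not find valid SPARK_HOME" error is coming from an interpreter that simply has no PySpark installed.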
Answer 2
Try SPARK_HOME=/Download/spark-3.0.1-bin-hadoop2.7. There is no need to set PYTHONPATH. If setting SPARK_HOME does not work, you may also need to point PYSPARK_PYTHON at the correct path to a Python executable; the path you provided does not look right. It would be something like PYSPARK_PYTHON=/usr/bin/python3.
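Putting this advice together, the environment might be set like this (a sketch only; the exact paths are assumptions based on the question and should be adjusted to wherever Spark and Python actually live on your machine):

```shell
# SPARK_HOME points at the unpacked Spark distribution.
export SPARK_HOME=/Download/spark-3.0.1-bin-hadoop2.7

# PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON must be Python interpreters,
# not Spark directories. Pointing both at the same 3.7 interpreter also
# avoids the worker/driver version-mismatch error from the question.
export PYSPARK_PYTHON=/Library/Frameworks/Python.framework/Versions/3.7/bin/python3
export PYSPARK_DRIVER_PYTHON="$PYSPARK_PYTHON"
```

Note that the question's original settings pointed PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON at a Spark directory rather than a Python executable, which is why neither variable took effect.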