Can't run PySpark in a different PyCharm project

ijxebb2r asked on 2021-05-16 in Spark

I managed to set up Spark locally on macOS 10.15.7 for one of my PyCharm projects (call it projectA). However, I cannot get Spark to start in another PyCharm project (projectB), even though I set it up with the same interpreter as projectA.
In the projectB environment I can apparently create a Spark session: when I go to http://localhost:4040/ a Spark UI is up and running. But as soon as I start executing commands, I get messages like the following:

Exception: Python in worker has different version 2.7 than that in driver 3.7, PySpark cannot run with different minor versions. Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.
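
The mismatch means the executors are launching Python 2.7 while the driver runs 3.7. One way to rule this out is to pin both sides to the interpreter that runs the script, before the session is built. A minimal sketch (the app name is arbitrary):

import os
import sys

# Point both the driver and the workers at the interpreter running this
# script, so the minor versions always match.
os.environ["PYSPARK_PYTHON"] = sys.executable
os.environ["PYSPARK_DRIVER_PYTHON"] = sys.executable

from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("projectB-check").getOrCreate()
print(spark.range(5).count())  # should print 5 if the workers start correctly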

When I invoke pyspark from projectB, I get the error message below, even though running the same command from projectA and from the MacBook's Terminal brings Spark up just fine.

macbook:projectB byc$ pyspark
Could not find valid SPARK_HOME while searching ['/Users/byc/PycharmProjects', '/Library/Frameworks/Python.framework/Versions/3.7/bin']

Did you install PySpark via a package manager such as pip or Conda? If so,
PySpark was not found in your Python environment. It is possible your
Python environment does not properly bind with your package manager.

Please check your default 'python' and if you set PYSPARK_PYTHON and/or
PYSPARK_DRIVER_PYTHON environment variables, and see if you can import
PySpark, for example, 'python -c 'import pyspark'.

If you cannot import, you can install by using the Python executable directly,
for example, 'python -m pip install pyspark [--user]'. Otherwise, you can also
explicitly set the Python executable, that has PySpark installed, to
PYSPARK_PYTHON or PYSPARK_DRIVER_PYTHON environment variables, for example,
'PYSPARK_PYTHON=python3 pyspark'.

/Library/Frameworks/Python.framework/Versions/3.7/bin/pyspark: line 24: /bin/load-spark-env.sh: No such file or directory
/Library/Frameworks/Python.framework/Versions/3.7/bin/pyspark: line 68: /bin/spark-submit: No such file or directory
/Library/Frameworks/Python.framework/Versions/3.7/bin/pyspark: line 68: exec: /bin/spark-submit: cannot execute: No such file or directory
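
The wrapper the shell finds here appears to be the pip-installed launcher under the Python 3.7 bin directory, and it fails because SPARK_HOME does not resolve to a real Spark install. A quick throwaway check (my own sketch) of which interpreter and which PySpark package are actually being picked up:

import sys
import pyspark

print(sys.executable)     # the interpreter behind the failing wrapper
print(pyspark.__file__)   # where the pyspark package actually lives
print(pyspark.__version__)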

After browsing various posts on here, I added these environment variables:

PYTHONUNBUFFERED=1
PYSPARK_PYTHON=/Download/spark-3.0.1-bin-hadoop2.7
PYSPARK_DRIVER_PYTHON=/Download/spark-3.0.1-bin-hadoop2.7
SPARK_HOME=/usr/local/Cellar/apache-spark/3.0.1/libexec
PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.9-src.zip:$PYTHONPATH
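
As a sanity check (my own throwaway snippet, not from any of the posts), printing the variables from inside projectB shows what the PyCharm run configuration actually passes to the interpreter:

import os

# Show the Spark-related variables exactly as the interpreter sees them.
for name in ("SPARK_HOME", "PYSPARK_PYTHON", "PYSPARK_DRIVER_PYTHON", "PYTHONPATH"):
    print(name, "=", os.environ.get(name))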

I closed projectB in PyCharm, reopened it, and ran the commands again. Still no luck.
I'm sure I'm missing something obvious, but I just can't work out what it is. Any pointers would be much appreciated!

sycxhyv7 (answer 1)

Try installing PySpark with the pip installer:

pip install pyspark==2.4.7

If you are using it locally, I would also suggest setting the following:

export SPARK_LOCAL_IP="127.0.0.1"
export PYSPARK_PYTHON=python3.6
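
A quick smoke test after the install, assuming a plain local-mode session (sketch only):

from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").getOrCreate()
print(spark.version)  # should report 2.4.7 if the pip-installed copy is in use
spark.stop()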
sy5wg1nm (answer 2)

Try SPARK_HOME=/Download/spark-3.0.1-bin-hadoop2.7 .
There is no need to set PYTHONPATH .
If setting SPARK_HOME does not work, you may also need to put the correct path of a Python executable into PYSPARK_PYTHON . The path you provided does not look right; it would probably be something like PYSPARK_PYTHON=/usr/bin/python3 .
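
If you would rather apply this from inside the script than in the run configuration, one option is the findspark helper (an extra package, pip install findspark, not mentioned in this answer). The paths below are the ones suggested above and may differ on your machine:

import os
import findspark  # extra dependency: pip install findspark

# Paths taken from this answer; adjust to where Spark and Python 3 live.
os.environ["PYSPARK_PYTHON"] = "/usr/bin/python3"
findspark.init("/Download/spark-3.0.1-bin-hadoop2.7")  # sets SPARK_HOME and fixes sys.path

from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").getOrCreate()
print(spark.version)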
