spark-submit and .py files

tzxcd3kk · published 2021-05-16 in Spark
Follow (0) | Answers (0) | Views (325)

All,

I am new to big data. I have installed a 3-node Apache Spark cluster on my Dell Alienware desktop running Ubuntu 18.04. I do not have YARN set up, and I named the nodes as follows:

sparkmaster (master)
sparkworker1 (slave)
sparkworker2 (slave)

I also installed Anaconda on sparkmaster because I want to work in Jupyter notebooks - but I am not sure whether Anaconda is required - could I install Jupyter from pip3 instead?

The code below runs fine interactively in a Jupyter notebook, and I wanted to see whether it also works when I submit it as a job.

from pyspark.sql import SQLContext
from pyspark.sql.types import *

# `sc` is the SparkContext that the PySpark shell / Jupyter kernel predefines
sqlContext = SQLContext(sc)

df = sqlContext.read.load('/home/grajee/twitter/US_Politicians_Twitter.csv',
                      format='com.databricks.spark.csv',
                      header='true',
                      inferSchema='true')

df.write.csv('/home/grajee/twitter/US_Politicians_loaded.csv')
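One likely reason notebook code fails under spark-submit is that `sc` only exists because the shell/notebook session creates it for you; a submitted script has to build its own entry point. Below is a minimal self-contained sketch, assuming Spark 2.x or later (where `SparkSession` supersedes `SQLContext` and CSV is read natively, so the `com.databricks.spark.csv` package is no longer needed); the paths are the ones from the question, and the app name is made up:

```python
# standalone_pytest.py -- sketch of the notebook code rewritten for spark-submit
from pyspark.sql import SparkSession

if __name__ == "__main__":
    # Build our own session instead of relying on a predefined `sc`
    spark = (SparkSession.builder
             .appName("us-politicians-load")   # hypothetical app name
             .getOrCreate())

    df = (spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv("/home/grajee/twitter/US_Politicians_Twitter.csv"))

    df.write.csv("/home/grajee/twitter/US_Politicians_loaded.csv")
    spark.stop()
```

It could then be submitted against the standalone master, e.g. `spark-submit --master spark://sparkmaster:7077 standalone_pytest.py` (7077 is the standalone master's default port; adjust the host name to your setup).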

So I ran the command "spark-submit pytest.py", which resulted in the error below.

(base) grajee@SparkMaster:~/pyscript$ spark-submit pytest.py
20/11/27 16:43:41 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Traceback (most recent call last):
  File "/home/grajee/anaconda3/bin/jupyter", line 11, in <module>
    sys.exit(main())
  File "/home/grajee/anaconda3/lib/python3.8/site-packages/jupyter_core/command.py", line 247, in main
    command = _jupyter_abspath(subcommand)
  File "/home/grajee/anaconda3/lib/python3.8/site-packages/jupyter_core/command.py", line 133, in _jupyter_abspath
    raise Exception(
Exception: Jupyter command `jupyter-/home/grajee/pyscript/pytest.py` not found.
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

(base) grajee@SparkMaster:~/pyscript$ pwd
/home/grajee/pyscript

(base) grajee@SparkMaster:~/pyscript$ ls -l pytest.py
-rw-r--r-- 1 root root 374 Nov 27 14:58 pytest.py
(base) grajee@SparkMaster:~/pyscript$

I have a few questions:

[1] When I launch a Jupyter notebook, how do I make sure it runs against the standalone cluster rather than in local mode? When I mistakenly tried to start a second SparkContext, I got the error listed below. It seems to indicate that it is running in local mode when I expected it to run in standalone mode.

Cannot run multiple SparkContexts at once; existing SparkContext(app=PySparkShell, master=local[*]) created at /home/grajee/anaconda3/lib/python3.8/site-packages/IPython/utils/py3compat.py:168

[2] Why do I get the exception below?

Exception: Jupyter command `jupyter-/home/grajee/pyscript/pytest.py` not found.

[3] Would it be better to uninstall Anaconda and replace it with "pip3 install jupyter"?
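On [2]: the traceback shows spark-submit handing the .py file to the Jupyter launcher, which typically happens when the environment routes the PySpark driver through Jupyter (PYSPARK_DRIVER_PYTHON=jupyter, often together with PYSPARK_DRIVER_PYTHON_OPTS=notebook) so that `pyspark` opens a notebook. Whether those variables are actually set in this shell is an assumption, but the Jupyter launcher in the traceback points that way. A small stdlib-only check:

```python
import os

def pyspark_driver_conflict(env):
    """Return True if the environment routes the PySpark driver through
    Jupyter, which makes spark-submit hand the .py file to `jupyter`
    instead of running it with a plain Python interpreter."""
    return env.get("PYSPARK_DRIVER_PYTHON", "").endswith("jupyter")

# An environment configured for notebook use breaks spark-submit:
notebook_env = {"PYSPARK_DRIVER_PYTHON": "jupyter",
                "PYSPARK_DRIVER_PYTHON_OPTS": "notebook"}
print(pyspark_driver_conflict(notebook_env))   # -> True

# Removing the variable restores the normal Python driver:
clean_env = dict(os.environ)
clean_env.pop("PYSPARK_DRIVER_PYTHON", None)
print(pyspark_driver_conflict(clean_env))      # -> False
```

If the check is True, unsetting the variables in the shell used for submission (e.g. `unset PYSPARK_DRIVER_PYTHON PYSPARK_DRIVER_PYTHON_OPTS`) should let spark-submit run the script directly, while the notebook can keep its own environment.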

