Why does a job submitted through Spark 2.0.0's spark-submit run under Spark 1.5.2?

Asked by qgelzfjb on 2021-06-26 in Mesos

I am trying to upgrade from Spark 1.5.2 to Spark 2.0.0, testing on two machines (node3 and node7). I submit the job through spark-2.0.0's spark-submit, but the job runs under Spark 1.5.2.
Submitting the job from node3 produced an error:

~/software/spark-2.0.0-bin-hadoop2.6/bin$ spark-submit --master mesos://192.168.1.203:5050  ../examples/src/main/python/pimy.py
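Note that spark-submit above is invoked without an explicit ./ prefix, so the shell resolves it through PATH rather than the current directory. A minimal sketch (not from the original post) for checking which installation actually handles the submission:

# show which spark-submit PATH resolves to, and what each copy reports
which spark-submit
echo $SPARK_HOME
~/software/spark-2.0.0-bin-hadoop2.6/bin/spark-submit --version
~/software/spark-1.5.2-bin-hadoop2.6/bin/spark-submit --version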

Mesos executor stderr log on node7:

sh: 1: /home/jianxun/software/spark-1.5.2-bin-hadoop2.6/bin/spark-class: not found
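The path in this message is the 1.5.2 spark-class, which suggests the launch command handed to the Mesos executor was generated against the old installation. A hedged sketch for finding that command in the agent sandboxes on node7, assuming the agent's default work_dir of /tmp/mesos (adjust if --work_dir was set):

# search executor sandboxes for launch commands that reference spark-class
grep -r "spark-class" /tmp/mesos/slaves/ 2>/dev/null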

node3 JDK:

openjdk version "1.8.0_91"
OpenJDK Runtime Environment (build 1.8.0_91-8u91-b14-3ubuntu1~15.10.1-b14)
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)

node3 /etc/profile:

export M2_HOME=/usr/share/maven
export M2=$M2_HOME/bin
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$JAVA_HOME/bin:$PATH
export PATH=/home/jianxun/software/mongodb-linux-x86_64-3.2.0/bin:$PATH
export HIVE_HOME=/home/jianxun/software/apache-hive-2.0.1-bin
export PATH=$HIVE_HOME/bin:$PATH
export CLASSPATH=$CLASSPATH:/usr/share/java/mysql.jar
export SPARK_HOME=/home/jianxun/software/spark-2.0.0-bin-hadoop2.6

node7 JDK:

openjdk version "1.8.0_91"
OpenJDK Runtime Environment (build 1.8.0_91-8u91-b14-3ubuntu1~15.10.1-b14)
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)

node7 /etc/profile:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$JAVA_HOME/bin:$PATH
export SPARK_HOME=/home/jianxun/software/spark-2.0.0-bin-hadoop2.6
export PYTHONPATH=/usr/lib/python2.7

The Mesos version is 0.25; the Mesos master is node3, and the only Mesos slave is node7. node3 has both Spark versions:
~/software/spark-2.0.0-bin-hadoop2.6/
~/software/spark-1.5.2-bin-hadoop2.6/
Spark configuration on node3:
spark-env.sh:

export MESOS_NATIVE_JAVA_LIBRARY=/home/jianxun/software/mesos/lib/libmesos-0.25.0.so
export SCALA_HOME=/usr/share/scala-2.11
export SPARK_EXCUTOR_URI=/home/jianxun/software/spark-2.0.0-bin-hadoop2.6.tgz
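
For reference, the Spark-on-Mesos documentation spells this variable SPARK_EXECUTOR_URI, so the spelling in the line above is worth double-checking. A minimal sketch of the documented form, keeping the path from the original:

# SPARK_EXECUTOR_URI (not SPARK_EXCUTOR_URI) is the name Spark's Mesos docs use
export SPARK_EXECUTOR_URI=/home/jianxun/software/spark-2.0.0-bin-hadoop2.6.tgz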

spark-defaults.conf:

spark.local.dir                    /data/sparktmp
spark.shuffle.service.enabled      true
spark.mesos.coarse                 true
spark.executor.memory              24g
spark.executor.cores               7
spark.cores.max                    7
spark.executor.uri                 /home/jianxun/software/spark-2.0.0-bin-hadoop2.6.tgz
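
According to the Spark-on-Mesos documentation, spark.executor.uri should be a URI the Mesos agent can fetch (e.g. http://, hdfs://, or a path that exists on the agent itself; node7 does appear to have the tarball at the same local path). A hedged sketch of a variant that avoids depending on identical local paths (the HDFS location is hypothetical):

# hypothetical: serve the Spark 2.0.0 tarball from HDFS so every agent fetches the same copy
spark.executor.uri                 hdfs://node3:9000/spark/spark-2.0.0-bin-hadoop2.6.tgz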

node7 has only the new Spark version:
~/software/spark-2.0.0-bin-hadoop2.6/
~/software/spark-2.0.0-bin-hadoop2.6.tgz (the binary tarball)
spark-submit log (the important part is set off with ****):


*********************************************************
*********************************************************

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/lib/spark-examples-1.5.2-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/lib/spark-assembly-1.5.2-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/08/03 12:31:33 INFO SparkContext: Running Spark version 1.5.2

*****************************************************************
*****************************************************************

16/08/03 12:31:34 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/08/03 12:31:34 WARN SparkConf: In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
16/08/03 12:31:34 INFO SecurityManager: Changing view acls to: jianxun
16/08/03 12:31:34 INFO SecurityManager: Changing modify acls to: jianxun
16/08/03 12:31:34 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jianxun); users with modify permissions: Set(jianxun)
16/08/03 12:31:34 INFO Slf4jLogger: Slf4jLogger started
16/08/03 12:31:34 INFO Remoting: Starting remoting
16/08/03 12:31:34 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.1.203:40978]
16/08/03 12:31:34 INFO Utils: Successfully started service 'sparkDriver' on port 40978.
16/08/03 12:31:34 INFO SparkEnv: Registering MapOutputTracker
16/08/03 12:31:34 INFO SparkEnv: Registering BlockManagerMaster
16/08/03 12:31:34 INFO DiskBlockManager: Created local directory at /data/sparktmp/blockmgr-76944d0c-de18-4f52-9249-8c3ca6141f59
16/08/03 12:31:34 INFO MemoryStore: MemoryStore started with capacity 12.4 GB
16/08/03 12:31:34 INFO HttpFileServer: HTTP File server directory is /data/sparktmp/spark-eba79d72-dd11-4d5d-a008-9964522fcc24/httpd-a64948d7-9e78-42f0-b711-84fc5f040517
16/08/03 12:31:34 INFO HttpServer: Starting HTTP Server
16/08/03 12:31:35 INFO Utils: Successfully started service 'HTTP file server' on port 35616.
16/08/03 12:31:35 INFO SparkEnv: Registering OutputCommitCoordinator
16/08/03 12:31:35 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/08/03 12:31:35 INFO SparkUI: Started SparkUI at http://192.168.1.203:4040
16/08/03 12:31:35 INFO Utils: Copying /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py to /data/sparktmp/spark-eba79d72-dd11-4d5d-a008-9964522fcc24/userFiles-03a46142-7a44-43d0-82de-10c174721a99/pimy.py
16/08/03 12:31:35 INFO SparkContext: Added file file:/home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py at http://192.168.1.203:35616/files/pimy.py with timestamp 1470198695252
16/08/03 12:31:35 WARN SparkContext: Using SPARK_MEM to set amount of memory to use per executor process is deprecated, please use spark.executor.memory instead.
16/08/03 12:31:35 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
I0803 12:31:35.419636 32575 sched.cpp:164] Version: 0.25.0
I0803 12:31:35.430359 32570 sched.cpp:262] New master detected at master@192.168.1.203:5050
I0803 12:31:35.431447 32570 sched.cpp:272] No credentials provided. Attempting to register without authentication
I0803 12:31:35.434844 32570 sched.cpp:641] Framework registered with ff2cf87e-3712-413f-a452-6d71430527bc-0012
16/08/03 12:31:35 INFO MesosSchedulerBackend: Registered as framework ID ff2cf87e-3712-413f-a452-6d71430527bc-0012
16/08/03 12:31:35 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41218.
16/08/03 12:31:35 INFO NettyBlockTransferService: Server created on 41218
16/08/03 12:31:35 INFO BlockManagerMaster: Trying to register BlockManager
16/08/03 12:31:35 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.1.203:41218 with 12.4 GB RAM, BlockManagerId(driver, 192.168.1.203, 41218)
16/08/03 12:31:35 INFO BlockManagerMaster: Registered BlockManager
16/08/03 12:31:36 INFO SparkContext: Starting job: reduce at /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py:38
16/08/03 12:31:36 INFO DAGScheduler: Got job 0 (reduce at /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py:38) with 2 output partitions
16/08/03 12:31:36 INFO DAGScheduler: Final stage: ResultStage 0(reduce at /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py:38)
16/08/03 12:31:36 INFO DAGScheduler: Parents of final stage: List()
16/08/03 12:31:36 INFO DAGScheduler: Missing parents: List()
16/08/03 12:31:36 INFO DAGScheduler: Submitting ResultStage 0 (PythonRDD[1] at reduce at /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py:38), which has no missing parents
16/08/03 12:31:36 INFO MemoryStore: ensureFreeSpace(4272) called with curMem=0, maxMem=13335873454
16/08/03 12:31:36 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 4.2 KB, free 12.4 GB)
16/08/03 12:31:36 INFO MemoryStore: ensureFreeSpace(2792) called with curMem=4272, maxMem=13335873454
....
....
16/08/03 12:31:37 INFO DAGScheduler: Job 0 failed: reduce at /home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py:38, took 1.002633 s
Traceback (most recent call last):
  File "/home/jianxun/software/spark-2.0.0-bin-hadoop2.6/./examples/src/main/python/pimy.py", line 38, in <module>
    count = sc.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
  File "/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 799, in reduce
  File "/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 773, in collect
  File "/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
  File "/home/jianxun/software/spark-1.5.2-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
16/08/03 12:31:37 INFO DAGScheduler: Executor lost: ff2cf87e-3712-413f-a452-6d71430527bc-S4 (epoch 3)
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 7, node7): ExecutorLostFailure (executor ff2cf87e-3712-413f-a452-6d71430527bc-S4 lost)
Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1850)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1921)
        at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:909)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
        at org.apache.spark.rdd.RDD.collect(RDD.scala:908)
        at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:405)
        at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
        at py4j.Gateway.invoke(Gateway.java:259)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:207)
        at java.lang.Thread.run(Thread.java:745)
16/08/03 12:31:37 INFO BlockManagerMasterEndpoint: Trying to remove executor ff2cf87e-3712-413f-a452-6d71430527bc-S4 from BlockManagerMaster.
16/08/03 12:31:37 INFO BlockManagerMaster: Removed ff2cf87e-3712-413f-a452-6d71430527bc-S4 successfully in removeExecutor
16/08/03 12:31:37 INFO DAGScheduler: Host added was in lost list earlier: node7
16/08/03 12:31:37 INFO SparkContext: Invoking stop() from shutdown hook
16/08/03 12:31:37 INFO SparkUI: Stopped Spark web UI at http://192.168.1.203:4040
16/08/03 12:31:37 INFO DAGScheduler: Stopping DAGScheduler
I0803 12:31:37.146209 32592 sched.cpp:1771] Asked to stop the driver
I0803 12:31:37.146414 32573 sched.cpp:1040] Stopping framework 'ff2cf87e-3712-413f-a452-6d71430527bc-0012'
16/08/03 12:31:37 INFO MesosSchedulerBackend: driver.run() returned with code DRIVER_STOPPED
16/08/03 12:31:37 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/08/03 12:31:37 INFO MemoryStore: MemoryStore cleared
16/08/03 12:31:37 INFO BlockManager: BlockManager stopped
16/08/03 12:31:37 INFO BlockManagerMaster: BlockManagerMaster stopped
16/08/03 12:31:37 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/08/03 12:31:37 INFO SparkContext: Successfully stopped SparkContext
16/08/03 12:31:37 INFO ShutdownHookManager: Shutdown hook called
16/08/03 12:31:37 INFO ShutdownHookManager: Deleting directory /data/sparktmp/spark-eba79d72-dd11-4d5d-a008-9964522fcc24/pyspark-02048aa7-deaf-4af5-adde-86732cd44324
16/08/03 12:31:37 INFO ShutdownHookManager: Deleting directory /data/sparktmp/spark-eba79d72-dd11-4d5d-a008-9964522fcc24

The Mesos warning log on node7:

Log file created at: 2016/08/03 12:31:36
Running on machine: node7
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
W0803 12:31:36.408701  5686 containerizer.cpp:988] Ignoring update for unknown container: 9910a15a-ec96-4e5a-91b9-58652b2bcaa5
W0803 12:31:36.409050  5686 containerizer.cpp:988] Ignoring update for unknown container: 9910a15a-ec96-4e5a-91b9-58652b2bcaa5
W0803 12:31:36.613108  5687 containerizer.cpp:988] Ignoring update for unknown container: 108436bb-429b-4214-9d9b-9fa452383093
W0803 12:31:36.613817  5691 containerizer.cpp:988] Ignoring update for unknown container: 108436bb-429b-4214-9d9b-9fa452383093
W0803 12:31:36.807909  5692 containerizer.cpp:988] Ignoring update for unknown container: 5c9abbdb-ee6a-4175-8087-d6d1dd1bd5ea
W0803 12:31:36.808281  5692 containerizer.cpp:988] Ignoring update for unknown container: 5c9abbdb-ee6a-4175-8087-d6d1dd1bd5ea
W0803 12:31:37.019579  5687 containerizer.cpp:988] Ignoring update for unknown container: 7a11174e-7774-453c-bdf7-5cbb5b4afcfa
W0803 12:31:37.020051  5693 containerizer.cpp:988] Ignoring update for unknown container: 7a11174e-7774-453c-bdf7-5cbb5b4afcfa
W0803 12:31:37.142438  5690 slave.cpp:1995] Cannot shut down unknown framework ff2cf87e-3712-413f-a452-6d71430527bc-0012
