Spark job hangs on df.show() on the cluster

Asked by taor4pac on 2021-06-24 in Hive

My cluster environment:

Hadoop   2.9.2
Tez      0.9.2
Hive     2.3.4
Spark    2.4.2
Zeppelin 0.8.1

Here is an example of the Spark job I am trying to run through Zeppelin:

%spark.pyspark
print('Hello Zeppelin!')
df = spark.read.format('orc').load('hdfs://myHadoopMaster:9000/spark-warehouse/myDB.db/myTable')
print('Hello Zeppelin!')
df.show()
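
Note that load() itself is lazy; df.show() is the first action in the paragraph, so it is the first point at which a YARN executor is actually required. A driver-only sanity check along the lines of the sketch below (same table path as above) prints the schema without submitting a Spark job:

%spark.pyspark
# Sketch of a driver-only check: no action is triggered, so no executor is needed yet.
df = spark.read.format('orc').load('hdfs://myHadoopMaster:9000/spark-warehouse/myDB.db/myTable')
df.printSchema()        # schema comes from the ORC footers, resolved on the driver
print(df.columns[:5])   # still lazy; no job is submitted to YARN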

The Zeppelin interpreter shows green, but the paragraph hangs indefinitely while executing and never returns control to Zeppelin.
When I abort it manually from Zeppelin, the output shows:

/ssd2/hadoop/tmp/nm-local-dir/usercache/hdfs/appcache/application_1565710990672_0005/container_1565710990672_0005_01_000001/tmp/zeppelin_pyspark-148589971355063902.py:179: UserWarning: Unable to load inline matplotlib backend, falling back to Agg
  warnings.warn("Unable to load inline matplotlib backend, "
Hello Zeppelin!
Hello Zeppelin!
Fail to execute line 9: df.show()
Traceback (most recent call last):
  File "/ssd2/hadoop/tmp/nm-local-dir/usercache/hdfs/appcache/application_1565710990672_0005/container_1565710990672_0005_01_000001/tmp/zeppelin_pyspark-148589971355063902.py", line 380, in <module>
    exec(code, _zcUserQueryNameSpace)
  File "<stdin>", line 9, in <module>
  File "/usr/local/spark/python/pyspark/sql/dataframe.py", line 378, in show
    print(self._jdf.showString(n, 20, vertical))
  File "/usr/local/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1255, in __call__
    answer = self.gateway_client.send_command(command)
  File "/usr/local/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 985, in send_command
    response = connection.send_command(command)
  File "/usr/local/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1152, in send_command
    answer = smart_decode(self.stream.readline()[:-1])
  File "/usr/lib64/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
  File "/usr/local/spark/python/pyspark/context.py", line 270, in signal_handler
    raise KeyboardInterrupt()
KeyboardInterrupt

The matplotlib warning should not be relevant, though I cannot be entirely sure; it looks rather obscure (environment-related, but I am not using matplotlib, and it is installed on all cluster machines).
The Spark job runs smoothly when deployed manually with spark-submit --master yarn --deploy-mode cluster example.py, and the Spark interpreter in Zeppelin has both of those settings configured correctly. The same hang also happens with Hive context support turned off. Meanwhile, when I use the %hive interpreter, SQL in a paragraph runs perfectly well, and fast, against the very same ORC table I am loading in the Spark script.
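
For comparison, example.py is essentially the same code as the Zeppelin paragraph; roughly the following (the app name and exact session options are a sketch, not the literal file):

# example.py -- rough sketch of the job that works via spark-submit
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName('orc-show-test')   # name assumed for illustration
         .enableHiveSupport()        # turning Hive support off in Zeppelin made no difference
         .getOrCreate())

df = spark.read.format('orc').load('hdfs://myHadoopMaster:9000/spark-warehouse/myDB.db/myTable')
df.show()

spark.stop()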
I am really out of ideas here.
Here is the full trace from the Zeppelin container, taken with yarn logs -applicationId application_1565863084087_0003 -log_files stdout after the paragraph was aborted:

INFO [2019-08-15 11:06:59,170] ({main} Logging.scala[logInfo]:54) - Using initial executors = 0, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
 INFO [2019-08-15 11:06:59,174] ({dispatcher-event-loop-7} Logging.scala[logInfo]:54) - ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@LOSLDAP01:43945)
 INFO [2019-08-15 11:06:59,191] ({main} Logging.scala[logInfo]:54) - Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
 INFO [2019-08-15 11:06:59,192] ({pool-2-thread-2} Logging.scala[logInfo]:54) - SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
 INFO [2019-08-15 11:06:59,192] ({pool-2-thread-2} Logging.scala[logInfo]:54) - YarnClusterScheduler.postStartHook done
import org.apache.spark.SparkContext._
import spark.implicits._
import spark.sql
import org.apache.spark.sql.functions._
 INFO [2019-08-15 11:07:00,812] ({pool-2-thread-2} SparkShims.java[loadShims]:62) - Initializing shims for Spark 2.x
 INFO [2019-08-15 11:07:00,959] ({pool-2-thread-2} Py4JUtils.java[createGatewayServer]:44) - Launching GatewayServer at 127.0.0.1:38265
 INFO [2019-08-15 11:07:00,967] ({pool-2-thread-2} PySparkInterpreter.java[createGatewayServerAndStartScript]:265) - pythonExec: /bin/python3
 INFO [2019-08-15 11:07:00,971] ({pool-2-thread-2} PySparkInterpreter.java[setupPySparkEnv]:236) - PYTHONPATH: /usr/local/spark/python/lib/py4j-0.10.7-src.zip:/usr/local/spark/python/:/bin/python3.6:/ssd2/hadoop/tmp/nm-local-dir/usercache/hdfs/appcache/application_1565863084087_0003/container_1565863084087_0003_01_000001/pyspark.zip:/ssd2/hadoop/tmp/nm-local-dir/usercache/hdfs/appcache/application_1565863084087_0003/container_1565863084087_0003_01_000001/py4j-0.10.7-src.zip
 INFO [2019-08-15 11:07:01,264] ({Thread-28} Logging.scala[logInfo]:54) - loading hive config file: file:/usr/local/hadoop/etc/hadoop/hive-site.xml
 INFO [2019-08-15 11:07:01,291] ({Thread-28} Logging.scala[logInfo]:54) - Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('/spark-warehouse').
 INFO [2019-08-15 11:07:01,292] ({Thread-28} Logging.scala[logInfo]:54) - Warehouse path is '/spark-warehouse'.
 INFO [2019-08-15 11:07:01,297] ({Thread-28} Logging.scala[logInfo]:54) - Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL.
 INFO [2019-08-15 11:07:01,297] ({Thread-28} ContextHandler.java[doStart]:781) - Started o.s.j.s.ServletContextHandler@65047eb3{/SQL,null,AVAILABLE,@Spark}
 INFO [2019-08-15 11:07:01,298] ({Thread-28} Logging.scala[logInfo]:54) - Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/json.
 INFO [2019-08-15 11:07:01,298] ({Thread-28} ContextHandler.java[doStart]:781) - Started o.s.j.s.ServletContextHandler@65be1050{/SQL/json,null,AVAILABLE,@Spark}
 INFO [2019-08-15 11:07:01,298] ({Thread-28} Logging.scala[logInfo]:54) - Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution.
 INFO [2019-08-15 11:07:01,298] ({Thread-28} ContextHandler.java[doStart]:781) - Started o.s.j.s.ServletContextHandler@273dc56a{/SQL/execution,null,AVAILABLE,@Spark}
 INFO [2019-08-15 11:07:01,299] ({Thread-28} Logging.scala[logInfo]:54) - Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution/json.
 INFO [2019-08-15 11:07:01,299] ({Thread-28} ContextHandler.java[doStart]:781) - Started o.s.j.s.ServletContextHandler@b619c6{/SQL/execution/json,null,AVAILABLE,@Spark}
 INFO [2019-08-15 11:07:01,299] ({Thread-28} Logging.scala[logInfo]:54) - Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /static/sql.
 INFO [2019-08-15 11:07:01,300] ({Thread-28} ContextHandler.java[doStart]:781) - Started o.s.j.s.ServletContextHandler@565760a3{/static/sql,null,AVAILABLE,@Spark}
 INFO [2019-08-15 11:07:01,514] ({Thread-28} Logging.scala[logInfo]:54) - Registered StateStoreCoordinator endpoint
 INFO [2019-08-15 11:07:01,655] ({Thread-28} OrcCodecPool.java[getCodec]:56) - Got brand-new codec SNAPPY
 INFO [2019-08-15 11:07:02,185] ({Thread-28} Logging.scala[logInfo]:54) - Pruning directories with: 
 INFO [2019-08-15 11:07:02,189] ({Thread-28} Logging.scala[logInfo]:54) - Post-Scan Filters: 
 INFO [2019-08-15 11:07:02,191] ({Thread-28} Logging.scala[logInfo]:54) - Output Data Schema: struct<asset_identifier_id: decimal(19,0), deal_part_id: decimal(19,0), deal_id: decimal(19,0), lifecycle_type: string, created_by: string ... 87 more fields>
 INFO [2019-08-15 11:07:02,196] ({Thread-28} Logging.scala[logInfo]:54) - Pushed Filters: 
 INFO [2019-08-15 11:07:02,515] ({Thread-28} Logging.scala[logInfo]:54) - Code generated in 179.440722 ms
 INFO [2019-08-15 11:07:02,689] ({Thread-28} Logging.scala[logInfo]:54) - Generated method too long to be JIT compiled: org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext is 8046 bytes
 INFO [2019-08-15 11:07:02,689] ({Thread-28} Logging.scala[logInfo]:54) - Code generated in 111.550626 ms
 INFO [2019-08-15 11:07:02,721] ({Thread-28} Logging.scala[logInfo]:54) - Block broadcast_0 stored as values in memory (estimated size 316.7 KB, free 5.2 GB)
 INFO [2019-08-15 11:07:02,856] ({Thread-28} Logging.scala[logInfo]:54) - Block broadcast_0_piece0 stored as bytes in memory (estimated size 27.7 KB, free 5.2 GB)
 INFO [2019-08-15 11:07:02,858] ({dispatcher-event-loop-6} Logging.scala[logInfo]:54) - Added broadcast_0_piece0 in memory on LOSLDAP01:34535 (size: 27.7 KB, free: 5.2 GB)
 INFO [2019-08-15 11:07:02,859] ({Thread-28} Logging.scala[logInfo]:54) - Created broadcast 0 from showString at NativeMethodAccessorImpl.java:0
 INFO [2019-08-15 11:07:02,861] ({Thread-28} Logging.scala[logInfo]:54) - Planning scan with bin packing, max size: 134217728 bytes, open cost is considered as scanning 4194304 bytes.
 INFO [2019-08-15 11:07:02,903] ({Thread-28} Logging.scala[logInfo]:54) - Starting job: showString at NativeMethodAccessorImpl.java:0
 INFO [2019-08-15 11:07:02,911] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Got job 0 (showString at NativeMethodAccessorImpl.java:0) with 1 output partitions
 INFO [2019-08-15 11:07:02,911] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Final stage: ResultStage 0 (showString at NativeMethodAccessorImpl.java:0)
 INFO [2019-08-15 11:07:02,911] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Parents of final stage: List()
 INFO [2019-08-15 11:07:02,912] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Missing parents: List()
 INFO [2019-08-15 11:07:02,915] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Submitting ResultStage 0 (MapPartitionsRDD[3] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents
 INFO [2019-08-15 11:07:02,943] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Block broadcast_1 stored as values in memory (estimated size 67.0 KB, free 5.2 GB)
 INFO [2019-08-15 11:07:02,946] ({dispatcher-event-loop-7} Logging.scala[logInfo]:54) - Driver requested a total number of 1 executor(s).
 INFO [2019-08-15 11:07:02,947] ({spark-dynamic-executor-allocation} Logging.scala[logInfo]:54) - Requesting 1 new executor because tasks are backlogged (new desired total will be 1)
 INFO [2019-08-15 11:07:02,947] ({Reporter} Logging.scala[logInfo]:54) - Will request 1 executor container(s), each with 2 core(s) and 11264 MB memory (including 1024 MB of overhead)
 INFO [2019-08-15 11:07:02,949] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Block broadcast_1_piece0 stored as bytes in memory (estimated size 17.3 KB, free 5.2 GB)
 INFO [2019-08-15 11:07:02,949] ({dispatcher-event-loop-1} Logging.scala[logInfo]:54) - Added broadcast_1_piece0 in memory on LOSLDAP01:34535 (size: 17.3 KB, free: 5.2 GB)
 INFO [2019-08-15 11:07:02,950] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Created broadcast 1 from broadcast at DAGScheduler.scala:1161
 INFO [2019-08-15 11:07:02,955] ({Reporter} Logging.scala[logInfo]:54) - Submitted container request for host LOSLDAP01.
 INFO [2019-08-15 11:07:02,958] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[3] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0))
 INFO [2019-08-15 11:07:02,958] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Adding task set 0.0 with 1 tasks
 INFO [2019-08-15 11:07:04,376] ({Reporter} AMRMClientImpl.java[populateNMTokens]:361) - Received new token for : LOSLDAP01:43743
 INFO [2019-08-15 11:07:04,378] ({Reporter} Logging.scala[logInfo]:54) - Launching container container_1565863084087_0003_01_000002 on host LOSLDAP01 for executor with ID 1
 INFO [2019-08-15 11:07:04,379] ({Reporter} Logging.scala[logInfo]:54) - Received 1 containers from YARN, launching executors on 1 of them.
 INFO [2019-08-15 11:07:04,381] ({ContainerLauncher-0} ContainerManagementProtocolProxy.java[<init>]:81) - yarn.client.max-cached-nodemanagers-proxies : 0
 INFO [2019-08-15 11:07:04,389] ({ContainerLauncher-0} ContainerManagementProtocolProxy.java[newProxy]:260) - Opening proxy : LOSLDAP01:43743
 INFO [2019-08-15 11:07:06,197] ({dispatcher-event-loop-3} Logging.scala[logInfo]:54) - Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.94.245:43952) with ID 1
 INFO [2019-08-15 11:07:06,203] ({spark-listener-group-executorManagement} Logging.scala[logInfo]:54) - New executor 1 has registered (new total is 1)
 INFO [2019-08-15 11:07:06,207] ({dispatcher-event-loop-3} Logging.scala[logInfo]:54) - Starting task 0.0 in stage 0.0 (TID 0, LOSLDAP01, executor 1, partition 0, NODE_LOCAL, 8372 bytes)
 INFO [2019-08-15 11:07:06,253] ({dispatcher-event-loop-1} Logging.scala[logInfo]:54) - Registering block manager LOSLDAP01:37577 with 5.2 GB RAM, BlockManagerId(1, LOSLDAP01, 37577, None)
 INFO [2019-08-15 11:10:00,375] ({pool-1-thread-3} RemoteInterpreterServer.java[cancel]:681) - cancel org.apache.zeppelin.spark.PySparkInterpreter 20190723-154043_571949868
 INFO [2019-08-15 11:10:00,375] ({pool-1-thread-3} Logging.scala[logInfo]:54) - Asked to cancel job group zeppelin-2EFPJRKZB-20190723-154043_571949868
 INFO [2019-08-15 11:10:00,375] ({pool-1-thread-3} PySparkInterpreter.java[interrupt]:508) - Sending SIGINT signal to PID : 13417
 INFO [2019-08-15 11:10:00,377] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Cancelling stage 0
 INFO [2019-08-15 11:10:00,378] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Killing all running tasks in stage 0: Stage cancelled
 INFO [2019-08-15 11:10:00,380] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Stage 0 was cancelled
 INFO [2019-08-15 11:10:00,381] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:120) - Job 20190723-154043_571949868 finished by scheduler interpreter_1318028634
 INFO [2019-08-15 11:10:00,382] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - ResultStage 0 (showString at NativeMethodAccessorImpl.java:0) failed in 177.458 s due to Job 0 cancelled part of cancelled job group zeppelin-2EFPJRKZB-20190723-154043_571949868
 INFO [2019-08-15 11:10:00,384] ({Thread-28} Logging.scala[logInfo]:54) - Job 0 failed: showString at NativeMethodAccessorImpl.java:0, took 177.481167 s
 INFO [2019-08-15 11:10:00,442] ({dispatcher-event-loop-3} Logging.scala[logInfo]:54) - Driver requested a total number of 0 executor(s).
 INFO [2019-08-15 11:12:43,808] ({pool-2-thread-2} SchedulerFactory.java[jobStarted]:114) - Job 20190815-111050_1021017643 started by scheduler interpreter_1318028634
 INFO [2019-08-15 11:12:43,882] ({Spark Context Cleaner} Logging.scala[logInfo]:54) - Cleaned accumulator 1
 INFO [2019-08-15 11:12:43,882] ({Spark Context Cleaner} Logging.scala[logInfo]:54) - Cleaned accumulator 2
 INFO [2019-08-15 11:12:43,882] ({Spark Context Cleaner} Logging.scala[logInfo]:54) - Cleaned accumulator 5
 INFO [2019-08-15 11:12:43,892] ({dispatcher-event-loop-5} Logging.scala[logInfo]:54) - Removed broadcast_0_piece0 on LOSLDAP01:34535 in memory (size: 27.7 KB, free: 5.2 GB)
 INFO [2019-08-15 11:12:43,896] ({Thread-37} Logging.scala[logInfo]:54) - Pruning directories with: 
 INFO [2019-08-15 11:12:43,896] ({Spark Context Cleaner} Logging.scala[logInfo]:54) - Cleaned accumulator 3
 INFO [2019-08-15 11:12:43,896] ({Spark Context Cleaner} Logging.scala[logInfo]:54) - Cleaned accumulator 4
 INFO [2019-08-15 11:12:43,897] ({Thread-37} Logging.scala[logInfo]:54) - Post-Scan Filters: 
 INFO [2019-08-15 11:12:43,897] ({Thread-37} Logging.scala[logInfo]:54) - Output Data Schema: struct<asset_identifier_id: decimal(19,0), deal_part_id: decimal(19,0), deal_id: decimal(19,0), lifecycle_type: string, created_by: string ... 87 more fields>
 INFO [2019-08-15 11:12:43,897] ({Thread-37} Logging.scala[logInfo]:54) - Pushed Filters: 
 INFO [2019-08-15 11:12:43,940] ({Thread-37} Logging.scala[logInfo]:54) - Block broadcast_2 stored as values in memory (estimated size 316.7 KB, free 5.2 GB)
 INFO [2019-08-15 11:12:43,954] ({Thread-37} Logging.scala[logInfo]:54) - Block broadcast_2_piece0 stored as bytes in memory (estimated size 27.7 KB, free 5.2 GB)
 INFO [2019-08-15 11:12:43,954] ({dispatcher-event-loop-4} Logging.scala[logInfo]:54) - Added broadcast_2_piece0 in memory on LOSLDAP01:34535 (size: 27.7 KB, free: 5.2 GB)
 INFO [2019-08-15 11:12:43,955] ({Thread-37} Logging.scala[logInfo]:54) - Created broadcast 2 from showString at NativeMethodAccessorImpl.java:0
 INFO [2019-08-15 11:12:43,955] ({Thread-37} Logging.scala[logInfo]:54) - Planning scan with bin packing, max size: 134217728 bytes, open cost is considered as scanning 4194304 bytes.
 INFO [2019-08-15 11:12:43,960] ({Thread-37} Logging.scala[logInfo]:54) - Starting job: showString at NativeMethodAccessorImpl.java:0
 INFO [2019-08-15 11:12:43,961] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Got job 1 (showString at NativeMethodAccessorImpl.java:0) with 1 output partitions
 INFO [2019-08-15 11:12:43,961] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Final stage: ResultStage 1 (showString at NativeMethodAccessorImpl.java:0)
 INFO [2019-08-15 11:12:43,961] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Parents of final stage: List()
 INFO [2019-08-15 11:12:43,961] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Missing parents: List()
 INFO [2019-08-15 11:12:43,961] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Submitting ResultStage 1 (MapPartitionsRDD[7] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents
 INFO [2019-08-15 11:12:43,963] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Block broadcast_3 stored as values in memory (estimated size 67.0 KB, free 5.2 GB)
 INFO [2019-08-15 11:12:43,968] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Block broadcast_3_piece0 stored as bytes in memory (estimated size 17.3 KB, free 5.2 GB)
 INFO [2019-08-15 11:12:43,969] ({dispatcher-event-loop-2} Logging.scala[logInfo]:54) - Added broadcast_3_piece0 in memory on LOSLDAP01:34535 (size: 17.3 KB, free: 5.2 GB)
 INFO [2019-08-15 11:12:43,969] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Created broadcast 3 from broadcast at DAGScheduler.scala:1161
 INFO [2019-08-15 11:12:43,969] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[7] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0))
 INFO [2019-08-15 11:12:43,969] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Adding task set 1.0 with 1 tasks
 INFO [2019-08-15 11:12:43,970] ({dispatcher-event-loop-0} Logging.scala[logInfo]:54) - Starting task 0.0 in stage 1.0 (TID 1, LOSLDAP01, executor 1, partition 0, NODE_LOCAL, 8372 bytes)
 INFO [2019-08-15 11:17:24,340] ({pool-1-thread-3} RemoteInterpreterServer.java[cancel]:681) - cancel org.apache.zeppelin.spark.PySparkInterpreter 20190815-111050_1021017643
 INFO [2019-08-15 11:17:24,341] ({pool-1-thread-3} Logging.scala[logInfo]:54) - Asked to cancel job group zeppelin-2EHSN53Z9-20190815-111050_1021017643
 INFO [2019-08-15 11:17:24,341] ({pool-1-thread-3} PySparkInterpreter.java[interrupt]:508) - Sending SIGINT signal to PID : 13417
 INFO [2019-08-15 11:17:24,342] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Cancelling stage 1
 INFO [2019-08-15 11:17:24,342] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Killing all running tasks in stage 1: Stage cancelled
 INFO [2019-08-15 11:17:24,342] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Stage 1 was cancelled
 INFO [2019-08-15 11:17:24,343] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - ResultStage 1 (showString at NativeMethodAccessorImpl.java:0) failed in 280.381 s due to Job 1 cancelled part of cancelled job group zeppelin-2EHSN53Z9-20190815-111050_1021017643
 INFO [2019-08-15 11:17:24,344] ({Thread-37} Logging.scala[logInfo]:54) - Job 1 failed: showString at NativeMethodAccessorImpl.java:0, took 280.382961 s
 INFO [2019-08-15 11:17:24,349] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:120) - Job 20190815-111050_1021017643 finished by scheduler interpreter_1318028634
 INFO [2019-08-15 11:23:39,848] ({pool-2-thread-4} SchedulerFactory.java[jobStarted]:114) - Job 20190815-111050_1021017643 started by scheduler interpreter_1318028634
 INFO [2019-08-15 11:23:39,940] ({Thread-53} Logging.scala[logInfo]:54) - Pruning directories with: 
 INFO [2019-08-15 11:23:39,941] ({Thread-53} Logging.scala[logInfo]:54) - Post-Scan Filters: 
 INFO [2019-08-15 11:23:39,941] ({Thread-53} Logging.scala[logInfo]:54) - Output Data Schema: struct<asset_identifier_id: decimal(19,0), deal_part_id: decimal(19,0), deal_id: decimal(19,0), lifecycle_type: string, created_by: string ... 87 more fields>
 INFO [2019-08-15 11:23:39,941] ({Thread-53} Logging.scala[logInfo]:54) - Pushed Filters: 
 INFO [2019-08-15 11:23:39,975] ({Thread-53} Logging.scala[logInfo]:54) - Block broadcast_4 stored as values in memory (estimated size 316.7 KB, free 5.2 GB)
 INFO [2019-08-15 11:23:39,987] ({Thread-53} Logging.scala[logInfo]:54) - Block broadcast_4_piece0 stored as bytes in memory (estimated size 27.7 KB, free 5.2 GB)
 INFO [2019-08-15 11:23:39,987] ({dispatcher-event-loop-6} Logging.scala[logInfo]:54) - Added broadcast_4_piece0 in memory on LOSLDAP01:34535 (size: 27.7 KB, free: 5.2 GB)
 INFO [2019-08-15 11:23:39,987] ({Thread-53} Logging.scala[logInfo]:54) - Created broadcast 4 from showString at NativeMethodAccessorImpl.java:0
 INFO [2019-08-15 11:23:39,987] ({Thread-53} Logging.scala[logInfo]:54) - Planning scan with bin packing, max size: 134217728 bytes, open cost is considered as scanning 4194304 bytes.
 INFO [2019-08-15 11:23:39,992] ({Thread-53} Logging.scala[logInfo]:54) - Starting job: showString at NativeMethodAccessorImpl.java:0
 INFO [2019-08-15 11:23:39,993] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Got job 2 (showString at NativeMethodAccessorImpl.java:0) with 1 output partitions
 INFO [2019-08-15 11:23:39,993] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Final stage: ResultStage 2 (showString at NativeMethodAccessorImpl.java:0)
 INFO [2019-08-15 11:23:39,993] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Parents of final stage: List()
 INFO [2019-08-15 11:23:39,993] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Missing parents: List()
 INFO [2019-08-15 11:23:39,993] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Submitting ResultStage 2 (MapPartitionsRDD[11] at showString at NativeMethodAccessorImpl.java:0), which has no missing parents
 INFO [2019-08-15 11:23:39,995] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Block broadcast_5 stored as values in memory (estimated size 67.0 KB, free 5.2 GB)
 INFO [2019-08-15 11:23:39,998] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Block broadcast_5_piece0 stored as bytes in memory (estimated size 17.3 KB, free 5.2 GB)
 INFO [2019-08-15 11:23:39,998] ({dispatcher-event-loop-1} Logging.scala[logInfo]:54) - Added broadcast_5_piece0 in memory on LOSLDAP01:34535 (size: 17.3 KB, free: 5.2 GB)
 INFO [2019-08-15 11:23:39,998] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Created broadcast 5 from broadcast at DAGScheduler.scala:1161
 INFO [2019-08-15 11:23:39,999] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Submitting 1 missing tasks from ResultStage 2 (MapPartitionsRDD[11] at showString at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0))
 INFO [2019-08-15 11:23:39,999] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Adding task set 2.0 with 1 tasks
 INFO [2019-08-15 11:23:41,054] ({dispatcher-event-loop-2} Logging.scala[logInfo]:54) - Driver requested a total number of 1 executor(s).
 INFO [2019-08-15 11:23:41,055] ({spark-dynamic-executor-allocation} Logging.scala[logInfo]:54) - Requesting 1 new executor because tasks are backlogged (new desired total will be 1)
 INFO [2019-08-15 11:30:31,875] ({pool-1-thread-3} RemoteInterpreterServer.java[cancel]:681) - cancel org.apache.zeppelin.spark.PySparkInterpreter 20190815-111050_1021017643
 INFO [2019-08-15 11:30:31,876] ({pool-1-thread-3} Logging.scala[logInfo]:54) - Asked to cancel job group zeppelin-2EHSN53Z9-20190815-111050_1021017643
 INFO [2019-08-15 11:30:31,876] ({pool-1-thread-3} PySparkInterpreter.java[interrupt]:508) - Sending SIGINT signal to PID : 13417
 INFO [2019-08-15 11:30:31,877] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Cancelling stage 2
 INFO [2019-08-15 11:30:31,877] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Killing all running tasks in stage 2: Stage cancelled
 INFO [2019-08-15 11:30:31,880] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Removed TaskSet 2.0, whose tasks have all completed, from pool 
 INFO [2019-08-15 11:30:31,880] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - Stage 2 was cancelled
 INFO [2019-08-15 11:30:31,881] ({dag-scheduler-event-loop} Logging.scala[logInfo]:54) - ResultStage 2 (showString at NativeMethodAccessorImpl.java:0) failed in 411.886 s due to Job 2 cancelled part of cancelled job group zeppelin-2EHSN53Z9-20190815-111050_1021017643
 INFO [2019-08-15 11:30:31,881] ({Thread-53} Logging.scala[logInfo]:54) - Job 2 failed: showString at NativeMethodAccessorImpl.java:0, took 411.888853 s
 INFO [2019-08-15 11:30:31,884] ({pool-2-thread-4} SchedulerFactory.java[jobFinished]:120) - Job 20190815-111050_1021017643 finished by scheduler interpreter_1318028634
 INFO [2019-08-15 11:30:31,888] ({dispatcher-event-loop-3} Logging.scala[logInfo]:54) - Driver requested a total number of 0 executor(s).
 WARN [2019-08-15 11:30:34,299] ({SparkUI-46} AmIpFilter.java[doFilter]:157) - Could not find proxy-user cookie, so user will not be set
 WARN [2019-08-15 11:30:34,413] ({SparkUI-52} AmIpFilter.java[doFilter]:157) - Could not find proxy-user cookie, so user will not be set
 WARN [2019-08-15 11:30:34,424] ({SparkUI-51} AmIpFilter.java[doFilter]:157) - Could not find proxy-user cookie, so user will not be set
 WARN [2019-08-15 11:30:34,424] ({SparkUI-46} AmIpFilter.java[doFilter]:157) - Could not find proxy-user cookie, so user will not be set
 WARN [2019-08-15 11:30:34,426] ({SparkUI-160} AmIpFilter.java[doFilter]:157) - Could not find proxy-user cookie, so user will not be set
 WARN [2019-08-15 11:30:34,428] ({SparkUI-46} AmIpFilter.java[doFilter]:157) - Could not find proxy-user cookie, so user will not be set
 WARN [2019-08-15 11:30:34,439] ({SparkUI-47} AmIpFilter.java[doFilter]:157) - Could not find proxy-user cookie, so user will not be set
 WARN [2019-08-15 11:30:34,439] ({SparkUI-160} AmIpFilter.java[doFilter]:157) - Could not find proxy-user cookie, so user will not be set
 WARN [2019-08-15 11:30:34,439] ({SparkUI-51} AmIpFilter.java[doFilter]:157) - Could not find proxy-user cookie, so user will not be set
 WARN [2019-08-15 11:30:34,449] ({SparkUI-47} AmIpFilter.java[doFilter]:157) - Could not find proxy-user cookie, so user will not be set
