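The history server starts correctly and shows the "No completed applications found!" message when the /tmp/spark-events directory is empty. As soon as the folder contains event logs generated by a Spark Streaming application (I have not tested other application types), the history server gets stuck on the "Loading history summary..." popup. If I roll the Spark installation back to 2.4.5, everything works.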
Spark versions: 3.0.1 and 3.0.0
Hadoop versions: 2.10.0, 2.10.1, 3.1.4
Java version: OpenJDK 1.8.0_272
OS: CentOS 7.7
The streaming job is submitted with: spark-submit --class MyClass --master yarn --deploy-mode cluster --jars /usr/local/hadoop/share/hadoop/libs/*.jar myclass.jar
spark-defaults.conf:
spark.eventLog.dir=/tmp/spark-events
spark.eventLog.enabled=true
spark.eventLog.rolling.enabled=true
spark.ui.enabled=true
spark.yarn.stagingDir=hdfs://10.0.1.4:54310/
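
To make it easier to compare the event log settings between the 3.0.x and 2.4.5 installations, here is a minimal sketch that pulls the spark.eventLog.* entries out of a spark-defaults.conf-style text. The function names (parse_spark_defaults, event_log_settings) are made up for this illustration, and the parsing is a simplification of the Java properties format, not Spark's own loader. As far as I know, spark.eventLog.rolling.enabled was added in Spark 3.0, so it is one difference between the two setups that could be isolated by turning it off.

```python
import re

# Simplified "key<sep>value" matcher: the key ends at the first '=', ':' or
# whitespace. This is an approximation of the Java properties format.
_PROPERTY_LINE = re.compile(r"^([^=:\s]+)\s*[=:\s]\s*(.*)$")


def parse_spark_defaults(text):
    """Parse spark-defaults.conf-style text into a dict of settings."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = _PROPERTY_LINE.match(line)
        if match:
            settings[match.group(1)] = match.group(2).strip()
    return settings


def event_log_settings(settings):
    """Return only the entries whose key starts with 'spark.eventLog.'."""
    return {k: v for k, v in settings.items() if k.startswith("spark.eventLog.")}


# The configuration from this post, copied verbatim.
POSTED_CONF = """\
spark.eventLog.dir=/tmp/spark-events
spark.eventLog.enabled=true
spark.eventLog.rolling.enabled=true
spark.ui.enabled=true
spark.yarn.stagingDir=hdfs://10.0.1.4:54310/
"""

if __name__ == "__main__":
    for key, value in sorted(event_log_settings(parse_spark_defaults(POSTED_CONF)).items()):
        print(f"{key} = {value}")
```

Running the same helper against the configuration of the 2.4.5 installation would show which event log keys differ between the two.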
Below is the thread dump produced by running jstack against the history server PID:
2020-11-17 22:23:37
Full thread dump OpenJDK 64-Bit Server VM (25.272-b10 mixed mode):
"Attach Listener" #24 daemon prio=9 os_prio=0 tid=0x00007f9680001000 nid=0x6016 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"log-replay-executor-0" #23 daemon prio=5 os_prio=0 tid=0x00007f96700e4000 nid=0x438c waiting on condition [0x00007f969c20d000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c00eb6e0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"-JettyScheduler" #22 daemon prio=5 os_prio=0 tid=0x00007f9664008000 nid=0x4051 waiting on condition [0x00007f969c50e000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c05b2b48> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"spark-history-task-0" #21 daemon prio=5 os_prio=0 tid=0x00007f96b7206800 nid=0x402c waiting on condition [0x00007f969c60f000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c00eb2a0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"qtp1827171553-20" #20 daemon prio=5 os_prio=0 tid=0x00007f96b719f000 nid=0x402b waiting on condition [0x00007f969c910000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c04d2a50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at org.sparkproject.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
at java.lang.Thread.run(Thread.java:748)
"qtp1827171553-19" #19 daemon prio=5 os_prio=0 tid=0x00007f96b719d800 nid=0x402a waiting on condition [0x00007f969ca11000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c04d2a50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at org.sparkproject.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
at java.lang.Thread.run(Thread.java:748)
"qtp1827171553-18" #18 daemon prio=5 os_prio=0 tid=0x00007f96b719b800 nid=0x4029 waiting on condition [0x00007f969cb12000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c04d2a50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at org.sparkproject.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
at java.lang.Thread.run(Thread.java:748)
"qtp1827171553-17" #17 daemon prio=5 os_prio=0 tid=0x00007f96b7199800 nid=0x4028 waiting on condition [0x00007f969cc13000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c04d2a50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at org.sparkproject.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
at java.lang.Thread.run(Thread.java:748)
"qtp1827171553-16" #16 daemon prio=5 os_prio=0 tid=0x00007f96b7198000 nid=0x4027 waiting on condition [0x00007f969cd14000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c04d2a50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at org.sparkproject.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
at java.lang.Thread.run(Thread.java:748)
"qtp1827171553-15" #15 daemon prio=5 os_prio=0 tid=0x00007f96b7196000 nid=0x4026 waiting on condition [0x00007f969ce15000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c04d2a50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at org.sparkproject.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
at java.lang.Thread.run(Thread.java:748)
"qtp1827171553-14-acceptor-0@5ac8e37e-ServerConnector@3f270e0a{HTTP/1.1,[http/1.1]}{0.0.0.0:18080}" #14 daemon prio=3 os_prio=0 tid=0x00007f96b7194000 nid=0x4025 runnable [0x00007f969cf16000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:421)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:249)
- locked <0x00000000c04be1c8> (a java.lang.Object)
at org.sparkproject.jetty.server.ServerConnector.accept(ServerConnector.java:385)
at org.sparkproject.jetty.server.AbstractConnector$Acceptor.run(AbstractConnector.java:648)
at org.sparkproject.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
at java.lang.Thread.run(Thread.java:748)
"qtp1827171553-13" #13 daemon prio=5 os_prio=0 tid=0x00007f96b7192800 nid=0x4024 runnable [0x00007f969d017000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00000000c05b11a8> (a sun.nio.ch.Util$3)
- locked <0x00000000c05b1198> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000c054db00> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
at org.sparkproject.jetty.io.ManagedSelector$SelectorProducer.select(ManagedSelector.java:464)
at org.sparkproject.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:401)
at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:357)
at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:181)
at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
at org.sparkproject.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
at org.sparkproject.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
at java.lang.Thread.run(Thread.java:748)
"org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner" #11 daemon prio=5 os_prio=0 tid=0x00007f96b48d0000 nid=0x4023 in Object.wait() [0x00007f969d71a000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000c0006b00> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
- locked <0x00000000c0006b00> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3760)
at java.lang.Thread.run(Thread.java:748)
"Service Thread" #7 daemon prio=9 os_prio=0 tid=0x00007f96b4134800 nid=0x401c runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f96b412a000 nid=0x401b waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f96b4126800 nid=0x401a waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f96b4119800 nid=0x4019 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f96b40ec800 nid=0x4018 in Object.wait() [0x00007f969fdfc000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000c0005050> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
- locked <0x00000000c0005050> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216)
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f96b40e8000 nid=0x4017 in Object.wait() [0x00007f969fefd000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000c001e610> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x00000000c001e610> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
"main" #1 prio=5 os_prio=0 tid=0x00007f96b4058000 nid=0x4013 waiting on condition [0x00007f96be172000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:308)
at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
"VM Thread" os_prio=0 tid=0x00007f96b40de000 nid=0x4016 runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f96b406b000 nid=0x4014 runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f96b406d000 nid=0x4015 runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007f96b4137000 nid=0x401d waiting on condition
JNI global references: 1384
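
A dump like the one above is easier to read when reduced to one state per thread. The following is a small sketch (the function name summarize_thread_dump is made up for illustration) that extracts each quoted thread name and its java.lang.Thread.State from jstack text output. In the dump above, "log-replay-executor-0" is parked in LinkedBlockingQueue.take, which suggests that no event log replay was running at the moment the dump was taken.

```python
import re
from collections import Counter

# A thread entry starts with the quoted thread name.
_THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)"')
# The state line follows the header for Java-level threads.
_THREAD_STATE = re.compile(r"java\.lang\.Thread\.State:\s+(?P<state>[A-Z_]+)")


def summarize_thread_dump(dump_text):
    """Return (counts per state, state per thread name) for jstack text output.

    Threads without a state line (for example "VM Thread") are skipped.
    """
    states_by_thread = {}
    current_thread = None
    for raw_line in dump_text.splitlines():
        line = raw_line.strip()
        header = _THREAD_HEADER.match(line)
        if header:
            current_thread = header.group("name")
            continue
        state = _THREAD_STATE.search(line)
        if state and current_thread is not None:
            states_by_thread[current_thread] = state.group("state")
            current_thread = None  # only the first state line belongs to the header
    return Counter(states_by_thread.values()), states_by_thread


# Shortened excerpt of the dump in this post, used as sample input.
SAMPLE_DUMP = '''
"log-replay-executor-0" #23 daemon prio=5 os_prio=0 tid=0x00007f96700e4000 nid=0x438c waiting on condition [0x00007f969c20d000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
"spark-history-task-0" #21 daemon prio=5 os_prio=0 tid=0x00007f96b7206800 nid=0x402c waiting on condition [0x00007f969c60f000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
"VM Thread" os_prio=0 tid=0x00007f96b40de000 nid=0x4016 runnable
'''

if __name__ == "__main__":
    counts, by_thread = summarize_thread_dump(SAMPLE_DUMP)
    for name, state in by_thread.items():
        print(f"{state:15} {name}")
    print(dict(counts))
```

To use it on a full dump, save the jstack output to a file and pass the file contents to summarize_thread_dump.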
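I have filed the issue below on the Spark dev JIRA, but I wanted to check here whether anyone else has seen this problem and may have found a workaround. Any help would be greatly appreciated.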
https://issues.apache.org/jira/browse/spark-33470