Spark History Server stuck at "Loading history summary..."

Posted by u5rb5r59 on 2021-05-17 in Spark

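The History Server starts correctly and shows the "No completed applications found!" message when the /tmp/spark-events directory is empty. As soon as a Spark Streaming application (I have not tested other application types) writes event logs into that folder, the History Server gets stuck on the "Loading history summary..." popup. If I roll my Spark installation back to 2.4.5, everything works fine.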
Spark versions: 3.0.1 and 3.0.0
Hadoop versions: 2.10.0, 2.10.1, 3.1.4
Java version: OpenJDK 1.8.0_272
OS: CentOS 7.7
The streaming job is submitted with:

spark-submit --class MyClass --master yarn --deploy-mode cluster --jars /usr/local/hadoop/share/hadoop/libs/*.jar myclass.jar

spark-defaults.conf:

spark.eventLog.dir=/tmp/spark-events
spark.eventLog.enabled=true
spark.eventLog.rolling.enabled=true
spark.ui.enabled=true
spark.yarn.stagingDir=hdfs://10.0.1.4:54310/
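
To double-check which event-log settings are in effect, the fragment above can be parsed with a short script. This is only a minimal sketch, not part of Spark: `parse_spark_defaults` is a hypothetical helper, and it handles just the `key=value` and whitespace-separated `key value` forms (the full Java properties syntax is not covered).

```python
import re

# One setting per line: a key, then an optional "=" and/or whitespace, then the value.
_SETTING = re.compile(r"([^\s=]+)\s*=?\s*(.*)$")


def parse_spark_defaults(text):
    """Parse spark-defaults.conf-style text into a dict (hypothetical helper).

    Blank lines and lines starting with '#' are skipped.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = _SETTING.match(line)
        if match:
            settings[match.group(1)] = match.group(2).strip()
    return settings


# The configuration exactly as quoted in this post.
CONF = """\
spark.eventLog.dir=/tmp/spark-events
spark.eventLog.enabled=true
spark.eventLog.rolling.enabled=true
spark.ui.enabled=true
spark.yarn.stagingDir=hdfs://10.0.1.4:54310/
"""

settings = parse_spark_defaults(CONF)
for key in ("spark.eventLog.enabled",
            "spark.eventLog.dir",
            "spark.eventLog.rolling.enabled"):
    print(key, "=", settings.get(key, "<not set>"))
```

Of the settings shown, `spark.eventLog.rolling.enabled` is the only one that does not exist in Spark 2.4.5, so it may be worth testing with it switched off; that is a guess, not a confirmed cause.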

Below is the output of jstack for the History Server PID:

2020-11-17 22:23:37
Full thread dump OpenJDK 64-Bit Server VM (25.272-b10 mixed mode):

"Attach Listener" #24 daemon prio=9 os_prio=0 tid=0x00007f9680001000 nid=0x6016 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"log-replay-executor-0" #23 daemon prio=5 os_prio=0 tid=0x00007f96700e4000 nid=0x438c waiting on condition [0x00007f969c20d000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000c00eb6e0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"-JettyScheduler" #22 daemon prio=5 os_prio=0 tid=0x00007f9664008000 nid=0x4051 waiting on condition [0x00007f969c50e000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000c05b2b48> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"spark-history-task-0" #21 daemon prio=5 os_prio=0 tid=0x00007f96b7206800 nid=0x402c waiting on condition [0x00007f969c60f000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000c00eb2a0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"qtp1827171553-20" #20 daemon prio=5 os_prio=0 tid=0x00007f96b719f000 nid=0x402b waiting on condition [0x00007f969c910000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000c04d2a50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at org.sparkproject.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
    at java.lang.Thread.run(Thread.java:748)

"qtp1827171553-19" #19 daemon prio=5 os_prio=0 tid=0x00007f96b719d800 nid=0x402a waiting on condition [0x00007f969ca11000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000c04d2a50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at org.sparkproject.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
    at java.lang.Thread.run(Thread.java:748)

"qtp1827171553-18" #18 daemon prio=5 os_prio=0 tid=0x00007f96b719b800 nid=0x4029 waiting on condition [0x00007f969cb12000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000c04d2a50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at org.sparkproject.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
    at java.lang.Thread.run(Thread.java:748)

"qtp1827171553-17" #17 daemon prio=5 os_prio=0 tid=0x00007f96b7199800 nid=0x4028 waiting on condition [0x00007f969cc13000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000c04d2a50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at org.sparkproject.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
    at java.lang.Thread.run(Thread.java:748)

"qtp1827171553-16" #16 daemon prio=5 os_prio=0 tid=0x00007f96b7198000 nid=0x4027 waiting on condition [0x00007f969cd14000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000c04d2a50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at org.sparkproject.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
    at java.lang.Thread.run(Thread.java:748)

"qtp1827171553-15" #15 daemon prio=5 os_prio=0 tid=0x00007f96b7196000 nid=0x4026 waiting on condition [0x00007f969ce15000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000c04d2a50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at org.sparkproject.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
    at java.lang.Thread.run(Thread.java:748)

"qtp1827171553-14-acceptor-0@5ac8e37e-ServerConnector@3f270e0a{HTTP/1.1,[http/1.1]}{0.0.0.0:18080}" #14 daemon prio=3 os_prio=0 tid=0x00007f96b7194000 nid=0x4025 runnable [0x00007f969cf16000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:421)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:249)
    - locked <0x00000000c04be1c8> (a java.lang.Object)
    at org.sparkproject.jetty.server.ServerConnector.accept(ServerConnector.java:385)
    at org.sparkproject.jetty.server.AbstractConnector$Acceptor.run(AbstractConnector.java:648)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
    at java.lang.Thread.run(Thread.java:748)

"qtp1827171553-13" #13 daemon prio=5 os_prio=0 tid=0x00007f96b7192800 nid=0x4024 runnable [0x00007f969d017000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x00000000c05b11a8> (a sun.nio.ch.Util$3)
    - locked <0x00000000c05b1198> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000c054db00> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
    at org.sparkproject.jetty.io.ManagedSelector$SelectorProducer.select(ManagedSelector.java:464)
    at org.sparkproject.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:401)
    at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:357)
    at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:181)
    at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
    at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
    at org.sparkproject.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
    at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
    at java.lang.Thread.run(Thread.java:748)

"org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner" #11 daemon prio=5 os_prio=0 tid=0x00007f96b48d0000 nid=0x4023 in Object.wait() [0x00007f969d71a000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000c0006b00> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
    - locked <0x00000000c0006b00> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
    at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3760)
    at java.lang.Thread.run(Thread.java:748)

"Service Thread" #7 daemon prio=9 os_prio=0 tid=0x00007f96b4134800 nid=0x401c runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f96b412a000 nid=0x401b waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f96b4126800 nid=0x401a waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f96b4119800 nid=0x4019 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f96b40ec800 nid=0x4018 in Object.wait() [0x00007f969fdfc000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000c0005050> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
    - locked <0x00000000c0005050> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216)

"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f96b40e8000 nid=0x4017 in Object.wait() [0x00007f969fefd000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000c001e610> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:502)
    at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
    - locked <0x00000000c001e610> (a java.lang.ref.Reference$Lock)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

"main" #1 prio=5 os_prio=0 tid=0x00007f96b4058000 nid=0x4013 waiting on condition [0x00007f96be172000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:308)
    at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)

"VM Thread" os_prio=0 tid=0x00007f96b40de000 nid=0x4016 runnable 

"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f96b406b000 nid=0x4014 runnable 

"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f96b406d000 nid=0x4015 runnable 

"VM Periodic Task Thread" os_prio=0 tid=0x00007f96b4137000 nid=0x401d waiting on condition 

JNI global references: 1384
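
A dump of this size is easier to read once it is reduced to one state per thread. The following is a rough sketch under the assumption that the dump has the HotSpot jstack layout shown above (a quoted thread name on the header line, followed by a `java.lang.Thread.State:` line); `summarize_thread_states` is a hypothetical helper, and the sample text is a shortened excerpt of the dump in this post.

```python
import re
from collections import Counter

# Header line of a thread entry: starts with the quoted thread name.
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)"')
# State line that follows the header for Java threads.
THREAD_STATE = re.compile(r"java\.lang\.Thread\.State: (?P<state>[A-Z_]+)")


def summarize_thread_states(dump_text):
    """Map thread name -> Thread.State for every thread that reports one."""
    states = {}
    current = None
    for line in dump_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group("name")
            continue
        state = THREAD_STATE.search(line)
        if state and current is not None:
            states[current] = state.group("state")
            current = None  # ignore anything else until the next header
    return states


# Shortened excerpt of the dump above; VM-internal threads have no state line.
SAMPLE = '''
"log-replay-executor-0" #23 daemon prio=5 os_prio=0 tid=0x00007f96700e4000 nid=0x438c waiting on condition [0x00007f969c20d000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)

"spark-history-task-0" #21 daemon prio=5 os_prio=0 tid=0x00007f96b7206800 nid=0x402c waiting on condition [0x00007f969c60f000]
   java.lang.Thread.State: TIMED_WAITING (parking)

"VM Thread" os_prio=0 tid=0x00007f96b40de000 nid=0x4016 runnable

"main" #1 prio=5 os_prio=0 tid=0x00007f96b4058000 nid=0x4013 waiting on condition [0x00007f96be172000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
'''

states = summarize_thread_states(SAMPLE)
for name, state in states.items():
    print(f"{state:<14} {name}")
print(Counter(states.values()))
```

In the full dump above, `log-replay-executor-0` is parked in `LinkedBlockingQueue.take` and the `spark-history-task-0` and `qtp...` worker threads are parked in their queues, so at the moment the dump was taken no thread appears to be replaying an event log or serving a request.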

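I have raised the issue below on the Spark dev JIRA, but I wanted to check here whether anyone else has seen this problem and may already have found a workaround. Any help would be greatly appreciated.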
https://issues.apache.org/jira/browse/spark-33470
