JVM crash with no problematic frame given, only "timer expired, abort"

ar7v8xwq  posted on 2021-06-04 in Hadoop

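I'm running a Java job under Hadoop that is crashing the JVM. I suspect this is due to some JNI code (it uses jblas with a multithreaded native BLAS implementation). However, while I would expect the crash log to name a "problematic frame" to help with debugging, the log instead looks like this: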


#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f204dd6fb27, pid=19570, tid=139776470402816
#
# JRE version: 6.0_38-b05
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.13-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# # [ timer expired, abort... ]

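Does the JVM have some timer that limits how long it will wait while producing this crash dump output? If so, is there a way to increase the time so that I can get more useful information? I don't think the timer being referred to comes from Hadoop, since I see (unhelpful) references to this error in many places that make no mention of Hadoop.
Googling suggests that the string "timer expired, abort" only shows up in these JVM error messages, so it is unlikely to come from the OS.
Edit: It looks like I may be out of luck. From ./hotspot/src/share/vm/runtime/thread.cpp in the OpenJDK version of the JVM source: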

if (is_error_reported()) {
   // A fatal error has happened, the error handler(VMError::report_and_die)
   // should abort JVM after creating an error log file. However in some
   // rare cases, the error handler itself might deadlock. Here we try to
   // kill JVM if the fatal error handler fails to abort in 2 minutes.
   //
   // This code is in WatcherThread because WatcherThread wakes up
   // periodically so the fatal error handler doesn't need to do anything;
   // also because the WatcherThread is less likely to crash than other
   // threads.

   for (;;) {
     if (!ShowMessageBoxOnError
      && (OnError == NULL || OnError[0] == '\0')
      && Arguments::abort_hook() == NULL) {
          os::sleep(this, 2 * 60 * 1000, false);
          fdStream err(defaultStream::output_fd());
          err.print_raw_cr("# [ timer expired, abort... ]");
          // skip atexit/vm_exit/vm_abort hooks
          os::die();
     }

     // Wake up 5 seconds later, the fatal handler may reset OnError or
     // ShowMessageBoxOnError when it is ready to abort.
     os::sleep(this, 5 * 1000, false);
   }
 }

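It appears to be hard-coded to wait two minutes. I don't know why my job would take longer than that to write the crash report, but I think this question has at least been answered.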
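
To make the condition in that loop easier to read, here is a small Java model of it. This is only a sketch of the quoted `if` test, not HotSpot code: the class, method, and parameter names are made up for illustration. It shows that, going by the excerpt, the two-minute abort is armed only when `ShowMessageBoxOnError` is off, `OnError` is empty, and no abort hook is installed.

```java
final class WatchdogCondition {
    /**
     * Models the `if` test in the quoted WatcherThread loop.
     * Returns true when the watcher would sleep two minutes and then kill the JVM.
     * Parameter names are illustrative, not HotSpot identifiers.
     */
    public static boolean hardTimeoutApplies(boolean showMessageBoxOnError,
                                             String onError,
                                             boolean abortHookInstalled) {
        // Mirrors: OnError == NULL || OnError[0] == '\0'
        boolean onErrorEmpty = (onError == null || onError.isEmpty());
        return !showMessageBoxOnError && onErrorEmpty && !abortHookInstalled;
    }

    public static void main(String[] args) {
        // Default flags: the two-minute abort is armed.
        System.out.println(hardTimeoutApplies(false, null, false));       // true
        // With -XX:+ShowMessageBoxOnError the watcher only polls every 5 seconds.
        System.out.println(hardTimeoutApplies(true, null, false));        // false
        // A non-empty -XX:OnError command has the same effect.
        System.out.println(hardTimeoutApplies(false, "gdb - %p", false)); // false
    }
}
```

So the wait itself cannot be lengthened, but setting either of those flags should keep the watcher from cutting the error handler short.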

58wvjzkj  #1

The way around this is to run with -XX:+ShowMessageBoxOnError and attach to the process with a debugger from another terminal.
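
As a rough sketch of how the flag could be added to a job's existing JVM options: the helper below is hypothetical, and where the resulting string goes depends on your setup. I am assuming an older Hadoop MapReduce install that takes task JVM options from the `mapred.child.java.opts` property; check the property name for your version.

```java
final class CrashDebugOpts {
    // Real HotSpot flag; boolean -XX flags are switched on with a leading '+'.
    static final String PAUSE_ON_ERROR = "-XX:+ShowMessageBoxOnError";

    /** Appends the flag to an existing JVM options string unless it is already there. */
    public static String withPauseOnError(String existingOpts) {
        String opts = (existingOpts == null) ? "" : existingOpts.trim();
        if (opts.contains(PAUSE_ON_ERROR)) {
            return opts;
        }
        return opts.isEmpty() ? PAUSE_ON_ERROR : opts + " " + PAUSE_ON_ERROR;
    }

    public static void main(String[] args) {
        // "-Xmx200m" stands in for whatever options the job already uses.
        // The result is what you would set as the task JVM options
        // (assumed here to be mapred.child.java.opts).
        System.out.println(withPauseOnError("-Xmx200m"));
    }
}
```

Once the crashed JVM is paused, something like `gdb -p <pid>` from another terminal on the same node should be able to attach to it.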

vawmfj5a  #2

It looks like I may be out of luck. From ./hotspot/src/share/vm/runtime/thread.cpp in the OpenJDK version of the JVM source:

if (is_error_reported()) {
   // A fatal error has happened, the error handler(VMError::report_and_die)
   // should abort JVM after creating an error log file. However in some
   // rare cases, the error handler itself might deadlock. Here we try to
   // kill JVM if the fatal error handler fails to abort in 2 minutes.
   //
   // This code is in WatcherThread because WatcherThread wakes up
   // periodically so the fatal error handler doesn't need to do anything;
   // also because the WatcherThread is less likely to crash than other
   // threads.

   for (;;) {
     if (!ShowMessageBoxOnError
      && (OnError == NULL || OnError[0] == '\0')
      && Arguments::abort_hook() == NULL) {
          os::sleep(this, 2 * 60 * 1000, false);
          fdStream err(defaultStream::output_fd());
          err.print_raw_cr("# [ timer expired, abort... ]");
          // skip atexit/vm_exit/vm_abort hooks
          os::die();
     }

     // Wake up 5 seconds later, the fatal handler may reset OnError or
     // ShowMessageBoxOnError when it is ready to abort.
     os::sleep(this, 5 * 1000, false);
   }
 }

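It appears to be hard-coded to wait two minutes. I don't know why my job would take longer than that to write the crash report, but I think this question has at least been answered.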
