Apache Hadoop HiveServer2 frequently shuts down in a Hadoop cluster

toe95027 posted on 2021-05-31 in Hadoop

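We run a Hadoop cluster (v2.9.2, 100 nodes, Ubuntu 18) alongside a HiveServer2 cluster (v2.3.3, 10 nodes, Ubuntu 18). Recently we noticed that the Hive service shuts itself down from time to time. I don't know when this started, or whether there were periods when it did not happen, because our systems are configured to run Chef every half hour, and Chef takes care of starting the service.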
Error from systemctl:

```
● hive-server2.service - Apache Hadoop hiveserver2
   Loaded: loaded (/lib/systemd/system/hive-server2.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sun 2020-08-02 06:39:34 EDT; 1h 14min ago
  Process: 68691 ExecStart=/opt/hive/apache-hive-2.3.3-bin/bin/start-hiveserver2.sh (code=exited, status=127)
 Main PID: 68691 (code=exited, status=127)

Aug 02 06:39:33 hive13 java[68691]: pam_unix(sshd:auth): authentication failure; logname= uid=70003 euid=70003 tty= ruser= rhost=  user=hdp13
Aug 02 06:39:33 hive13 java[68691]: pam_unix(login:auth): authentication failure; logname= uid=70003 euid=70003 tty= ruser= rhost=  user=hdp13
Aug 02 06:39:33 hive13 start-hiveserver2.sh[68691]: OK
Aug 02 06:39:33 hive13 java[68691]: pam_unix(sshd:auth): authentication failure; logname= uid=70003 euid=70003 tty= ruser= rhost=  user=hdp13
Aug 02 06:39:33 hive13 java[68691]: pam_unix(login:auth): authentication failure; logname= uid=70003 euid=70003 tty= ruser= rhost=  user=hdp13
Aug 02 06:39:33 hive13 start-hiveserver2.sh[68691]: OK
Aug 02 06:39:33 hive13 java[68691]: pam_unix(sshd:auth): authentication failure; logname= uid=70003 euid=70003 tty= ruser= rhost=  user=hdp13
Aug 02 06:39:33 hive13 start-hiveserver2.sh[68691]: Inconsistency detected by ld.so: ../elf/dl-tls.c: 481: _dl_allocate_tls_init: Assertion `listp->slotinfo[cnt].ge
Aug 02 06:39:34 hive13 systemd[1]: hive-server2.service: Main process exited, code=exited, status=127/n/a
Aug 02 06:39:34 hive13 systemd[1]: hive-server2.service: Failed with result 'exit-code'.
```

And in /var/log/syslog:

```
<30>Aug  2 06:39:33 hive13 start-hiveserver2.sh[68691]: Inconsistency detected by ld.so: ../elf/dl-tls.c: 481: _dl_allocate_tls_init: Assertion `listp->slotinfo[cnt].gen <= GL(dl_tls_generation)' failed!
<29>Aug  2 06:39:34 hive13 systemd[1]: hive-server2.service: Main process exited, code=exited, status=127/n/a
<28>Aug  2 06:39:34 hive13 systemd[1]: hive-server2.service: Failed with result 'exit-code'.
<30>Aug  2 06:41:27 hive13 systemd[1]: Starting Cleanup of Temporary Directories...
<30>Aug  2 06:41:27 hive13 systemd[1]: Starting Daily apt upgrade and clean activities...
<30>Aug  2 06:41:27 hive13 systemd[1]: Started Cleanup of Temporary Directories.
<30>Aug  2 06:41:28 hive13 systemd[1]: Started Daily apt upgrade and clean activities.
<30>Aug  2 06:50:21 hive13 start-metastore.sh[139225]: 2020-08-02T06:50:21.905-0400: 1015612.169: [GC (Allocation Failure) 2020-08-02T06:50:21.905-0400: 1015612.169: [ParNew: 1245408K->10744K(1380160K), 0.0546688 secs] 1369822K->135643K(25012480K), 0.0549767 secs] [Times: user=0.81 sys=0.00, real=0.06 secs]
<30>Aug  2 06:52:35 hive13 start-metastore.sh[139225]: 2020-08-02T06:52:35.896-0400: 1015746.160: [GC (Allocation Failure) 2020-08-02T06:52:35.896-0400: 1015746.160: [ParNew: 1237560K->4893K(1380160K), 0.0484039 secs] 1362459K->129793K(25012480K), 0.0487111 secs] [Times: user=0.71 sys=0.00, real=0.04 secs]
<26>Aug  2 06:56:11 hive13 smartd[17646]: Device: /dev/sdb [SAT], FAILED SMART self-check. BACK UP DATA NOW!
<26>Aug  2 06:56:11 hive13 smartd[17646]: Device: /dev/sdb [SAT], 1 Currently unreadable (pending) sectors
<26>Aug  2 06:56:11 hive13 smartd[17646]: Device: /dev/sdb [SAT], Failed SMART usage Attribute: 5 Reallocated_Sector_Ct.
<30>Aug  2 07:03:57 hive13 dbus-daemon[1120]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.24561' (uid=0 pid=3417 comm="/usr/bin/hostnamectl " label="unconfined")
```
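
Since it is unclear when the crashes began, one option would be to scan the syslog history for the ld.so assertion and collect its timestamps. The following is only a minimal sketch, assuming every line has the same `<priority>Mon DD HH:MM:SS host process[pid]: message` layout as the excerpt above; the names `LINE_RE` and `find_ldso_failures` are hypothetical helpers, not part of any Hive or systemd tooling.

```python
import re

# Assumed layout, taken from the syslog excerpt:
#   <30>Aug  2 06:39:33 hive13 start-hiveserver2.sh[68691]: message
LINE_RE = re.compile(
    r"^<\d+>(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+(?P<proc>[^\[\s]+)\[(?P<pid>\d+)\]:\s(?P<msg>.*)$"
)


def find_ldso_failures(lines):
    """Return (timestamp, process, pid) for each ld.so assertion line."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and "Inconsistency detected by ld.so" in m.group("msg"):
            hits.append((m.group("ts"), m.group("proc"), int(m.group("pid"))))
    return hits


# Two lines copied from the excerpt above, used as sample input.
sample = [
    "<30>Aug  2 06:39:33 hive13 start-hiveserver2.sh[68691]: Inconsistency detected by ld.so: "
    "../elf/dl-tls.c: 481: _dl_allocate_tls_init: Assertion "
    "`listp->slotinfo[cnt].gen <= GL(dl_tls_generation)' failed!",
    "<29>Aug  2 06:39:34 hive13 systemd[1]: hive-server2.service: "
    "Main process exited, code=exited, status=127/n/a",
]

if __name__ == "__main__":
    for ts, proc, pid in find_ldso_failures(sample):
        print(ts, proc, pid)
```

Feeding it the lines of /var/log/syslog and its rotated copies instead of `sample` would list every occurrence, which could then be compared against the half-hourly Chef runs.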

Please advise.
