Cloudera Manager health issues: NameNode connectivity, web server status

ojsjcaue posted on 2021-05-29 in Hadoop

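Below is a snapshot of the health issues reported by Cloudera Manager (CM). The DataNodes in the list keep changing. Some errors from the DataNode logs: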

3:59:31.859 PM  ERROR   org.apache.hadoop.hdfs.server.datanode.DataNode 
    datanode05.hadoop.com:50010:DataXceiver error processing WRITE_BLOCK operation  src: /10.248.200.113:45252 dest: /10.248.200.105:50010
    java.io.IOException: Premature EOF from inputStream
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:414)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:635)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:564)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:103)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:67)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
        at java.lang.Thread.run(Thread.java:662)
5:46:03.606 PM  INFO    org.apache.hadoop.hdfs.server.datanode.DataNode 
    Exception for BP-846315089-10.248.200.4-1369774276029:blk_-780307518048042460_200374997
    java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.248.200.105:50010 remote=/10.248.200.122:43572]
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:165)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:156)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        at java.io.DataInputStream.read(DataInputStream.java:132)
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:414)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:635)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:564)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:103)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:67)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
        at java.lang.Thread.run(Thread.java:662)

Snapshot:

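I can't identify the root cause. I can connect manually from one DataNode to another without any problem, so I don't believe this is a network issue. In addition, the missing-block and under-replicated-block counts keep changing (both up and down).
Cloudera Manager: Cloudera Standard 4.8.1
CDH 4.7
Any help resolving this issue is much appreciated.
Update: January 1, 2016
For the DataNodes listed as bad, when I look at the DataNode logs I see this message very frequently: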

11:58:30.066 AM INFO    org.apache.hadoop.hdfs.server.datanode.DataNode 
Receiving BP-846315089-10.248.200.4-1369774276029:blk_-706861374092956879_36606459 src: /10.248.200.123:56795 dest: /10.248.200.112:50010

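Why is this DataNode receiving so many blocks from other DataNodes at the same time? It seems that, because of this activity, the DataNode cannot respond to NameNode requests in time and therefore times out. All of the bad DataNodes show the same pattern.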
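One way to check whether a bad DataNode really is receiving an unusual number of blocks is to count the "Receiving" lines per source address in its log. The following is a minimal sketch, not a Cloudera or Hadoop tool: the function name `count_receives_by_source` and the log path in the usage comment are assumptions for illustration, and the regular expression only assumes the `Receiving <block> src: /ip:port dest: /ip:port` shape seen in the log line above.

```python
import re
import sys
from collections import Counter

# Matches lines shaped like the "Receiving ... src: /ip:port dest: /ip:port"
# message quoted above. Other log formats would need a different pattern.
RECEIVING = re.compile(
    r"Receiving (?P<block>BP-\S+) src: /(?P<src>[\d.]+):\d+ dest: /(?P<dest>[\d.]+):\d+"
)


def count_receives_by_source(lines):
    """Return a Counter mapping source IP to the number of 'Receiving' lines."""
    counts = Counter()
    for line in lines:
        match = RECEIVING.search(line)
        if match:
            counts[match.group("src")] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python count_receives.py /path/to/datanode.log
    with open(sys.argv[1], errors="replace") as log:
        for src, n in count_receives_by_source(log).most_common(10):
            print(f"{src}\t{n}")
```

Running this against the logs of a healthy DataNode and a bad one would show whether the bad nodes are singled out by a few sources or are simply busier overall. It only measures the pattern and does not by itself explain why the blocks are being sent.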

2hh7jdfx #1

A similar question has already been answered:
"HDFS DataNode disconnected from NameNode".
Please check the firewall. Use

telnet ipaddress port

to check connectivity.
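The same check can be scripted so that it is easy to repeat across many nodes. This is a minimal sketch using Python's standard `socket` module; the helper name `can_connect` is an assumption for illustration. Port 50010 is the DataNode port that appears in the logs above; any other host or port you pass in depends on your own cluster configuration.

```python
import socket
import sys


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical usage: python check_port.py 10.248.200.105 50010
    host, port = sys.argv[1], int(sys.argv[2])
    print(f"{host}:{port} reachable: {can_connect(host, port)}")
```

Like telnet, this only shows that the port accepts a connection at that moment. A successful result does not rule out timeouts that happen while the DataNode is under heavy load.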
