HDFS nodes OOM: too many files?

Posted by c2e8gylq on 2021-06-02 in Hadoop


We have an HDFS cluster with five nodes. When writing new files to the file system, we frequently get a "not enough replicas" error or the following:

2016-05-29 13:30:03,972 [Thread-486536] INFO  org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream
java.io.IOException: Got error, status message , ack with firstBadLink as 10.100.1.22:50010
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:142) ~[hadoop-hdfs-2.7.1.jar!/:na]
...
2016-05-29 13:30:03,972 [Thread-486536] INFO  org.apache.hadoop.hdfs.DFSClient - Abandoning BP-1195099512-10.100.1.21-1454597789659:blk_1085523876_11792285
2016-05-29 13:30:03,977 [Thread-486536] INFO  org.apache.hadoop.hdfs.DFSClient - Excluding datanode DatanodeInfoWithStorage[10.100.1.22:50010,DS-2f34af8d-234a-4036-a810-908c3b2bd9cf,DISK]
2016-05-29 13:30:04,003 [pool-1272-thread-3] WARN  org.apache.hadoop.hdfs.DFSClient - Slow waitForAckedSeqno took 65098ms (threshold=30000ms)

We also see a lot of the following warnings, which seem to appear whenever a major GC is running.

[pool-9-thread-23] WARN  org.apache.hadoop.hdfs.DFSClient - Slow waitForAckedSeqno took 34607ms (threshold=30000ms)
[pool-9-thread-30] WARN  org.apache.hadoop.hdfs.DFSClient - Slow waitForAckedSeqno took 34339ms (threshold=30000ms)
[pool-9-thread-5] WARN  org.apache.hadoop.hdfs.DFSClient - Slow waitForAckedSeqno took 34593ms (threshold=30000ms)
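
To check whether these stalls really line up with GC pauses, it can help to pull the durations out of the client log and compare them with the GC log timestamps. Below is a minimal sketch, assuming the warning format shown above; the function name `slow_ack_stalls` and the `sample` list are illustrative and not part of Hadoop.

```python
import re

# Matches DFSClient warnings of the form shown above, e.g.
#   ... Slow waitForAckedSeqno took 34607ms (threshold=30000ms)
SLOW_ACK = re.compile(r"Slow waitForAckedSeqno took (\d+)ms \(threshold=(\d+)ms\)")


def slow_ack_stalls(lines):
    """Return (took_ms, threshold_ms) for every slow-ack warning in lines."""
    stalls = []
    for line in lines:
        match = SLOW_ACK.search(line)
        if match:
            stalls.append((int(match.group(1)), int(match.group(2))))
    return stalls


# The three warnings quoted in the question (hypothetical in-memory sample;
# in practice the lines would be read from the client log file).
sample = [
    "[pool-9-thread-23] WARN  org.apache.hadoop.hdfs.DFSClient - Slow waitForAckedSeqno took 34607ms (threshold=30000ms)",
    "[pool-9-thread-30] WARN  org.apache.hadoop.hdfs.DFSClient - Slow waitForAckedSeqno took 34339ms (threshold=30000ms)",
    "[pool-9-thread-5] WARN  org.apache.hadoop.hdfs.DFSClient - Slow waitForAckedSeqno took 34593ms (threshold=30000ms)",
]

for took, threshold in slow_ack_stalls(sample):
    print(f"stalled {took} ms ({took - threshold} ms over threshold)")
```

Several threads stalling for nearly the same duration at once, as in the sample, would fit a single stop-the-world pause on one node, though that is only an inference from the log.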

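The file system contains 6.5 million small files (4-20 KB each), and when we write new files the nodes go down with OOM errors. New files are always written in batches, and a single batch can contain several hundred thousand files.
The nodes currently have plenty of RAM allocated: 4 GB for the NameNode and 3 GB for each DataNode.
Is this really expected behavior? Why do the nodes consume so much memory?
I would like to increase the number of nodes and see whether we can run with tighter memory settings, for example 1024 MB. Is that possible?
Edit: we see a lot of GC activity, and the nodes become unresponsive while GC is running.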
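
For a rough sense of scale, the numbers in the question can be plugged into a back-of-the-envelope estimate. This is only a sketch under stated assumptions: the figure of about 150 bytes of NameNode heap per namespace object (file or block) is a commonly quoted rule of thumb rather than a measured value, the replication factor of 3 is the HDFS default and is not stated in the question, and each small file is assumed to occupy exactly one block.

```python
BYTES_PER_OBJECT = 150  # rule-of-thumb assumption, not a measurement
REPLICATION = 3         # HDFS default; assumed, not stated in the question


def namenode_heap_bytes(num_files, blocks_per_file=1,
                        bytes_per_object=BYTES_PER_OBJECT):
    """Estimate NameNode heap used by file and block objects (directories ignored)."""
    objects = num_files * (1 + blocks_per_file)  # one file object plus its blocks
    return objects * bytes_per_object


def replicas_per_datanode(num_files, datanodes, replication=REPLICATION,
                          blocks_per_file=1):
    """Average number of block replicas each DataNode has to track."""
    return num_files * blocks_per_file * replication // datanodes


files = 6_500_000
print(f"NameNode namespace estimate: {namenode_heap_bytes(files) / 1e9:.2f} GB")
print(f"Replicas per DataNode: {replicas_per_datanode(files, 5):,}")
# prints:
# NameNode namespace estimate: 1.95 GB
# Replicas per DataNode: 3,900,000
```

If these assumptions hold, the live namespace alone would take roughly half of a 4 GB NameNode heap before any batch is written, and a 1024 MB heap would fall below the estimate. Real usage depends on the Hadoop version and JVM settings, so a heap dump or GC log is the more reliable source.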

hadoop hdfs

Source: https://stackoverflow.com/questions/37522365/hdfs-nodes-oom-too-many-files