It was working fine before; I don't know what changed, but this error suddenly appeared.
HDFS runs on a Docker cluster (1 RM + 2 nodes).
Everything works fine from inside the containers, so the datanodes themselves are not the problem.
The error is thrown when copying a file from the host into HDFS, either through code or with the hdfs command.
The error stack below is from hadoop-root-namenode-master.log.
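For reference, an equivalent shell copy from the host that hits the same error might look like the line below; the local jar path and HDFS target path are assumptions taken from the code and error message further down.
hdfs dfs -put jars/spark-app.jar /spark/apps/spark-app.jar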
2017-10-26 09:18:05,102 WARN org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 2 but only 0 storage types can be selected (replication=2, selected=[], unavailable=[DISK], removed=[DISK, DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
2017-10-26 09:18:05,102 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 2 to reach 2 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable: unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
2017-10-26 09:18:05,103 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 192.168.1.6:52435 Call#7 Retry#0
java.io.IOException: File /spark/apps/spark-app.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and 2 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1571)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3107)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3031)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:725)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
The source code is:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Act as the HDFS superuser, load the cluster config, and copy the local jar into HDFS.
System.setProperty("HADOOP_USER_NAME", "root");
Configuration conf = new Configuration();
conf.addResource(new Path("src/main/resources/core-site.xml"));
FileSystem fs = FileSystem.get(conf);
fs.copyFromLocalFile(new Path("jars/spark-app.jar"), new Path("/spark/apps/spark-app.jar"));
1 Answer
I spent a lot of time searching around, but after switching the log4j level to DEBUG I found the problem quickly. The logs showed the client connecting to "172.20.0.3", which is the datanode's IP inside the Docker network and is not reachable from the host. I then added a route to 172.20/16 (route add -p ...), and now it works fine.
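As a sketch of the debugging step, assuming the client picks up a standard log4j.properties, raising the HDFS client logger to DEBUG could be done with a line like:
log4j.logger.org.apache.hadoop.hdfs=DEBUG
And the route itself, assuming a Windows host (the -p flag makes the route persistent there) and with <docker-host-ip> as a placeholder for whatever address forwards traffic into the Docker bridge network, might look like:
route add -p 172.20.0.0 mask 255.255.0.0 <docker-host-ip>
The 172.20.0.0 network with mask 255.255.0.0 corresponds to the 172.20/16 range mentioned above; the exact gateway depends on the Docker network setup and is not given in the answer.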