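I am using Hadoop to copy files between HDFS instances located on remote hosts. My problem is that the network between these hosts has very high latency (>1 second), and Hadoop sometimes throws the error java.net.NoRouteToHostException: No route to host.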
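I think this happens because of the latency. The hosts can be reached with ping, but only after some delay. Here is a ping example: at first it cannot reach the target host, but later it does.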
WorkGroup4-0:~# ping WorkGroup1-4
ping: unknown host WorkGroup1-4
WorkGroup4-0:~# ping WorkGroup1-1
PING WorkGroup1-1 (172.16.100.2) 56(84) bytes of data.
From WorkGroup4-0 (172.16.100.13) icmp_seq=1 Destination Host Unreachable
From WorkGroup4-0 (172.16.100.13) icmp_seq=2 Destination Host Unreachable
From WorkGroup4-0 (172.16.100.13) icmp_seq=3 Destination Host Unreachable
From WorkGroup4-0 (172.16.100.13) icmp_seq=4 Destination Host Unreachable
From WorkGroup4-0 (172.16.100.13) icmp_seq=5 Destination Host Unreachable
From WorkGroup4-0 (172.16.100.13) icmp_seq=6 Destination Host Unreachable
From WorkGroup4-0 (172.16.100.13) icmp_seq=7 Destination Host Unreachable
From WorkGroup4-0 (172.16.100.13) icmp_seq=8 Destination Host Unreachable
From WorkGroup4-0 (172.16.100.13) icmp_seq=9 Destination Host Unreachable
64 bytes from WorkGroup1-1 (172.16.100.2): icmp_req=12 ttl=64 time=1036 ms
64 bytes from WorkGroup1-1 (172.16.100.2): icmp_req=15 ttl=64 time=996 ms
^C
--- WorkGroup1-1 ping statistics ---
24 packets transmitted, 2 received, +9 errors, 91% packet loss, time 23134ms
rtt min/avg/max/mdev = 996.201/1016.462/1036.724/20.286 ms, pipe 3
Is there a way to configure the JVM for a high-latency network so that it keeps trying to connect to the remote host for longer?
1 Answer
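What a mess... but OK, here is a short checklist of settings to try:
dfs.client.failover.connection.retries.on.timeouts: default is 0; try a value between 2 and 5
dfs.client.failover.connection.retries: default is 0; try a value between 2 and 5
dfs.client.failover.max.attempts: default is 15; try a value greater than 15 and less than 50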
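As a rough sketch of how the three settings above could end up in a client-side hdfs-site.xml, the following script renders them as Hadoop-style property entries. The values 3, 3 and 30 are only illustrative picks from the suggested ranges, and the helper names (SETTINGS, build_configuration, render) are made up for this example; merge the output into your existing configuration rather than replacing it.

```python
import xml.etree.ElementTree as ET

# Illustrative values chosen from the ranges in the checklist (assumptions,
# not recommendations); tune them for your own network.
SETTINGS = {
    "dfs.client.failover.connection.retries.on.timeouts": "3",
    "dfs.client.failover.connection.retries": "3",
    "dfs.client.failover.max.attempts": "30",
}


def build_configuration(settings):
    """Build a <configuration> element with one <property> per setting."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return root


def render(settings):
    """Return the configuration as an XML string."""
    root = build_configuration(settings)
    if hasattr(ET, "indent"):  # pretty-printing is available on Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render(SETTINGS))
```

Run it and paste the printed property elements into the hdfs-site.xml used by the client that performs the copy.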
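If there is also latency inside the Hadoop cluster itself, consider the rack awareness feature and assign a unique rack ID to each node, which tells Hadoop that all nodes are remote from one another.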
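One way to give every node its own rack ID is a topology script. Hadoop calls such a script with one or more host names or IP addresses as arguments and reads one rack path per argument from standard output. The sketch below assumes Hadoop 2.x, where the script is registered through net.topology.script.file.name in core-site.xml; the "/rack-" prefix and the function name rack_for are arbitrary choices for this example.

```python
#!/usr/bin/env python3
import re
import sys


def rack_for(node):
    """Map a host name or IP address to a rack path unique to that node."""
    # Replace anything outside letters, digits, '_' and '-' so the result is
    # a single clean path component (dots in IP addresses become dashes).
    safe = re.sub(r"[^A-Za-z0-9_-]", "-", node)
    return "/rack-" + safe


def main(argv):
    # Hadoop may pass several nodes at once; print one rack per argument,
    # in the same order as the arguments.
    for node in argv:
        print(rack_for(node))


if __name__ == "__main__":
    main(sys.argv[1:])
```

For example, calling the script with WorkGroup1-1 and 172.16.100.13 prints /rack-WorkGroup1-1 and /rack-172-16-100-13, so no two nodes share a rack. The script must be executable by the user running the NameNode.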
For more information, see: http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml