I am trying out some simple Java code that copies a file from the local file system to HDFS. My method looks like this:
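import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

private static void copyFileToHDFS() throws IOException
{
    // Assumption: the original snippet used a `config` declared elsewhere; it is created here so the method stands alone
    Configuration config = new Configuration();
    config.set("fs.defaultFS", "hdfs://127.0.0.1:9000");
    FileSystem hdfs = FileSystem.get(config);
    Path localfsSourceDir = new Path("D:\\file1");
    Path hdfsTargetDir = new Path("hdfs://127.0.0.1:9000/dir/");
    hdfs.copyFromLocalFile(localfsSourceDir, hdfsTargetDir); //throws Exception
}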
The last line throws the following exception:
Exception in thread "main" java.io.IOException: Failed on local exception: java.io.IOException: An established connection was aborted by the software in your host machine; Host Details : local host is: "01hw713648/10.163.5.139"; destination host is: "127.0.0.1":9000;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773)
at org.apache.hadoop.ipc.Client.call(Client.java:1479)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at $Proxy9.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at $Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:496)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:348)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1965)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1933)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1898)
at HBaseImportTsvBulkLoader.copyFileToHDFS(HBaseImportTsvBulkLoader.java:64)
at HBaseImportTsvBulkLoader.main(HBaseImportTsvBulkLoader.java:37)
Caused by: java.io.IOException: An established connection was aborted by the software in your host machine
at sun.nio.ch.SocketDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(Unknown Source)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.read(Unknown Source)
at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57)
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at java.io.FilterInputStream.read(Unknown Source)
at java.io.FilterInputStream.read(Unknown Source)
at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:520)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at java.io.DataInputStream.readInt(Unknown Source)
at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1084)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:979)
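My setup
I am running a Hadoop cluster in an Ubuntu VM on VirtualBox (VirtualBox itself runs on Windows). The cluster runs fine. I am running this Java code on Windows. I have set up the following port forwarding rules in VirtualBox: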
Name                  Protocol  Host-ip       Host-port  Guest-Ip    Guest-port
datanode              tcp       <left empty>  50075      <guest-ip>  50075
dfs web ui            tcp       <left empty>  50070      <guest-ip>  50070
mapred apps           tcp       <left empty>  8088       <guest-ip>  8088
hbase web ui          tcp       <left empty>  16010      <guest-ip>  16010
hdfs                  tcp       <left empty>  9000       <guest-ip>  9000
regionserver web ui   tcp       <left empty>  16301      <guest-ip>  16301
ssh                   tcp       <left empty>  22         <guest-ip>  22
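These rules let me connect to various services on the VM:
I can connect to the VM using PuTTY
I can also open the various Hadoop web UIs in a browser on Windows: NameNode web UI, HMaster web UI, RegionServer web UI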
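A quick way to check the forwarded ports from the Windows side is a plain TCP probe. The following is only a minimal sketch using the standard java.net classes, not anything Hadoop-specific; the class name PortProbe, the method isReachable, and the 2000 ms timeout are made up for illustration, and the port list is taken from the forwarding table above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortProbe {
    // Returns true if a TCP connection to host:port is accepted within the timeout
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or otherwise failed to connect
            return false;
        }
    }

    public static void main(String[] args) {
        // Host ports from the forwarding rules in the question
        int[] forwardedPorts = {22, 8088, 9000, 16010, 16301, 50070, 50075};
        for (int port : forwardedPorts) {
            System.out.println(port + " -> " + isReachable("127.0.0.1", port, 2000));
        }
    }
}
```

Note that a successful connect only shows that something on the host side accepted the connection. With port forwarding, the connection may still be dropped afterwards if the service inside the guest is not listening on an address the forwarder can reach, so a `true` here does not prove that the HDFS RPC call itself will succeed.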
Update
Some online solutions for similar errors (though not in a Hadoop context) suggested restarting Eclipse, so I tried that. Now I get a slightly different error:
Exception in thread "main" java.io.IOException: Failed on local exception: java.io.IOException: An existing connection was forcibly closed by the remote host; Host Details : local host is: "01hw713648/10.163.5.139"; destination host is: "127.0.0.1":9000;
with exactly the same stack trace.
2 Answers
zu0ti5jz 1#
Solved it. I had specified localhost in all the Hadoop xyz-site.xml files. I changed them all to <guest-vm-ip>.
jtjikinw 2#
I recently ran into exactly this error when the port I specified was wrong: I had specified 50070 instead of 9000.
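A mix-up like this can be caught before the client ever connects by sanity-checking the fs.defaultFS value. The following is just a rough sketch with the standard java.net.URI class; DefaultFsCheck and warnings are hypothetical names, and the expected RPC port (9000 here, as in the question) is something you pass in yourself, since it depends on how the NameNode is configured.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

class DefaultFsCheck {
    // Collects human-readable warnings about an fs.defaultFS value
    static List<String> warnings(String defaultFs, int expectedRpcPort) {
        List<String> out = new ArrayList<>();
        URI uri = URI.create(defaultFs);
        if (!"hdfs".equals(uri.getScheme())) {
            out.add("scheme is not hdfs: " + uri.getScheme());
        }
        if (uri.getPort() == -1) {
            out.add("no port given");
        } else if (uri.getPort() != expectedRpcPort) {
            out.add("port " + uri.getPort() + " differs from the expected RPC port " + expectedRpcPort);
        }
        return out;
    }

    public static void main(String[] args) {
        // 50070 is the web UI port in the question's setup, not the RPC port
        System.out.println(warnings("hdfs://127.0.0.1:50070", 9000));
        System.out.println(warnings("hdfs://127.0.0.1:9000", 9000));
    }
}
```

This only compares the URI against a port you already believe is right; it cannot discover the actual NameNode RPC port on its own.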
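I used a host-only adapter, so no port forwarding was needed. I connect from the Windows host to 192.168.x.x in the Linux VM. I also made sure the NameNode listens on 0.0.0.0:9000 in core-site.xml. The firewall and SELinux are disabled in the VM. My setup is working.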
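As for why the listen address might matter: a server socket bound to a loopback address only accepts connections that arrive on the loopback interface, while one bound to 0.0.0.0 accepts connections on every interface. That is a general property of TCP sockets rather than anything confirmed about the Hadoop internals here. The sketch below illustrates the difference with plain java.net sockets on an ephemeral port; BindAddressDemo and describeBinding are names invented for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

class BindAddressDemo {
    // Binds a server socket to host on an ephemeral port and reports which interfaces it covers
    static String describeBinding(String host) {
        try (ServerSocket server = new ServerSocket()) {
            server.bind(new InetSocketAddress(host, 0));
            InetAddress bound = server.getInetAddress();
            if (bound.isAnyLocalAddress()) {
                return "all interfaces";
            }
            if (bound.isLoopbackAddress()) {
                return "loopback only";
            }
            return "single interface " + bound.getHostAddress();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("127.0.0.1 -> " + describeBinding("127.0.0.1"));
        System.out.println("0.0.0.0   -> " + describeBinding("0.0.0.0"));
    }
}
```

If the NameNode ends up in the "loopback only" case, a client outside the VM would not be able to reach it on port 9000 even though the web UIs on other ports respond, which would be consistent with both answers in this thread.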