1) We are trying to use the s3distcp jar (http://docs.aws.amazon.com/elasticmapreduce/latest/developerguide/usingemr_s3distcp.html#emr-s3distcp-versions) to copy HDFS files from an AWS China Hadoop master instance to an AWS China S3 bucket.
2) We are executing the following command from the AWS China Hadoop master:
hadoop jar /usr/share/aws/emr/s3-dist-cp/lib/s3-dist-cp.jar --src hdfs://${HDFS_DIR} --dest s3n://${S3_BUCKETNAME}/${Folder_Name}/ --s3Endpoint=s3.cn-north-1.amazonaws.com.cn
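For reference, the shell variables in the command above might be populated like this (the paths, bucket name, and folder name below are hypothetical placeholders, not values from the original question; `echo` is prepended as a dry run so the assembled command can be inspected before execution):

```shell
#!/bin/sh
# Hypothetical values -- substitute your own HDFS path, bucket, and folder.
HDFS_DIR="/user/hadoop/output"
S3_BUCKETNAME="my-bucket"
Folder_Name="backup"

# Echo the fully-expanded command as a dry run; drop 'echo' to actually run it.
echo hadoop jar /usr/share/aws/emr/s3-dist-cp/lib/s3-dist-cp.jar \
  --src "hdfs://${HDFS_DIR}" \
  --dest "s3n://${S3_BUCKETNAME}/${Folder_Name}/" \
  --s3Endpoint=s3.cn-north-1.amazonaws.com.cn
```

Echoing first makes it easy to verify that every variable expanded as intended before the job is submitted.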
3) When we run this s3distcp command, the following exception is thrown:
16/02/22 08:39:52 INFO s3distcp.S3DistCp: Using output path 'hdfs:/tmp/f6a864f8-d70d-426f-b05f-08f7d0097fd9/output'
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/gson/internal/Pair
at com.amazon.elasticmapreduce.s3distcp.S3DistCp.getSrcPrefixes(S3DistCp.java:468)
at com.amazon.elasticmapreduce.s3distcp.S3DistCp.createInputFileList(S3DistCp.java:521)
at com.amazon.elasticmapreduce.s3distcp.S3DistCp.run(S3DistCp.java:850)
at com.amazon.elasticmapreduce.s3distcp.S3DistCp.run(S3DistCp.java:720)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at com.amazon.elasticmapreduce.s3distcp.Main.main(Main.java:22)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassNotFoundException: com.google.gson.internal.Pair
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 13 more
4) Also, could you let us know whether there is any alternative to s3-dist-cp for copying HDFS files from the AWS China Hadoop master instance to an AWS China S3 bucket?
Thanks and regards,
Amit
1 Answer
It seems there was some problem with the AWS China Hadoop EMR cluster we had created. We created a new AWS China Hadoop EMR cluster, and using the s3distcp command we can now upload HDFS files from the AWS China Hadoop master to the AWS China S3 bucket.
// Sample s3-dist-cp command
s3-dist-cp --src=hdfs:///hdfs_folder_name/ --dest=s3n://my-bucket/folder
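As for the alternative asked about in point 4: one possible option (a sketch under assumptions, not an AWS-documented recipe for this exact setup) is plain Hadoop DistCp over the s3a connector, whose documented `fs.s3a.endpoint` property can point at the China-region endpoint. The source path and bucket below are hypothetical placeholders, and the command is echoed as a dry run:

```shell
#!/bin/sh
# Sketch of an alternative to s3-dist-cp: plain Hadoop DistCp via s3a.
# Assumes a Hadoop build that ships the hadoop-aws (s3a) connector and that
# AWS credentials are already configured. All names are placeholders.
SRC="hdfs:///user/hadoop/output"   # hypothetical HDFS source directory
DEST="s3a://my-bucket/backup/"     # hypothetical S3 destination

# Echoed as a dry run; remove 'echo' to execute on a real cluster.
echo hadoop distcp \
  -Dfs.s3a.endpoint=s3.cn-north-1.amazonaws.com.cn \
  "$SRC" "$DEST"
```

Unlike s3-dist-cp, DistCp does not do the small-file aggregation EMR's tool provides, so it is mainly useful as a fallback when the s3-dist-cp jar itself is broken, as in the question above.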