I have a Spark application that reads data from HDFS and ingests it into S3. Below are the versions of the components I am using.
spark:2.3.1 hadoop:2.7.3 scala:2.11.8
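I am using hadoop-aws-2.7.3.jar, hadoop-common-2.7.3.jar, and aws-java-sdk-1.7.4.jar. I followed several Hadoop-related blog posts and also consulted the mavenrepository site to find the right combination of jars.
Here is my code for uploading the file to S3: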
spark.sparkContext.hadoopConfiguration.set("fs.s3a.access.key", "<access_key>")
spark.sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", "<secret_key>")
spark.sparkContext.hadoopConfiguration.set("fs.s3a.impl",
"org.apache.hadoop.fs.s3a.S3AFileSystem")
spark.sparkContext.hadoopConfiguration.set("fs.s3a.endpoint", "<access_endpoint>")
spark.sparkContext.hadoopConfiguration.set("fs.s3a.path.style.access", "true")
val wikipediaDataitems = spark.read.json("<some_json_file_in_hdfs>")
wikipediaDataitems.write.format("json").save("s3a://<bucket_name>/wikipedia.json")
Below is the error I get:
Caused by: java.lang.IllegalAccessError: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.<init>(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation
at org.apache.hadoop.fs.s3a.S3AInstrumentation.streamCounter(S3AInstrumentation.java:163)
at org.apache.hadoop.fs.s3a.S3AInstrumentation.streamCounter(S3AInstrumentation.java:185)
at org.apache.hadoop.fs.s3a.S3AInstrumentation.<init>(S3AInstrumentation.java:112)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:146)
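I have come across many Stack Overflow questions from people who hit the same problem, and I have tried different combinations of the hadoop-aws, hadoop-common, and AWS SDK jars, with no luck so far.
The combinations I have tried so far, each followed by the error it produced:
hadoop-aws-2.7.3.jar, hadoop-common-2.7.3.jar, aws-java-sdk-1.10.6.jar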
org.apache.spark.sql.execution.datasources.DataSource.planForWritingFileFormat(DataSource.scala:452)
org.apache.spark.sql.execution.datasources.DataSource.planForWriting(DataSource.scala:548)
org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:278)
org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:267)
org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:225)
... 49 elided
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
hadoop-aws-2.8.2.jar, hadoop-common-2.8.2.jar, aws-java-sdk-1.10.6.jar
java.lang.NoClassDefFoundError: com/amazonaws/AmazonClientException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
hadoop-aws-2.7.3.jar, hadoop-common-2.7.3.jar, aws-java-sdk-1.11.123.jar
Caused by: java.lang.ClassNotFoundException: com.amazonaws.event.ProgressListener
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 66 more
hadoop-aws-2.7.7.jar, hadoop-aws-2.7.7.jar, and aws-java-sdk-1.7.4.jar
Caused by: java.lang.IllegalAccessError: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.<init>(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation
at org.apache.hadoop.fs.s3a.S3AInstrumentation.streamCounter(S3AInstrumentation.java:163)
at org.apache.hadoop.fs.s3a.S3AInstrumentation.streamCounter(S3AInstrumentation.java:185)
at org.apache.hadoop.fs.s3a.S3AInstrumentation.<init>(S3AInstrumentation.java:112)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:146)
Can anyone help me with this?
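Errors such as IllegalAccessError and NoClassDefFoundError often point to classes from different library versions being mixed on the classpath, so one way to narrow this down may be to check which jar each class is actually loaded from. The following is a minimal diagnostic sketch, not a fix. `ClasspathProbe` and `locationOf` are hypothetical names introduced here, the code uses only JDK reflection calls, and it assumes it runs with the same classpath as the failing job (for example, pasted into the same spark-shell session).

```scala
// Hypothetical helper (not from the original post): reports where a class was loaded from.
object ClasspathProbe {

  /** Returns the jar or directory a class was loaded from, or a short note if unavailable. */
  def locationOf(className: String): String =
    try {
      // initialize = false: locate the class without running its static initializers
      val loader = Thread.currentThread().getContextClassLoader
      val cls = Class.forName(className, false, loader)
      // getCodeSource is null for classes loaded by the bootstrap class loader
      Option(cls.getProtectionDomain.getCodeSource)
        .flatMap(cs => Option(cs.getLocation))
        .map(_.toString)
        .getOrElse("(bootstrap or unknown location)")
    } catch {
      case _: ClassNotFoundException => "(not on classpath)"
      case e: LinkageError           => s"(failed to load: $e)"
    }

  def main(args: Array[String]): Unit = {
    // Class names taken from the stack traces above
    val names = Seq(
      "org.apache.hadoop.fs.s3a.S3AFileSystem",
      "org.apache.hadoop.fs.s3a.S3AInstrumentation",
      "org.apache.hadoop.metrics2.lib.MutableCounterLong",
      "com.amazonaws.AmazonClientException",
      "com.amazonaws.event.ProgressListener"
    )
    names.foreach(n => println(s"$n -> ${locationOf(n)}"))
  }
}
```

In spark-shell, `ClasspathProbe.main(Array.empty)` prints one line per class. If the S3A classes and `MutableCounterLong` resolve to jars with different Hadoop version numbers, that mismatch would be the first thing to investigate. `org.apache.hadoop.util.VersionInfo.getVersion` can also be called in the same session to see which hadoop-common version Spark is actually running against.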