I am using the following dependencies:
<properties>
    <javac-source.version>1.8</javac-source.version>
    <javac-target.version>1.8</javac-target.version>
    <hadoop.version>3.2.1</hadoop.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro</artifactId>
        <version>1.8.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.parquet</groupId>
        <artifactId>parquet-hadoop</artifactId>
        <version>1.8.1</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.parquet/parquet-avro -->
    <dependency>
        <groupId>org.apache.parquet</groupId>
        <artifactId>parquet-avro</artifactId>
        <version>1.8.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-azure-datalake</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-azure</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-aws</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-bundle</artifactId>
        <version>1.11.762</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.2.1</version>
    </dependency>
</dependencies>
I set the following configuration:
Configuration conf = new Configuration();
conf.set("fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
conf.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
conf.set("fs.s3a.access.key","key_id");
conf.set("fs.s3a.secret.key","key");
conf.set("fs.s3a.endpoint","s3.us-east-1.amazonaws.com");
conf.set("fs.s3a.aws.credentials.provider",org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider.NAME);
conf.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());
And I write to the following path:
new Path("s3a://bucket_name/" + filename)
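For context, the failing call (Main.main at Main.java:43 in the stack trace below, ending in ParquetWriter$Builder.build) is roughly the following. This is a minimal sketch assuming an Avro GenericRecord with a hypothetical one-field schema, not the exact code:

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;

public class Main {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // ... fs.s3a.* settings as shown above ...

        // Hypothetical schema; substitute the real one.
        Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Example\",\"fields\":[{\"name\":\"id\",\"type\":\"string\"}]}");

        Path path = new Path("s3a://bucket_name/example.parquet");

        // build() is where the UnsatisfiedLinkError surfaces: the S3A block output
        // stream buffers data in a local temp file before uploading to S3.
        try (ParquetWriter<GenericRecord> writer = AvroParquetWriter
                .<GenericRecord>builder(path)
                .withSchema(schema)
                .withConf(conf)
                .build()) {
            GenericRecord record = new GenericData.Record(schema);
            record.put("id", "1");
            writer.write(record);
        }
    }
}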
I also tried setting HADOOP_HOME, but it didn't help.
I always get the following error:
20/04/15 17:55:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/04/15 17:55:13 WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
20/04/15 17:55:13 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
20/04/15 17:55:13 INFO impl.MetricsSystemImpl: s3a-file-system metrics system started
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:645)
at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:1230)
at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:160)
at org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:100)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:77)
at org.apache.hadoop.util.BasicDiskValidator.checkStatus(BasicDiskValidator.java:32)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:331)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:394)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createTmpFileForWrite(LocalDirAllocator.java:477)
at org.apache.hadoop.fs.LocalDirAllocator.createTmpFileForWrite(LocalDirAllocator.java:213)
at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:589)
at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:811)
at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:190)
at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.<init>(S3ABlockOutputStream.java:168)
at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:822)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1118)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1098)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:987)
at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:223)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:266)
at org.apache.parquet.hadoop.ParquetWriter$Builder.build(ParquetWriter.java:489)
at Main.main(Main.java:43)
The same credentials worked with s3n on Hadoop 2.8.2, but as far as I know s3n is deprecated in 3.2.1.
1 Answer
When writing through Hadoop's s3a connector, the local file system is used to create temporary buffer files. To use that underlying file system, OS-specific native support must be available, which is exactly what the exception is telling you. Since this is Windows, you need winutils: download the winutils.exe binary and the other required files, and set the %HADOOP_HOME% environment variable to the directory where those binaries are installed (see the sketch below). Read more: Hadoop 2 on Windows problems.
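A minimal sketch of that setup, assuming the binaries were unpacked to C:\hadoop\bin (a hypothetical location). The hadoop.home.dir system property is the programmatic equivalent of HADOOP_HOME and must be set before the first Hadoop class is loaded:

// Option 1: environment variable, set before launching the JVM (Windows cmd):
//   set HADOOP_HOME=C:\hadoop
// with winutils.exe (and a matching hadoop.dll) in C:\hadoop\bin.

// Option 2: the equivalent system property, set from code:
public class Bootstrap {
    public static void main(String[] args) throws Exception {
        // Hypothetical path; point it at the directory that contains bin\winutils.exe.
        System.setProperty("hadoop.home.dir", "C:\\hadoop");
        Main.main(args);
    }
}

Note that this particular UnsatisfiedLinkError (NativeIO$Windows.access0) comes from the native hadoop.dll rather than from winutils.exe alone, so a hadoop.dll matching your Hadoop version should also sit in %HADOOP_HOME%\bin, with that directory on PATH (or the DLL placed somewhere on java.library.path).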