How do I access S3 files from an on-prem Hadoop cluster?

Posted by 1sbrub3j on 2021-05-27 in Hadoop

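I have a Cloudera VM on which I was able to set up the AWS CLI and configure the keys. However, I cannot read or access S3 files with hadoop fs -ls s3://gft-ri or any other hadoop command. With the AWS CLI I can see the directories and files.
Command output: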

(base) [cloudera@quickstart conf]$ aws s3 ls s3://gft-risk-aml-market-dev/
                           PRE test/
2019-11-27 04:11:26        458 required

(base) [cloudera@quickstart conf]$ hdfs dfs -ls s3://gft-risk-aml-market-dev/
19/11/27 05:30:45 WARN fs.FileSystem: S3FileSystem is deprecated and will be removed in future releases. Use NativeS3FileSystem or S3AFileSystem instead.
ls: `s3://gft-risk-aml-market-dev/': No such file or directory

I have added the following settings to core-site.xml.

<property>
  <name>fs.s3.impl</name>
  <value>org.apache.hadoop.fs.s3.S3FileSystem</value>
</property>

<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>ANHS</value>
</property>

<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>EOo</value>
</property>

<property>
  <name>fs.s3.path.style.access</name>
  <value>true</value>
</property>

<property>
  <name>fs.s3.endpoint</name>
  <value>s3.us-east-1.amazonaws.com</value>
</property>

<property>
  <name>fs.s3.connection.ssl.enabled</name>
  <value>false</value>
</property>

Answer #1, by gk7wooem

Finally got it working. On Cloudera QuickStart v13, the core-site.xml below works.

<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>

<property>
  <name>fs.s3a.awsAccessKeyId</name>
  <value>AKIAxxxx</value>
</property>

<property>
  <name>fs.s3a.awsSecretAccessKey</name>
  <value>Xxxxxx</value>
</property>

<property>
  <name>fs.s3a.path.style.access</name>
  <value>true</value>
</property>

<property>
  <name>fs.AbstractFileSystem.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3A</value>
  <description>The implementation class of the S3A AbstractFileSystem.</description>
</property>

<property>
  <name>fs.s3a.endpoint</name>
  <value>s3.us-east-1.amazonaws.com</value>
</property>

<property>
  <name>fs.s3a.connection.ssl.enabled</name>
  <value>false</value>
</property>

<property>
  <name>fs.s3a.readahead.range</name>
  <value>64K</value>
  <description>Bytes to read ahead during a seek() before closing and
  re-opening the S3 HTTP connection. This option will be overridden if
  any call to setReadahead() is made to an open stream.</description>
</property>

<property>
  <name>fs.s3a.list.version</name>
  <value>2</value>
  <description>Select which version of the S3 SDK's List Objects API to use.
  Currently support 2 (default) and 1 (older API).</description>
</property>
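
Since this configuration registers the S3A connector, the bucket would be addressed with the s3a:// scheme rather than the s3:// scheme used in the question. The sketch below wraps such a listing in a small shell function. The name list_s3a is a hypothetical helper, and the credentials are assumed to be exported as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. They are passed with the generic -D option using fs.s3a.access.key and fs.s3a.secret.key, the property names documented for S3A in current Hadoop releases; whether a given QuickStart image accepts those names instead of the ones in the config above is an assumption to verify.

```shell
# Hypothetical helper: list a bucket through the S3A connector.
# Assumes AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are exported in the shell.
list_s3a() {
  local bucket="$1"
  # -D sets a Hadoop configuration property for this one command only.
  hadoop fs \
    -D "fs.s3a.access.key=${AWS_ACCESS_KEY_ID}" \
    -D "fs.s3a.secret.key=${AWS_SECRET_ACCESS_KEY}" \
    -ls "s3a://${bucket}/"
}
```

For the bucket in the question this would be called as list_s3a gft-risk-aml-market-dev. Values passed with -D are visible in the process list, so keeping them in core-site.xml may be preferable on a shared machine.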

Answer #2, by v8wbuo2f

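I would mount the S3 bucket from the Linux console and then move the files from there into HDFS. You may need to install s3fs on the Cloudera QuickStart VM first, using sudo to run as root, e.g. sudo yum install s3fs-fuse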
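
The mount-and-copy approach could look roughly like the following. This is a sketch under several assumptions: s3fs is already installed, S3_ACCESS_KEY and S3_SECRET_KEY are placeholder variable names for your credentials, and the default mount point and HDFS target directory are hypothetical paths. The function name copy_bucket_to_hdfs is likewise made up for illustration. Note that it overwrites any existing ~/.passwd-s3fs file.

```shell
# Hypothetical helper: mount a bucket with s3fs, then copy its contents into HDFS.
# Assumes s3fs is installed and that S3_ACCESS_KEY and S3_SECRET_KEY hold the credentials.
copy_bucket_to_hdfs() {
  local bucket="$1"
  local mount_point="${2:-${HOME}/s3mnt}"             # hypothetical local mount point
  local hdfs_target="${3:-/user/cloudera/s3-import}"  # hypothetical HDFS directory
  local passwd_file="${HOME}/.passwd-s3fs"

  # s3fs reads credentials as ACCESS_KEY:SECRET_KEY from a file with mode 600.
  echo "${S3_ACCESS_KEY}:${S3_SECRET_KEY}" > "${passwd_file}"
  chmod 600 "${passwd_file}"

  # Mount the bucket; stop here if the mount fails.
  mkdir -p "${mount_point}"
  s3fs "${bucket}" "${mount_point}" -o "passwd_file=${passwd_file}" || return 1

  # Copy everything under the mount into HDFS, then unmount.
  hdfs dfs -mkdir -p "${hdfs_target}"
  hdfs dfs -put "${mount_point}"/* "${hdfs_target}/"
  fusermount -u "${mount_point}"
}
```

For the bucket in the question, a call might be copy_bucket_to_hdfs gft-risk-aml-market-dev. Copying through a FUSE mount pulls every object through the local machine, so this is likely better suited to small or one-off transfers.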
