Creating a Hadoop sequence file

Posted by kpbwa7wx on 2021-06-01 in Hadoop

I am trying to create a Hadoop sequence file.
The file is created in HDFS successfully, but when I try to read it I get a "not a SequenceFile" error. I also checked that the sequence file exists in HDFS.

Here is my source code, which writes a sequence file to HDFS and reads it back.

package us.qi.hdfs;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.ArrayFile;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SequenceFileText {
    public static void main(String args[]) throws IOException {

        /**Get Hadoop HDFS command and Hadoop Configuration*/
        HDFS_Configuration conf = new HDFS_Configuration();
        HDFS_Test hdfs = new HDFS_Test();

        String uri = "hdfs://slave02:9000/user/hadoop/test.seq";

        /**Get Configuration from HDFS_Configuration Object by using get_conf()*/
        Configuration config = conf.get_conf();

        SequenceFile.Writer writer = null;
        SequenceFile.Reader reader = null;

        try {
            Path path = new Path(uri);

            IntWritable key = new IntWritable();
            Text value = new Text();

            writer = SequenceFile.createWriter(config, SequenceFile.Writer.file(path), SequenceFile.Writer.keyClass(key.getClass()),
                    ArrayFile.Writer.valueClass(value.getClass()));
            reader = new SequenceFile.Reader(config, SequenceFile.Reader.file(path));

            writer.append(new IntWritable(11), new Text("test"));
            writer.append(new IntWritable(12), new Text("test2"));
            writer.close();

            while (reader.next(key, value)) {
                System.out.println(key + "\t" + value);
            }
            reader.close();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            IOUtils.closeStream(writer);
            IOUtils.closeStream(reader);
        }
    }
}

This is the error that occurs:
2018-09-17 17:15:34,267 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-09-17 17:15:38,870 INFO [main] compress.CodecPool (CodecPool.java:getCompressor(153)) - Got brand-new compressor [.deflate]
java.io.EOFException: hdfs://slave02:9000/user/hadoop/test.seq not a SequenceFile
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1933)
    at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1892)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1841)
    at us.qi.hdfs.SequenceFileText.main(SequenceFileText.java:36)

Answer #1 (5kgi1eie)

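That was my mistake. I changed part of the source code.
First, I check whether the file already exists in HDFS. If it does not, I create the writer object and write the records.
Once the writer has finished and is closed, I check the sequence file. After that check, I create the reader and can read the sequence file successfully.
Here is my code. Thanks!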

try {
            Path path = new Path(uri);

            IntWritable key = new IntWritable();
            Text value = new Text();

            /** First, check whether the file already exists.
             * If it does not exist in HDFS, create the writer and write the records.
             * Note: fs (a FileSystem) and logger are not shown in this snippet.
             */
            if (!fs.exists(path)) {
                writer = SequenceFile.createWriter(config, SequenceFile.Writer.file(path), SequenceFile.Writer.keyClass(key.getClass()),
                        SequenceFile.Writer.valueClass(value.getClass()));

                writer.append(new IntWritable(11), new Text("test"));
                writer.append(new IntWritable(12), new Text("test2"));
                writer.close();
            } else {
                logger.info(path + " already exists.");
            }

            /**Create a SequenceFile Reader object.*/
            reader = new SequenceFile.Reader(config, SequenceFile.Reader.file(path));

            while (reader.next(key, value)) {
                System.out.println(key + "\t" + value);
            }

            reader.close();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            IOUtils.closeStream(writer);
            IOUtils.closeStream(reader);
        }
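
The answer does not spell out why the first version failed. A likely explanation, consistent with the EOFException in the log, is ordering: the reader was constructed right after the writer was created, before the writer had been closed, so the reader saw a file with no complete header yet. The sketch below reproduces that ordering problem with plain java.io rather than Hadoop, so it runs without a cluster. The class name HeaderVisibilityDemo, the 3-byte magic header, and the temp file are illustrative assumptions, and HDFS buffering differs in detail from a BufferedOutputStream.

```java
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class HeaderVisibilityDemo {
    // Hypothetical magic bytes, standing in for the header a real sequence file starts with.
    static final byte[] MAGIC = {'S', 'E', 'Q'};

    /** Reads the header; readFully throws EOFException if the file is too short. */
    static void checkHeader(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            byte[] header = new byte[MAGIC.length];
            in.readFully(header);
            if (!Arrays.equals(header, MAGIC)) {
                throw new IOException(file + " has an unexpected header");
            }
        }
    }

    /** Returns true if a reader opened right now can read a valid header. */
    static boolean readable(Path file) {
        try {
            checkHeader(file);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Returns {readable while the writer is still open, readable after it is closed}. */
    static boolean[] demo() {
        try {
            Path file = Files.createTempFile("demo", ".seq");
            try {
                boolean before;
                try (DataOutputStream out = new DataOutputStream(
                        new BufferedOutputStream(Files.newOutputStream(file)))) {
                    out.write(MAGIC);
                    out.writeInt(11);
                    // The bytes are still in the writer's buffer, so the file on disk is empty.
                    before = readable(file);
                }
                // Closing the writer flushed the buffer; the header is now on disk.
                boolean after = readable(file);
                return new boolean[] {before, after};
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("readable before close: " + result[0]);
        System.out.println("readable after close: " + result[1]);
    }
}
```

Running main reports false for the read attempted while the writer is open and true for the read after it is closed, which mirrors the fix in the answer: finish and close the writer first, then construct the reader.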
