Error while using the gzip codec to compress reducer output in Hadoop MapReduce

Posted by z0qdvdin on 2021-06-03 in Hadoop


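I have not pasted the input format, output format, mapper, and reducer classes below. Here is my main function. I am running the following code on Hadoop 1.0.4. It worked fine until I tried to compress the reducer's output. I am pasting the compile error along with the code: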

public static void main(String[] args) throws Exception
{
    Configuration conf = new Configuration();

    conf.set("xmlinput.start", "<page>");
    conf.set("xmlinput.end", "</page>");
    Job job = new Job(conf);  //configure the job, submit it, control its execution, and query the state
    job.setJarByClass(XmlParser11.class); //set jar by finding where the class came from
    job.setOutputKeyClass(Text.class); //Set the key class for the job output data
    job.setOutputValueClass(Text.class);

    //job.setCompressMapOutput(true);
    //job.setMapOutputCompressorClass(GzipCodec.class);

    //job.setCompressOutput(job, true);
    //job.setClass("mapred.output.compression.codec", GzipCodec.class,CompressionCodec.class);
    job.setMapperClass(XmlParser11.Map.class);
    job.setReducerClass(XmlParser11.Reduce.class);

    job.setInputFormatClass(XmlInputFormat1.class);  //Set the InputFormat for the job
    job.setOutputFormatClass(TextOutputFormat.class); //Set the OutputFormat for the job
    FileOutputFormat.setCompressOutput(job,true);
    FileOutputFormat.setOutputCompressorClass(job,GzipCodec.class);
    FileInputFormat.addInputPath(job, new Path(args[0])); //the job for which the input path should be modified
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(true);       
}

[ravisg@topsail-sn ~]$ javac -classpath /var/hadoop/hadoop-core-1.0.4.jar -d stopWords/ XmlParser11.java
 XmlParser11.java:306: error: cannot find symbol
        FileOutputFormat.setOutputCompressorClass(job,GzipCodec.class);
                                                      ^
 symbol:   class GzipCodec
 location: class XmlParser11

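Can you tell me how to compress the reducer's output, or point out what I am doing wrong? I have tried the different compression approaches suggested on Stack Overflow, but I always run into a similar error.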

Java hadoop mapreduce GZIP compression

Source: https://stackoverflow.com/questions/19203444/error-while-using-gzip-codec-to-compress-output-from-reducer-in-hadoop-mapreduce

2 Answers

1qczuiv01#

When compiling the code, you need to add the hadoop-common*.jar from your Hadoop distribution to the classpath.

2021-06-03

idfiyjo82#

Sorry, I had to use

FileOutputFormat.setOutputCompressorClass(job, org.apache.hadoop.io.compress.GzipCodec.class);

instead of

FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
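
The fully qualified name works because the compiler no longer has to resolve the short name `GzipCodec` through an import. An equivalent approach is to import the class at the top of the file. The following is only a minimal sketch under some assumptions: it targets the `org.apache.hadoop.mapreduce` API shipped with Hadoop 1.0.4, `GzipOutputDriver` is a hypothetical class name, and the default identity mapper and reducer stand in for the asker's own classes (which were not posted). It needs the Hadoop jar on the classpath to compile.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.GzipCodec; // the import the short name GzipCodec depends on
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Hypothetical driver class, for illustration only.
public class GzipOutputDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf);
        job.setJarByClass(GzipOutputDriver.class);

        // No mapper or reducer is set, so the identity defaults are used;
        // with the default TextInputFormat they pass through
        // (LongWritable offset, Text line) pairs.
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        // Ask the output format to write gzip-compressed part files.
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With the import in place, the original line `FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);` should compile unchanged.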

2021-06-03
