Reducers left at the default, but I end up with two output files

Posted by mefy6pfw on 2021-05-29 in Hadoop


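I am running a MapReduce job with the number of reducers left at the default (one reducer). In theory there should be one output file per reducer, but when I run the job I get two files:
part-r-00000
and
part-r-00001
Why is this happening?
My cluster has only one node.
My driver class: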

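import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class DriverDate extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.printf("Usage: AvgWordLength inputDir outputDir\n");
            return -1;
        }

        // The number of reduce tasks is not set here, so it comes from the configuration.
        Job job = Job.getInstance(getConf());
        job.setJobName("Job transformacio dates");

        job.setJarByClass(DriverDate.class);
        job.setMapperClass(MapDate.class);
        job.setReducerClass(ReduceDate.class);

        // Key/value types emitted by the mapper.
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(NullWritable.class);

        // Key/value types written by the reducer.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Report job failure through the exit code.
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        System.exit(ToolRunner.run(conf, new DriverDate(), args));
    }
}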

hadoop mapreduce

Source: https://stackoverflow.com/questions/31513510/set-reducers-to-default-but-finally-i-have-two-files

2 Answers


bjp0bcyl1#

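This code should produce a single output file, and you are right to expect that: the default number of reduce tasks is 1, and each reducer writes one output file.
However, things that can go wrong include (but are not limited to):
Make sure you are running the correct jar and that you update the correct jar when you build it. Make sure you copy the correct jar from the machine where you build it to the master node of your (single-node) cluster. For example, your usage message says Usage: AvgWordLength inputDir outputDir, but this jar is unlikely to be named AvgWordLength...
Make sure you are not specifying a different number of reducers on the command line (for example, with a -D property).
Other than that, I cannot think of another possible cause...
The number of nodes in the cluster is irrelevant.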
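
To make the command-line check above concrete, here is a minimal sketch that scans a job's arguments for a -D override of the reducer count. The class ReducerOverrideCheck and its method findOverride are hypothetical helpers written for this illustration, not part of Hadoop. The sketch assumes that the property is given either as "-D key=value" or "-Dkey=value", and that the relevant keys are mapreduce.job.reduces and its older alias mapred.reduce.tasks.

```java
import java.util.OptionalInt;

/** Hypothetical helper: looks for a -D override of the reducer count in the job's arguments. */
final class ReducerOverrideCheck {

    // Current property name and its older MR1 alias.
    private static final String[] KEYS = {"mapreduce.job.reduces", "mapred.reduce.tasks"};

    /** Returns the reducer count requested via -D, or empty if none is present. */
    static OptionalInt findOverride(String[] args) {
        OptionalInt found = OptionalInt.empty();
        for (int i = 0; i < args.length; i++) {
            String kv = null;
            if (args[i].equals("-D") && i + 1 < args.length) {
                kv = args[++i];                 // form: -D key=value
            } else if (args[i].startsWith("-D")) {
                kv = args[i].substring(2);      // form: -Dkey=value
            }
            if (kv == null) {
                continue;
            }
            for (String key : KEYS) {
                if (kv.startsWith(key + "=")) {
                    // If the property appears more than once, the last occurrence wins.
                    String value = kv.substring(key.length() + 1).trim();
                    found = OptionalInt.of(Integer.parseInt(value));
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        OptionalInt n = findOverride(args);
        System.out.println(n.isPresent()
                ? "Reducer count overridden on the command line: " + n.getAsInt()
                : "No -D override of the reducer count found");
    }
}
```

Running it with the same arguments you pass to the job shows whether the command line itself is asking for more than one reducer.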

2021-05-30

yx2lnoni2#

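OK, I found the answer.
In Cloudera Manager, the YARN (MR2) configuration has a default value for the number of reduce tasks per job. On a single-node cluster it is set to 2, so the default number of reducers is 2.
There are two ways to fix this. Either set the number of reducers explicitly to 1 in Java:
job.setNumReduceTasks(1);
or change the default number of reducers in the YARN configuration in Cloudera Manager.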
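
The following sketch is a simplified model of why a cluster-level default leads to two files; it is not Hadoop code. The class ReducerCountModel and its methods are hypothetical, and the model assumes a simple precedence: a value set in the driver wins over a command-line value, which wins over the cluster configuration, which wins over the built-in default of 1. It also assumes one output file per reduce task, named part-r- followed by a five-digit task index, as in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Simplified model (not Hadoop code) of reducer-count resolution and output file naming. */
final class ReducerCountModel {

    static final int BUILT_IN_DEFAULT = 1; // the stock default when nothing else is configured

    /** The most specific setting wins; null means "not set at this layer". */
    static int effectiveReducers(Integer clusterDefault, Integer commandLine, Integer setInCode) {
        if (setInCode != null) {
            return setInCode;
        }
        if (commandLine != null) {
            return commandLine;
        }
        if (clusterDefault != null) {
            return clusterDefault;
        }
        return BUILT_IN_DEFAULT;
    }

    /** One output file per reduce task, numbered from zero. */
    static List<String> outputFiles(int reducers) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < reducers; i++) {
            names.add(String.format(Locale.ROOT, "part-r-%05d", i));
        }
        return names;
    }

    public static void main(String[] args) {
        // Cluster default of 2, nothing set by the job: two files.
        System.out.println(outputFiles(effectiveReducers(2, null, null)));
        // Same cluster, but the driver sets the count to 1: one file.
        System.out.println(outputFiles(effectiveReducers(2, null, 1)));
    }
}
```

With a cluster default of 2 and no explicit setting, the model lists part-r-00000 and part-r-00001. Setting the count to 1 in the driver leaves only part-r-00000.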

2021-05-30
