Hadoop returns the mapper's output instead of the reducer's

kulphzqa posted on 2021-06-02 in Hadoop

I am writing this Hadoop job, but I don't understand why it produces exactly the mapper's output instead of the reducer's. I have played with the code for a long time and tested different outputs, with no luck.
My custom Mapper:

public static class UserMapper extends Mapper<Object, Text, Text, Text> {
    private final static IntWritable one = new IntWritable(1);
    private Text userid = new Text();
    private Text catid = new Text();

    /* map method */
    public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString(), ","); /* separated by "," */
        int count = 0;

        userid.set(itr.nextToken());

        while (itr.hasMoreTokens()) {
            if (++count == 4) {
                // catid.set(itr.nextToken());
                catid.set("This is a test");
                context.write(userid, catid);
            } else {
                itr.nextToken();
            }
        }
    }
}

My custom Reducer:

/* Reducer Class */
public static class UserReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
        int sum = 0;
        for (Text val : values) {
            sum += 1; //val.get();
        }
        result.set(0);
        context.write(key, result);
    }
}

The main program body (driver):

Job job = new Job(conf, "User Popular Categories");
job.setJarByClass(popularCategories.class);
job.setMapperClass(UserMapper.class);
job.setCombinerClass(UserReducer.class);
job.setReducerClass(UserReducer.class);

job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);

job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.setNumReduceTasks(2);

FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));

System.exit(job.waitForCompletion(true) ? 0 : 1);

And the output file: /user/hduser/output/part-r-00000

0001be1731ee7d1c519bc7e87110c9eb880cb396    This is a test
0001bfa0c494c01f9f8c141c476c11bb4625a746    This is a test
0002bd9c3d654698bb514194c4f4171ad6992266    This is a test
000433e0ef411c2cb8ee1727002d6ba15fe9426b    This is a test
00051f5350f4d9f3f4f5ba181b0a66d749b161ee    This is a test
00066c85bf96469b905e2fb148095448797b2368    This is a test
0007b1a0334de785b3189b67bb73276d602fb7d4    This is a test
0007d018861d588e99e834fc29ca76a523b20e35    This is a test
000992b67ed22d2707ba65046d523ce66dfcfcb8    This is a test
000ad93a0819e2cbd7f0193e1d1ec481a0241b44    This is a test
Answer #1 (9rnv2umw):

I am still surprised that the code block above works for you at all; as in another Hadoop (Java) question about changing the type of the mapper output value, an exception should be thrown here.
It looks like the output comes from the mapper rather than the reducer. Are you sure about that file name?

/user/hduser/output/part-r-00000

instead of

/user/hduser/output/part-m-00000

(Reduce tasks write part-r-* files; a map-only job writes part-m-* files.) The mapper's output types must be the reducer's input types. When a reduce method does not actually override Reducer.reduce, Hadoop runs the base class's default identity implementation, which writes every incoming (key, value) pair through unchanged, so the job's output is exactly the mapper's output.

public static class UserMapper extends Mapper<Object, Text, Text, Text> {

writes the output key as Text and the output value as Text. Your Reducer is defined as

public static class UserReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

which says the input key is Text (correct), but the value is wrongly declared as IntWritable (it should be Text). Change the declaration to

public static class UserReducer extends Reducer<Text, Text, Text, IntWritable> {

and set the parameters in the driver program accordingly.
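For reference, here is a minimal sketch of the corrected Reducer and the matching driver change (an assumption on my part: the intent was to count values per key, since the original computes sum but then writes a constant 0):

/* Corrected Reducer: the input value type now matches the mapper's Text output */
public static class UserReducer extends Reducer<Text, Text, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    @Override   /* with the old mismatched signature, this annotation would not compile */
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (Text val : values) {
            sum += 1;       /* count one per value for this key */
        }
        result.set(sum);    /* write the count instead of a constant 0 */
        context.write(key, result);
    }
}

In the driver, also remove the combiner line: a combiner must emit the same types as the map output (Text, Text), but this Reducer now emits (Text, IntWritable), so it can no longer double as the combiner:

// job.setCombinerClass(UserReducer.class);   // remove: output types no longer match the map output

Adding @Override is a cheap safeguard in general: it turns this kind of silent signature mismatch into a compile-time error.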
