I am writing this Hadoop code, but I don't understand why it produces no reducer output and instead emits exactly the mapper's results. I have been playing with the code for a long time, testing different outputs, but with no luck.
My custom Mapper:
public static class UserMapper extends Mapper<Object, Text, Text, Text> {
    private final static IntWritable one = new IntWritable(1);
    private Text userid = new Text();
    private Text catid = new Text();

    /* map method */
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString(), ","); /* separated by "," */
        int count = 0;
        userid.set(itr.nextToken());
        while (itr.hasMoreTokens()) {
            if (++count == 4) {
                // catid.set(itr.nextToken());
                catid.set("This is a test");
                context.write(userid, catid);
            } else {
                itr.nextToken();
            }
        }
    }
}
My custom Reducer:
/* Reducer Class */
public static class UserReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (Text val : values) {
            sum += 1; //val.get();
        }
        result.set(0);
        context.write(key, result);
    }
}
The main program (driver) body:
Job job = new Job(conf, "User Popular Categories");
job.setJarByClass(popularCategories.class);
job.setMapperClass(UserMapper.class);
job.setCombinerClass(UserReducer.class);
job.setReducerClass(UserReducer.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.setNumReduceTasks(2);
FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
And the output file /user/hduser/output/part-r-00000:
0001be1731ee7d1c519bc7e87110c9eb880cb396 This is a test
0001bfa0c494c01f9f8c141c476c11bb4625a746 This is a test
0002bd9c3d654698bb514194c4f4171ad6992266 This is a test
000433e0ef411c2cb8ee1727002d6ba15fe9426b This is a test
00051f5350f4d9f3f4f5ba181b0a66d749b161ee This is a test
00066c85bf96469b905e2fb148095448797b2368 This is a test
0007b1a0334de785b3189b67bb73276d602fb7d4 This is a test
0007d018861d588e99e834fc29ca76a523b20e35 This is a test
000992b67ed22d2707ba65046d523ce66dfcfcb8 This is a test
000ad93a0819e2cbd7f0193e1d1ec481a0241b44 This is a test
1 Answer
I am still surprised that the code block above works for you at all; as in another question about changing the Mapper output value type in Hadoop (Java), you should be getting an exception here.
It also looks like that output came from the mapper, not the reducer. Are you sure about that file name?
The mapper's output should be the reducer's input.
Your mapper writes the output key as Text and the output value as Text. Your Reducer is declared with the input key as Text (correct), but the input value is wrongly declared as IntWritable (it should be Text). Change the declaration to Reducer<Text, Text, Text, IntWritable> and set the parameters in the Driver program accordingly.
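For what it's worth, here is a minimal sketch of that change, keeping your class and field names; the counting logic is my guess at what you intended, since your current code hard-codes result.set(0). Note that because your reduce method's value parameter (Iterable<Text>) does not match the class's declared value type (IntWritable), it cannot override Reducer.reduce, so even if the job runs, Hadoop falls back to the default identity reduce, which simply passes the mapper's pairs through, matching the output you are seeing.

/* Corrected Reducer sketch: the input value type is Text, matching the mapper's output */
public static class UserReducer extends Reducer<Text, Text, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    @Override  // now genuinely overrides reduce(), so it actually runs
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (Text val : values) {
            sum += 1;        // count the category records seen for this user (assumed intent)
        }
        result.set(sum);     // emit the count rather than a hard-coded 0
        context.write(key, result);
    }
}

In the driver, the map output classes (Text/Text) and the final output classes (Text/IntWritable) you already set are consistent with this, but remove the job.setCombinerClass(UserReducer.class) line: a combiner must emit the same key/value types as the mapper, and this reducer now emits IntWritable values.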