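I'm new to the Hadoop MapReduce programming paradigm. Can someone tell me how to sort by value easily? I tried implementing another comparator class, but is there a simpler way, for example sorting by the reducer's output value through the job config? Basically, I'm reading log files and I want the URLs sorted by hit count in ascending order.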
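import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable ONE = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] split = value.toString().split(" ");
        // The field at index 6 of each log line is taken to be the URL.
        // Emit it once per line; lines with fewer fields are skipped.
        if (split.length > 6) {
            word.set(split[6]);
            context.write(word, ONE);
        }
    }
}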
public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Add up the 1s emitted by the mapper to get the hit count for this URL.
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}
2 Answers
k5hmc34c #1
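In this case you have to write two MapReduce jobs. The first job counts the hits per URL, so its output is one URL and its count per line.
Pass that output to a second MapReduce job and sort it by count there.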
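As a rough sketch of what the second job does, assuming the first job wrote its results in the default text layout of one `URL<TAB>count` pair per line: the "map" step swaps each pair so the count becomes the key, and the sort stands in for the shuffle, which orders keys before they reach the reducer. The class and method names (`SecondJobSketch`, `swap`, `sortByCount`) and the sample lines are made up for illustration. The sketch uses plain Java rather than the Hadoop API so that it runs on its own; in an actual job, `swap` would be the body of a `Mapper` emitting the count as an `IntWritable` key, the framework would do the sorting (`IntWritable` keys sort ascending by default), and a reducer would write the pairs back out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java stand-in for the second job (hypothetical names, no Hadoop dependency).
class SecondJobSketch {

    // One record after the "map" step: the count is now the sort key.
    static final class CountUrl {
        final int count;
        final String url;

        CountUrl(int count, String url) {
            this.count = count;
            this.url = url;
        }
    }

    // "Map" step: turn a first-job output line "url<TAB>count" into (count, url).
    // Assumes every line contains a tab followed by an integer.
    static CountUrl swap(String line) {
        int tab = line.lastIndexOf('\t');
        String url = line.substring(0, tab);
        int count = Integer.parseInt(line.substring(tab + 1).trim());
        return new CountUrl(count, url);
    }

    // Stand-in for the shuffle plus a single reducer: order the records by
    // count, ascending, then write them back out as "url<TAB>count".
    static List<String> sortByCount(List<String> firstJobOutput) {
        List<CountUrl> records = new ArrayList<>();
        for (String line : firstJobOutput) {
            records.add(swap(line));
        }
        records.sort(Comparator.comparingInt((CountUrl r) -> r.count));

        List<String> out = new ArrayList<>();
        for (CountUrl r : records) {
            out.add(r.url + "\t" + r.count);
        }
        return out;
    }

    public static void main(String[] args) {
        // Invented sample lines, not real log data.
        List<String> sample = Arrays.asList(
                "/index.html\t12", "/about.html\t3", "/faq.html\t7");
        for (String line : sortByCount(sample)) {
            System.out.println(line);
        }
    }
}
```

With more than one reducer, each output file is sorted only within itself, so the second job needs a single reducer (or a total-order partitioner) to produce one globally sorted list.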
gcmastyq #2
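Declare a map in the reducer class and put each key and its value into that map instead of writing them out right away. Then, in the reducer's cleanup() method, sort the map by value and emit each entry with context.write(key, value). This gives a globally sorted result only if the job runs with a single reducer and all the keys fit in memory.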
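A minimal sketch of that idea, in plain Java so it runs without Hadoop on the classpath. The names `CleanupSortSketch`, `record` and `sortedByValue` are hypothetical: `record` stands in for what reduce() would do with each key and its sum, and `sortedByValue` for the sorting done in cleanup(). In the actual reducer, each sorted entry would be passed to context.write(...), and the key should be copied (for example with key.toString()) before it is stored, because Hadoop reuses the key object between reduce() calls.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for a reducer that buffers its results (hypothetical names).
class CleanupSortSketch {

    // In the reducer this would be an instance field filled by reduce().
    private final Map<String, Integer> counts = new HashMap<>();

    // Stand-in for reduce(): remember the total instead of writing it immediately.
    void record(String url, int sum) {
        counts.put(url, sum);
    }

    // Stand-in for cleanup(): copy the entries into a list and sort them by
    // value, ascending. The order of entries with equal values is unspecified.
    List<Map.Entry<String, Integer>> sortedByValue() {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.comparingByValue());
        return entries;
    }

    public static void main(String[] args) {
        CleanupSortSketch sketch = new CleanupSortSketch();
        // Invented sample totals, not real log data.
        sketch.record("/index.html", 12);
        sketch.record("/about.html", 3);
        sketch.record("/faq.html", 7);
        for (Map.Entry<String, Integer> e : sketch.sortedByValue()) {
            // In the reducer this line would be a context.write(...) call.
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

This avoids a second job, at the cost of holding every distinct URL in the reducer's memory until cleanup() runs.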