Sorting MapReduce output by value in descending order

myzjeezk  posted on 2021-06-01  in Hadoop

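I am trying to write, in pseudocode, a MapReduce job that returns its items sorted in descending order. For example, for a word count job, instead of getting: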

apple 1
banana 3
mango 2

I would like the output to be:

banana 3
mango 2
apple 1

Do you know how to do this? I know how to sort in ascending order (by swapping the key and value in the mapper), but not in descending order.

ki1q1bka1#

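You can get descending order with the code below.
It assumes you have already written the mapper and driver code, and that the mapper emits pairs such as (banana, 1).
In the reducer, we sum all the values for each key and store the total in a Map instead of writing it out right away. Once every key has been processed, we sort that Map by value and write the final result from the reducer's cleanup function. Note that the output is globally sorted only if the job runs with a single reducer, and that all keys must fit in that reducer's memory.
See the following code for details: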

import java.io.IOException;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class Word_Reducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    // Change access modifier as per your need
    public Map<String, Integer> map = new LinkedHashMap<String, Integer>();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context) {
        // Sum the counts emitted by the mappers for this word
        int count = 0;
        for (IntWritable value : values) {
            count += value.get();
        }
        // Buffer the total instead of writing it; output is deferred to cleanup()
        map.put(key.toString(), count);
    }

    @Override
    public void cleanup(Context context) throws IOException, InterruptedException {
        // cleanup() is called once, after the last reduce() call of this reducer.
        // Here we write our final, sorted output.
        Map<String, Integer> sortedMap = sortMap(map);

        for (Map.Entry<String, Integer> entry : sortedMap.entrySet()) {
            context.write(new Text(entry.getKey()), new IntWritable(entry.getValue()));
        }
    }

    public Map<String, Integer> sortMap(Map<String, Integer> unsortMap) {
        // LinkedHashMap keeps insertion order, so it preserves the sorted order
        Map<String, Integer> hashmap = new LinkedHashMap<String, Integer>();
        int count = 0;
        List<Map.Entry<String, Integer>> list =
                new LinkedList<Map.Entry<String, Integer>>(unsortMap.entrySet());
        // Sorting the list we created from the unsorted Map
        Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
            @Override
            public int compare(Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2) {
                // o2 before o1 gives descending order
                return o2.getValue().compareTo(o1.getValue());
            }
        });

        for (Map.Entry<String, Integer> entry : list) {
            // Only the top 3 entries are kept; remove this check to emit every entry
            if (count > 2)
                break;
            hashmap.put(entry.getKey(), entry.getValue());
            count++;
        }
        return hashmap;
    }
}
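
The sort-by-value step can be tried on its own, without a Hadoop cluster. The following is a minimal sketch in plain Java, not part of the answer above: the class name `DescendingByValueSketch` and the method `sortByValueDescending` are hypothetical names chosen for illustration, and the input is the word count data from the question. Unlike the reducer above, it keeps every entry rather than only the top 3.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DescendingByValueSketch {

    // Returns a new map whose iteration order is by value, largest first.
    // (Hypothetical helper; it mirrors the idea of sortMap without the top-3 cutoff.)
    public static Map<String, Integer> sortByValueDescending(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Comparing b to a (instead of a to b) reverses the natural ascending order.
        entries.sort((a, b) -> b.getValue().compareTo(a.getValue()));

        // LinkedHashMap iterates in insertion order, so the sorted order is preserved.
        Map<String, Integer> sorted = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            sorted.put(e.getKey(), e.getValue());
        }
        return sorted;
    }

    public static void main(String[] args) {
        // Same data as the word count example in the question
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("apple", 1);
        counts.put("banana", 3);
        counts.put("mango", 2);

        for (Map.Entry<String, Integer> e : sortByValueDescending(counts).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Running `main` prints the entries in the order the question asks for: banana 3, mango 2, apple 1. Words with equal counts keep their original relative order, because `List.sort` is a stable sort.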
