MapReduce: incomplete results

qacovj5a  posted on 2021-06-04 in Hadoop

Contents of wcin_file:

Run 1
access 1
default 2
out 2
project 1
task 1
windows 1
your 1

I want to use MapReduce to sort the data in wcin_file by the second field in descending order, like this:

default 2
out 2
access 1
...

But the output file contains only two lines:

default 2
Run     1

Why? Here is some of the source code:
SortLogsMapper

public static class SortLogsMapper extends
            Mapper<Object, Text, Text, IntWritable> {

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {

            context.write(value, new IntWritable(0)); // value is the whole input line, e.g. `Run 1`, `access 1`
        }
    }

SortLogsReducer

public static class SortLogsReducer extends
        Reducer<Text, IntWritable, Text, IntWritable> {
    private Text k = new Text();
    private IntWritable v = new IntWritable();
    public void reduce(Text key, Iterable<IntWritable> values,
        Context context) throws IOException, InterruptedException {

        k.set(key.toString().split(" ")[0]); // split to get the first field
        v.set(Integer.parseInt(key.toString().split(" ")[1]));  // second field
        context.write(k, v);
    }
}

LogDescComparator

public static class LogDescComparator extends WritableComparator {
    protected LogDescComparator() {
        super(Text.class, true);
    }

    @Override
    public int compare(WritableComparable w1, WritableComparable w2) {

        Text t1 = (Text) w1;
        Text t2 = (Text) w2;
        String[] t1Items = t1.toString().split("\t| ");
        String[] t2Items = t2.toString().split("\t| ");
        Integer t1Value = Integer.parseInt(t1Items[1]);
        Integer t2Value = Integer.parseInt(t2Items[1]);
        int comp = t2Value.compareTo(t1Value);

        return comp;
    }
}

Then I set up the job in the main function:

Job job2 = new Job(conf2, "sort");
job2.setNumReduceTasks(1);
job2.setJarByClass(WordCount.class);
job2.setMapperClass(SortLogsMapper.class);
job2.setReducerClass(SortLogsReducer.class);
job2.setSortComparatorClass(LogDescComparator.class);
job2.setOutputKeyClass(Text.class);
job2.setOutputValueClass(IntWritable.class);
FileInputFormat.setInputPaths(job2, new Path("wcin_file"));
FileOutputFormat.setOutputPath(job2, new Path("wcout"));
System.exit(job2.waitForCompletion(true) ? 0 : 1);

vsmadaxz  #1

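In LogDescComparator, when the variable comp equals 0 the two keys are treated as equal, so the value is not printed. Add some code to handle the case where comp equals 0.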
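
To see why ties matter: when no grouping comparator is set, Hadoop also uses the sort comparator to decide which keys belong to the same reduce call, so lines with equal counts collapse into one group, and the reducer in the question writes only once per group. The sketch below models this in plain Java, without any Hadoop classes, using a TreeSet, which likewise treats elements that compare as 0 as duplicates. The class name TieBreakDemo and the tie-breaking rule (compare the word when the counts are equal) are assumptions for illustration, not the asker's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Plain-Java model of the key comparison; no Hadoop classes are involved.
class TieBreakDemo {

    // Parses the count (second field) of a line such as "default 2".
    static int count(String line) {
        return Integer.parseInt(line.split("\t| ")[1]);
    }

    // Extracts the word (first field) of a line.
    static String word(String line) {
        return line.split("\t| ")[0];
    }

    // Same rule as the question's comparator: count only, descending.
    static final Comparator<String> COUNT_ONLY =
            (a, b) -> Integer.compare(count(b), count(a));

    // Hypothetical fix: fall back to the word when the counts are equal,
    // so two different lines never compare as 0.
    static final Comparator<String> COUNT_THEN_WORD =
            COUNT_ONLY.thenComparing(TieBreakDemo::word);

    // Keeps one line per group of lines that compare as equal, in sorted order.
    static List<String> distinctSorted(List<String> lines, Comparator<String> cmp) {
        TreeSet<String> set = new TreeSet<>(cmp);
        set.addAll(lines);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Run 1", "access 1", "default 2", "out 2",
                "project 1", "task 1", "windows 1", "your 1");
        System.out.println(distinctSorted(lines, COUNT_ONLY));      // two lines survive
        System.out.println(distinctSorted(lines, COUNT_THEN_WORD)); // all eight survive
    }
}
```

In the question's LogDescComparator, the equivalent change would be to return t1Items[0].compareTo(t2Items[0]) whenever comp is 0.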


yacmzcpb  #2

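Right now the mapper outputs the following key-value pair:
('key some number', 0)
Try having the mapper split the value and output:
(key, 'some number')
Rewrite your comparator so that it compares on the key first and then on the value from the mapper output (there may already be a predefined comparator for this).
Your reducer should then receive the key and a list of values. Iterate over this list of values and output:
(key, value)
You are doing most, if not all, of your work in the reducer. Try using the mapper as I described here.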
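
A rough plain-Java model of this split-then-iterate idea follows. Since Hadoop's sort comparator only sees map output keys, the sketch assumes one possible reading of the suggestion: the mapper emits the count as the key and the word as the value, keys are sorted in descending order, and the reducer writes one record for every value in its list. The class CountKeyDemo and its methods are hypothetical, and a TreeMap stands in for the shuffle. No Hadoop API is used.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of "split in the mapper, iterate values in the reducer".
class CountKeyDemo {

    // Models the mapper: "default 2" becomes key 2, value "default".
    static Map.Entry<Integer, String> map(String line) {
        String[] parts = line.split("\t| ");
        return new AbstractMap.SimpleEntry<>(Integer.parseInt(parts[1]), parts[0]);
    }

    // Models shuffle and reduce: group by count in descending order,
    // then emit one output record per value in each group.
    static List<String> run(List<String> lines) {
        TreeMap<Integer, List<String>> groups =
                new TreeMap<>(Comparator.<Integer>reverseOrder());
        for (String line : lines) {
            Map.Entry<Integer, String> kv = map(line);
            groups.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> group : groups.entrySet()) {
            for (String word : group.getValue()) { // iterate every value, not just the first
                out.add(word + "\t" + group.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Run 1", "access 1", "default 2", "out 2",
                "project 1", "task 1", "windows 1", "your 1");
        run(lines).forEach(System.out::println);
    }
}
```

Here the words within one count keep their input order. In a real job the order of values within a key is not guaranteed unless a secondary sort is configured.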
