Sorting in MapReduce produces extra values

Posted by uhry853o on 2021-06-03 in Hadoop

I am trying to sort a series of integers given in the following form:

A    2
B    9
C    4
....
....
Z    42

Here is the mapper and reducer code:

    public static class MapClass extends MapReduceBase implements Mapper<Text, Text, IntWritable, Text>
    {
        public void map(Text key, Text value, OutputCollector<IntWritable, Text> output, Reporter reporter) throws IOException
        {
            output.collect(new IntWritable(Integer.parseInt(value.toString())), key);
        }
    }

    public static class Reduce extends MapReduceBase implements Reducer<IntWritable, Text, IntWritable, Text>
    {
        public void reduce(IntWritable key, Iterator<Text> values, OutputCollector<IntWritable, Text> output, Reporter reporter) throws IOException
        {
            output.collect(key, new Text(""));
        }
    }

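However, the output contains many extra integers. Can anyone tell me what is wrong with the code?
Also, if possible, please point me to a good example of sorting integers with MapReduce.
Edit: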

job.setInputFormat(KeyValueTextInputFormat.class);
job.setOutputFormat(TextOutputFormat.class);
job.setOutputKeyClass(IntWritable.class);
job.setOutputValueClass(Text.class);
Answer #1 (xt0899hw)

I tried following your logic, but using the new API. The result is correct.
Note: the second parameter of the reduce(...) function is **`Iterable<Text>`**.

```java
package stackoverflow;

import java.io.IOException;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class q18076708 extends Configured implements Tool {

    // Swaps each (name, number) record so that the number becomes the key.
    // The framework sorts map output by key before the reduce phase.
    static class MapClass extends Mapper<Text, Text, IntWritable, Text> {
        @Override
        public void map(Text key, Text value, Context context)
                throws IOException, InterruptedException {
            context.write(new IntWritable(Integer.parseInt(value.toString())), key);
        }
    }

    // Emits each distinct key once with an empty value.
    static class Reduce extends Reducer<IntWritable, Text, IntWritable, Text> {
        @Override
        public void reduce(IntWritable key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            context.write(key, new Text(""));
        }
    }

    public int run(String[] args) throws Exception {
        // Run locally against the local file system.
        getConf().set("fs.default.name", "file:///");
        getConf().set("mapred.job.tracker", "local");
        Job job = new Job(getConf(), "Logging job");
        job.setJarByClass(getClass());

        FileInputFormat.addInputPath(job, new Path("src/test/resources/testinput.txt"));
        // Delete any previous output so the job can be rerun.
        FileSystem.get(getConf()).delete(new Path("target/out"), true);
        FileOutputFormat.setOutputPath(job, new Path("target/out"));

        job.setMapperClass(MapClass.class);
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(Text.class);

        job.setCombinerClass(Reduce.class);
        job.setReducerClass(Reduce.class);

        job.setInputFormatClass(KeyValueTextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(Text.class);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new q18076708(), args);
        System.exit(exitCode);
    }
}
```
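The note about `Iterable` matters in the new API because the base `Reducer.reduce` takes an `Iterable`, and its default implementation writes every value through unchanged. A method declared with `Iterator` instead is only an overload, so the framework never calls it. The sketch below reproduces that pitfall in plain Java. `BaseReducer`, `IteratorReducer` and `IterableReducer` are hypothetical stand-ins written for illustration, not Hadoop classes, and whether this is what happened in the question's old-API code is not confirmed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class OverrideDemo {

    // Hypothetical stand-in for a framework base class; NOT a Hadoop class.
    static class BaseReducer {
        // Default behaviour: pass every value through (identity reduce).
        void reduce(int key, Iterable<String> values, List<String> out) {
            for (String v : values) {
                out.add(key + ":" + v);
            }
        }

        // The "framework" only ever calls the Iterable signature.
        final List<String> run(int key, List<String> values) {
            List<String> out = new ArrayList<>();
            reduce(key, values, out);
            return out;
        }
    }

    // Wrong parameter type: this is a new overload, never called by run().
    static class IteratorReducer extends BaseReducer {
        void reduce(int key, Iterator<String> values, List<String> out) {
            out.add(key + ":");
        }
    }

    // Matching signature: @Override compiles and the method is used.
    static class IterableReducer extends BaseReducer {
        @Override
        void reduce(int key, Iterable<String> values, List<String> out) {
            out.add(key + ":");
        }
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("A", "B");
        System.out.println(new IteratorReducer().run(2, values)); // [2:A, 2:B]
        System.out.println(new IterableReducer().run(2, values)); // [2:]
    }
}
```

Adding `@Override` to the `Iterator` variant would turn the silent mismatch into a compile error, which is why the annotation is worth keeping on `map` and `reduce`.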

Input:

A 2
B 9
C 4
Z 42

Output:

2
4
9
42
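The output is sorted because the framework orders map output by key before calling the reducer, and the reducer here writes each distinct key once. The following is a minimal plain-Java sketch of that map, sort, reduce flow, with a `TreeMap` standing in for the shuffle. `SortSketch`, `sortedDistinct` and the extra record `D 4` are assumptions added for illustration and are not part of the answer above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SortSketch {

    // Mimics map -> sort by key -> reduce for (name, number) records.
    static List<Integer> sortedDistinct(Map<String, Integer> records) {
        // "Map" phase: emit (number, name). The TreeMap keeps keys in
        // ascending order, standing in for the framework's key sort.
        TreeMap<Integer, List<String>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> r : records.entrySet()) {
            grouped.computeIfAbsent(r.getValue(), k -> new ArrayList<>())
                   .add(r.getKey());
        }
        // "Reduce" phase: one call per distinct key, emitting the key once.
        return new ArrayList<>(grouped.keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> input = new LinkedHashMap<>();
        input.put("A", 2);
        input.put("B", 9);
        input.put("C", 4);
        input.put("D", 4);  // hypothetical duplicate value, not in the original input
        input.put("Z", 42);
        System.out.println(sortedDistinct(input)); // [2, 4, 9, 42]
    }
}
```

Because the reducer emits the key once per group, duplicate integers collapse into a single line. To keep duplicates, the reducer would instead write the key once for every element of `values`. With more than one reducer, each output file is sorted only within itself unless a total-order partitioner is used.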
