How do I manipulate reduce() output and store it in another file?

Posted by e0uiprwp on 2021-06-03 in Hadoop


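I have just started learning Hadoop. I want to take the output of my reduce() and do some manipulation on it. I am working with the new API and tried using JobControl, but it does not seem to work with the new API.
Is there a way around this?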

hadoop

Source: https://stackoverflow.com/questions/20863125/how-to-manipulate-reduce-output-and-store-it-in-another-file

3 answers


hgb9j2n6 #1

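I'm not sure what you are trying to do. Do you want to send different kinds of output to different output formats? If so, check this [link missing in source]. If you want to filter or manipulate the values coming from the map, the reducer is the best place to do it.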


ckx4rj1h #2

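You can use ChainReducer to create a job of the form [MAP+ / REDUCE MAP*], i.e. several mappers, followed by one reducer, followed by another series of mappers that start from the reducer's output. The final output is the output of the last mapper in the series.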
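To make the [MAP+ / REDUCE MAP*] shape concrete, here is a minimal sketch in plain Java that only models the data flow; it does not use any Hadoop classes. The class name ChainSketch, the word-count logic, and the minCount filter are all made up for illustration. In a real job with the new API, the wiring would presumably go through ChainMapper.addMapper, ChainReducer.setReducer and ChainReducer.addMapper from org.apache.hadoop.mapreduce.lib.chain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of a [MAP+ / REDUCE MAP*] chain; no Hadoop classes are used.
class ChainSketch {

    // MAP stage: split each input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> tokenize(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    pairs.add(Map.entry(word, 1));
                }
            }
        }
        return pairs;
    }

    // REDUCE stage: sum the counts per key (a TreeMap keeps the keys sorted).
    static Map<String, Integer> sum(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    // Post-reduce MAP stage: manipulate the reducer output.
    // Hypothetical manipulation: keep only words seen at least minCount times.
    static Map<String, Integer> keepFrequent(Map<String, Integer> totals, int minCount) {
        Map<String, Integer> kept = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : totals.entrySet()) {
            if (entry.getValue() >= minCount) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    // The whole chain: the final output is the output of the last mapper.
    static Map<String, Integer> run(List<String> lines, int minCount) {
        return keepFrequent(sum(tokenize(lines)), minCount);
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("a b a", "b c"), 2)); // prints {a=2, b=2}
    }
}
```

The point of the chain is that keepFrequent consumes the reducer's output directly, without an intermediate file in between.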
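Alternatively, you can run multiple jobs in sequence, where the reducer output of one job is the input of the next. However, this causes unnecessary I/O if you are not interested in the intermediate output.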
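The sequential alternative can be modelled the same way. The sketch below is again plain Java rather than Hadoop code, and every name in it (SequentialJobsSketch, the file names, the "double each count" manipulation) is hypothetical. It shows where the extra I/O comes from: the first job has to write its result to disk before the second job can read it. In Hadoop, the driver would typically wait for the first Job to finish (job.waitForCompletion(true)) and then use the first job's output directory as the second job's input path.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of two jobs run back to back; no Hadoop classes are used.
class SequentialJobsSketch {

    // Job 1: count words and write "word<TAB>count" lines to the intermediate file.
    static void countWords(List<String> lines, Path intermediate) throws IOException {
        Map<String, Integer> totals = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    totals.merge(word, 1, Integer::sum);
                }
            }
        }
        List<String> out = new ArrayList<>();
        totals.forEach((word, count) -> out.add(word + "\t" + count));
        Files.write(intermediate, out);
    }

    // Job 2: read job 1's output, manipulate it, and write the final file.
    // Hypothetical manipulation: double each count.
    static void doubleCounts(Path intermediate, Path result) throws IOException {
        List<String> out = new ArrayList<>();
        for (String line : Files.readAllLines(intermediate)) {
            String[] parts = line.split("\t");
            out.add(parts[0] + "\t" + (Integer.parseInt(parts[1]) * 2));
        }
        Files.write(result, out);
    }

    // Driver: run job 1, then job 2 on job 1's output, and return the final lines.
    static List<String> run(List<String> lines) {
        try {
            Path dir = Files.createTempDirectory("jobs");
            Path intermediate = dir.resolve("job1-output.txt");
            Path result = dir.resolve("job2-output.txt");
            countWords(lines, intermediate);    // extra write of the intermediate data
            doubleCounts(intermediate, result); // extra read of the intermediate data
            return Files.readAllLines(result);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String line : run(List.of("a b a", "b c"))) {
            System.out.println(line);
        }
    }
}
```

If the intermediate file is never needed on its own, the chained form avoids that write and read.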


wvt8vs2t #3

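Do whatever manipulation you need in the reducer, then create an FSDataOutputStream and write the result through it. Open the stream once per task in setup() rather than in reduce(): fs.create() overwrites an existing file by default, so creating it on every reduce() call would discard what earlier keys wrote. With more than one reduce task, each task also needs its own path.
For example:

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public static class TokenCounterReducer extends
        Reducer<Text, IntWritable, Text, IntWritable> {
    private FSDataOutputStream out; // side file, opened once per reduce task
    @Override
    protected void setup(Context context) throws IOException {
        FileSystem fs = FileSystem.get(context.getConfiguration());
        out = fs.create(new Path("/path/to/your/file"));
    }
    @Override
    public void reduce(Text key, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        // do the manipulation and write it to the file; key<TAB>sum is only a placeholder
        out.write((key + "\t" + sum + "\n").getBytes(StandardCharsets.UTF_8));
        context.write(key, new IntWritable(sum));
    }
    @Override
    protected void cleanup(Context context) throws IOException {
        out.close(); // flush and close the side file when the task finishes
    }
}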
