Hadoop and number of reducers in Eclipse

Posted by oipij1gg on 2021-06-04 in Hadoop


In my MapReduce program, I have to use a partitioner:

public class TweetPartitionner extends HashPartitioner<Text, IntWritable>{

    public int getPartition(Text a_key, IntWritable a_value, int a_nbPartitions) {
        if(a_key.toString().startsWith("#"))
            return 0;
        else
            return 1;
    }

}

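I have set the number of reduce tasks with job.setNumReduceTasks(2); but I get the following error: java.io.IOException: Illegal partition for #rescinfo (1). The parameter a_nbPartitions is passed in as 1.
I read this in another post, "Hadoop: number of reducers is not equal to what I set in the program":
"Running it in Eclipse seems to use the local job runner. It only supports 0 or 1 reducers. If you try to set it to use more than one reducer, it ignores that and just uses one."
I am developing on Hadoop 0.20.2 installed on Cygwin, and of course I use Eclipse. What should I do?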
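To make the failure easier to see, here is a minimal plain-Java sketch with no Hadoop dependency. It assumes the framework rejects any partition index outside the range 0 to a_nbPartitions-1, which is what the error message above suggests. The class name, the method names, and the sample keys are hypothetical and exist only for illustration.

```java
import java.io.IOException;

// Plain-Java sketch (no Hadoop classes); all names here are hypothetical.
class IllegalPartitionSketch {

    // Same decision as the partitioner in the question: "#" keys go to 0, the rest to 1.
    static int getPartition(String key, int numPartitions) {
        return key.startsWith("#") ? 0 : 1;
    }

    // Assumed range check, modeled on the error message quoted in the question.
    static void checkPartition(String key, int partition, int numPartitions) throws IOException {
        if (partition < 0 || partition >= numPartitions) {
            throw new IOException("Illegal partition for " + key + " (" + partition + ")");
        }
    }

    public static void main(String[] args) {
        // The local job runner is described as using a single reducer,
        // so the partitioner is called with numPartitions == 1.
        int numPartitions = 1;
        String[] keys = {"#hadoop", "rescinfo"};
        for (String key : keys) {
            int partition = getPartition(key, numPartitions);
            try {
                checkPartition(key, partition, numPartitions);
                System.out.println(key + " -> partition " + partition);
            } catch (IOException e) {
                System.out.println(e.getMessage());
            }
        }
    }
}
```

With a single partition, only index 0 is valid, so any key that the partitioner sends to 1 fails the check.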

hadoop mapreduce eclipse

Source: https://stackoverflow.com/questions/17298659/hadoop-and-number-of-reducers-in-eclipse

2 Answers


vwkv1x7d1#

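Unless you have a dedicated Hadoop cluster to run the job on, you cannot have more than one reducer in local mode. However, you can configure Eclipse to submit the job to a Hadoop cluster, and then your configuration will be honored.
In any case, you should use return Math.min(i, a_nbPartitions-1) when writing your own partitioner, where i is the partition index you would otherwise return.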
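As a rough illustration of the Math.min suggestion, the sketch below applies the clamp to the same "#" rule used in the question. It is plain Java without Hadoop classes, and the class and method names are hypothetical. In a real partitioner the same expression would go inside getPartition.

```java
// Plain-Java sketch (no Hadoop classes); all names here are hypothetical.
class ClampedPartitionSketch {

    // "#" keys prefer partition 0, everything else prefers partition 1,
    // but the result never exceeds the last valid index (numPartitions - 1).
    static int getPartition(String key, int numPartitions) {
        int wanted = key.startsWith("#") ? 0 : 1;
        return Math.min(wanted, numPartitions - 1);
    }

    public static void main(String[] args) {
        String[] keys = {"#hadoop", "rescinfo"};
        // Try both the single-reducer case (local runner) and the two-reducer case.
        for (int numPartitions = 1; numPartitions <= 2; numPartitions++) {
            for (String key : keys) {
                System.out.println(numPartitions + " partition(s): " + key
                        + " -> " + getPartition(key, numPartitions));
            }
        }
    }
}
```

With one reducer every key falls back to partition 0, so the job runs in local mode. With two reducers the keys are split as intended.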

2021-06-04

o3imoua42#

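Actually, you don't need a dedicated Hadoop cluster for that. You just have to tell Eclipse that you intend to run this job on your pseudo-distributed cluster rather than locally inside Eclipse itself. To do that, add the following lines to your code: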

Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:9000");
conf.set("mapred.job.tracker", "localhost:9001");

After that, set the number of reducers to 2 with:

job.setNumReduceTasks(2);

And yes, you have to be very sure about your partitioner logic. You can visit this page, which shows how to write a custom partitioner.
HTH

2021-06-04
