Hadoop MapReduce programming

Posted by kq0g1dla on 2021-06-04 in Hadoop

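I am new to Hadoop MapReduce. My input is a number of text files, and I want to write a MapReduce program that writes every file name, together with the sentences associated with that file name, into a single output file. I only want the mapper to emit the file name (key) and the associated sentences (value); the reducer should collect each key with all of its values and write the file name and its sentences to the output.
Mapper and reducer: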

public void map(Text key, Text value,
                OutputCollector<Text, Text> output,
                Reporter reporter) throws IOException {
    StringTokenizer itr = new StringTokenizer(value.toString(), ",");
    String filename = new String();
    FileSplit filesplit = (FileSplit) reporter.getInputSplit();
    filename = filesplit.getPath().getName();
    while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        output.collect(new Text(filename), word);
    }
}

public void reduce(Text key, Iterator<Text> values,
                   OutputCollector<Text, Text> output,
                   Reporter reporter) throws IOException {
    // int sum = 0;
    String translation = "";
    while (values.hasNext()) {
        translation += "|" + values.toString() + "|";
    }

    results.set(translation);
    output.collect(key, results);
}

When I run the mapper and reducer above with the input format set to KeyValueTextInputFormat.class, nothing is written to the output.
What should I change to achieve my goal?

Java hadoop mapreduce

Source: https://stackoverflow.com/questions/22406490/hadoop-map-reduce-programming

1 answer

qyyhg6bp1#

In your reduce method you declare values as an Iterator. It should be declared as an Iterable:

public void reduce(Text key, Iterable<Text> values, ....

instead of

public void reduce(Text key, Iterator<Text> values, ....

Once you have made that change, you can iterate like this:

Iterator<Text> iter = values.iterator();
while(iter.hasNext())
{
    translation += "|" + iter.next().toString() + "|";
}
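
The loop above differs from the one in the question in one more way: it calls next() on every pass, whereas the question's loop calls values.toString() and never advances the iterator. The following is a minimal plain-Java sketch of the same concatenation, assuming String values in place of Hadoop's Text so that it runs without Hadoop on the classpath. The class name PipeJoin and the method join are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Plain-Java stand-in for the reducer loop; no Hadoop types are used here.
class PipeJoin {
    // Wraps each value in '|' characters and concatenates the results.
    static String join(Iterable<String> values) {
        StringBuilder translation = new StringBuilder();
        Iterator<String> iter = values.iterator();
        while (iter.hasNext()) {
            // next() returns the current value and advances the iterator,
            // so the loop terminates once every value has been consumed.
            translation.append('|').append(iter.next()).append('|');
        }
        return translation.toString();
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("first", "second");
        System.out.println(join(values)); // prints |first||second|
    }
}
```

A StringBuilder is used here instead of repeated String concatenation; the resulting string is the same.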

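Because you used the wrong type, your method does not override the default reduce method, and the default reduce method does nothing. That is why you get no output.
I also cannot see where the variable results is declared.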
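
The override point can be illustrated without Hadoop. The sketch below assumes a hypothetical base class, BaseReducer, whose default reduce returns an empty string, mirroring the answer's description of a default that does nothing. None of these classes are Hadoop types. In Hadoop itself, the expected parameter type depends on the API in use: the org.apache.hadoop.mapreduce.Reducer class takes an Iterable and a Context, while the older org.apache.hadoop.mapred.Reducer interface takes an Iterator, an OutputCollector and a Reporter.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class OverrideDemo {
    // Hypothetical stand-in for a framework base class; not a Hadoop type.
    static class BaseReducer {
        // Default implementation that produces nothing.
        String reduce(String key, Iterable<String> values) {
            return "";
        }
    }

    // Declares Iterator instead of Iterable. This is an overload, not an
    // override, so a caller that only knows BaseReducer never reaches it.
    static class IteratorReducer extends BaseReducer {
        String reduce(String key, Iterator<String> values) {
            StringBuilder sb = new StringBuilder(key);
            while (values.hasNext()) {
                sb.append('|').append(values.next()).append('|');
            }
            return sb.toString();
        }
    }

    // Matches the base signature; @Override makes the compiler check that.
    static class IterableReducer extends BaseReducer {
        @Override
        String reduce(String key, Iterable<String> values) {
            StringBuilder sb = new StringBuilder(key);
            for (String v : values) {
                sb.append('|').append(v).append('|');
            }
            return sb.toString();
        }
    }

    // Mimics a framework that calls reduce through the base type only.
    static String run(BaseReducer reducer, String key, List<String> values) {
        return reducer.reduce(key, values);
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("a", "b");
        System.out.println("[" + run(new IteratorReducer(), "f.txt", values) + "]"); // prints []
        System.out.println(run(new IterableReducer(), "f.txt", values)); // prints f.txt|a||b|
    }
}
```

Adding @Override to the method in IteratorReducer would turn the silent mismatch into a compile-time error, which is a cheap way to catch this kind of mistake.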

Answered on 2021-06-04
