MapReduce / Hadoop: StringTokenizer throws NoSuchElementException

Posted by yeotifhr on 2021-05-29 in Hadoop


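I'm trying to use the output file of WordCount as the input file for a MapReduce job that shows how many words have each count (how many words occur once, twice, three times, and so on).
I want to use each word's count as the key and 1 as the value, skipping the word itself.
If the input file looks like this:
422
apple 3
fruit 2
gorilla 9
monkey 3
zebra 12
the output should be:
2 1
3 2
9 1
12 1
I split each record with StringTokenizer, but the nextToken() call in the map function below throws NoSuchElementException.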

public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable>{

    private final static IntWritable one = new IntWritable(1);
    private Text count = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        itr.nextToken(); // Skip over first line, which has just one element
        while (itr.hasMoreTokens()) {
            itr.nextToken(); // Skip over word
            count.set(itr.nextToken()); // save count as key
            context.write(count, one);
        }
    }
}

I don't know why this happens or how to fix it.

Java hadoop mapreduce bigdata stringtokenizer

Source: https://stackoverflow.com/questions/35390847/mapreduce-hadoop-stringtokenizer-getting-nosuchelementexception

1 Answer


swvgeqrz1#

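Welcome to Stack Overflow, Joanne, and to MapReduce programming!
My guess is that the cause is that you always skip the first token and then ask for two more, and you do this for every line. For a line such as "apple 3", the first nextToken() consumes "apple", the next call inside the loop consumes "3", and the call after that finds nothing left, so it throws NoSuchElementException.
Keep in mind that map runs in parallel over different parts of the input, not sequentially from line 1 to line 2 and onward. Each call builds the StringTokenizer from a single line, not from the whole input. That said, a solution to your problem is as follows: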

public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
    StringTokenizer itr = new StringTokenizer(value.toString()); // each time the value is a different line
    if (itr.countTokens() == 2) { // skips the first line and any other line that does not have exactly two tokens
        itr.nextToken(); // Skip over word
        count.set(itr.nextToken()); // save count as key
        context.write(count, one);
    }
}
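
To see the per-line behaviour outside Hadoop, here is a minimal plain-Java sketch. It is not the mapper itself: the class name PerLineTokenDemo and its helper methods are made up for illustration, and the histogram step only stands in for what a summing reducer would produce. It applies the original logic and the guarded logic to one line at a time, the way map is called on each record.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-in for the mapper logic; no Hadoop classes are used.
final class PerLineTokenDemo {

    // The question's logic, applied to a single line.
    static String originalLogic(String line) {
        StringTokenizer itr = new StringTokenizer(line);
        itr.nextToken();            // meant to skip the header, but runs on every line
        String last = null;
        while (itr.hasMoreTokens()) {
            itr.nextToken();        // meant to skip the word, actually consumes the count
            last = itr.nextToken(); // nothing left on a two-token line: throws
        }
        return last;
    }

    // The answer's logic: returns the count token, or null for lines to skip.
    static String guardedLogic(String line) {
        StringTokenizer itr = new StringTokenizer(line);
        if (itr.countTokens() != 2) {
            return null;            // header line, blank line, or malformed record
        }
        itr.nextToken();            // skip the word
        return itr.nextToken();     // the count
    }

    // Stand-in for the reduce side: how many words share each count.
    static Map<Integer, Integer> histogram(List<String> lines) {
        Map<Integer, Integer> hist = new TreeMap<>();
        for (String line : lines) {
            String count = guardedLogic(line);
            if (count != null) {
                hist.merge(Integer.parseInt(count), 1, Integer::sum);
            }
        }
        return hist;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "422", "apple 3", "fruit 2", "gorilla 9", "monkey 3", "zebra 12");
        try {
            originalLogic("apple 3");
        } catch (NoSuchElementException e) {
            System.out.println("original logic failed on \"apple 3\"");
        }
        histogram(lines).forEach((count, words) -> System.out.println(count + " " + words));
    }
}
```

Note that the original logic does not fail on the header line "422": it consumes the single token and the loop never runs. The failure comes from every two-token line after it.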

PS1: You could also use the String.split() method, but that's up to you.
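
A rough sketch of the String.split() variant, again in plain Java so it runs without Hadoop. The class SplitLineDemo and the method countField are hypothetical names. It assumes fields are separated by whitespace (WordCount output written with the default text output format uses a tab between word and count).

```java
// Hypothetical helper showing the split-based check; not part of the original answer.
final class SplitLineDemo {

    // Returns the count field of a "word<whitespace>count" line, or null otherwise.
    static String countField(String line) {
        String[] parts = line.trim().split("\\s+"); // split on any run of whitespace
        return parts.length == 2 ? parts[1] : null; // skip header and malformed lines
    }

    public static void main(String[] args) {
        System.out.println(countField("apple\t3")); // a normal record
        System.out.println(countField("422"));      // the one-field header line
    }
}
```

In a mapper, the non-null result would be what gets passed to count.set(...) before context.write(count, one).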
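PS2: You could also consider writing the key as an IntWritable or VIntWritable, depending on your data and requirements (parsing the string to an int is slower, but the key is faster to transfer over the network and uses less memory).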
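
One more practical difference between the key types is sort order: Text keys are compared byte by byte, while IntWritable keys are compared numerically. The sketch below is only an analogy in plain Java, using String and Integer in place of the Hadoop types; KeyOrderDemo and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Analogy only: String ordering stands in for Text keys, Integer ordering for IntWritable keys.
final class KeyOrderDemo {

    // Sorts the counts as text, the way string keys would be ordered.
    static List<String> sortedAsText(List<String> counts) {
        List<String> copy = new ArrayList<>(counts);
        Collections.sort(copy);
        return copy;
    }

    // Parses each count and sorts numerically, the way integer keys would be ordered.
    static List<Integer> sortedAsInts(List<String> counts) {
        List<Integer> parsed = new ArrayList<>();
        for (String c : counts) {
            parsed.add(Integer.parseInt(c.trim()));
        }
        Collections.sort(parsed);
        return parsed;
    }

    public static void main(String[] args) {
        List<String> counts = List.of("2", "3", "9", "12");
        System.out.println(sortedAsText(counts)); // text order puts "12" first
        System.out.println(sortedAsInts(counts)); // numeric order puts 12 last
    }
}
```

If the output is expected in the order shown in the question (2, 3, 9, 12), an integer key type is the one that sorts that way.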

2021-05-29
