Unable to add a delimiter and file name to words in Hadoop WordCount

m4pnthwp  posted on 2021-05-27  in  Hadoop

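This is my MapReduce map function. What I am trying to do in the WordCount class is append a delimiter made of "#" characters, followed by the file name, to each word. When I run the code, nothing changes: I only get the word counts. I want the final result in this format: word#######filename. Please tell me what I am doing wrong.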

// Imports needed at the top of the enclosing file:
import java.io.IOException;
import java.util.regex.Pattern;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public static class Map extends Mapper<LongWritable, Text, Text, Text>
   {
          private static final String DELIMITER = "####";
          private static final Pattern WORD_BOUNDARY = Pattern.compile("\\s*\\b\\s*");

          // Reused for every record instead of allocating new Text objects.
          private final Text outKey = new Text();
          private final Text outValue = new Text();

          @Override
          public void map(LongWritable offset, Text lineText, Context context) throws IOException, InterruptedException
          {
             String line = lineText.toString();

             // Look up the name of the file that this input split belongs to.
             String fileName = ((FileSplit) context.getInputSplit()).getPath().getName();
             outValue.set(fileName);

             for (String word : WORD_BOUNDARY.split(line))
             {
                if (word.isEmpty())
                {
                   continue;
                }
                // Emit "word####filename" as the key and the file name as the value.
                outKey.set(word + DELIMITER + fileName);
                context.write(outKey, outValue);
             }
          }
       }
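
One way to narrow this down is to check the key-building logic on its own, outside Hadoop. The sketch below is plain Java with no Hadoop dependency. It reuses the same WORD_BOUNDARY pattern and the "####" delimiter from the mapper; the class name DelimitedKeySketch, the helper delimitedKeys, and the sample file name file1.txt are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class DelimitedKeySketch {
    // Same word-boundary pattern as the mapper in the question.
    private static final Pattern WORD_BOUNDARY = Pattern.compile("\\s*\\b\\s*");

    // Hypothetical helper: builds the keys the mapper would emit for one input line.
    static List<String> delimitedKeys(String line, String fileName, String delimiter) {
        List<String> keys = new ArrayList<>();
        for (String word : WORD_BOUNDARY.split(line)) {
            if (word.isEmpty()) {
                continue; // the split can produce empty tokens between matches
            }
            keys.add(word + delimiter + fileName);
        }
        return keys;
    }

    public static void main(String[] args) {
        // "file1.txt" is a made-up file name standing in for FileSplit.getPath().getName().
        for (String key : delimitedKeys("hello world", "file1.txt", "####")) {
            System.out.println(key);
        }
    }
}
```

If this prints keys of the form word####filename, the string concatenation in the mapper is probably not the problem. In that case the cause may lie in code that is not shown in the question, for example a driver whose job.setMapperClass(...) call points at a different mapper, a reducer that rewrites the key, or a stale jar being submitted. These are assumptions, since the driver and reducer are not included in the post.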
