Map output records are zero — no errors, but the mapper still produces no output (MapReduce)

zzlelutf · posted 2021-06-02 in Hadoop

I am trying to process text with Hadoop MapReduce to extract parts of speech. There are no errors, but the map function still produces no output.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class POSCount {
  public static class TokenizerMapper
      extends Mapper<LongWritable, Text, IntWritable, Text> {

    private Text word = new Text();
    private Map<String, String> wordList = null;

    @Override
    public void setup(Context context) {
      Configuration conf = context.getConfiguration();
      Path pt = new Path("/user/gokul/hw1b/mobyposi.i");
      BufferedReader br;
      try {
        FileSystem fs = FileSystem.get(conf);
        br = new BufferedReader(new InputStreamReader(fs.open(pt)));
        wordList = new HashMap<String, String>();
        String line, word, type;
        char ch;
        // Each line is: word, '×' delimiter, then one character per POS tag.
        while ((line = br.readLine()) != null) {
          word = line.substring(0, line.indexOf("×"));
          type = line.substring(line.indexOf("×") + 1);
          for (int i = 0; i < type.length(); i++) {
            ch = type.charAt(i);
            switch (ch) {
              case 'N': wordList.put(word, "noun");               break;
              case 'p': wordList.put(word, "plural");             break;
              case 'V': wordList.put(word, "verb");               break;
              case 't': wordList.put(word, "verb");               break;
              case 'i': wordList.put(word, "verb");               break;
              case 'A': wordList.put(word, "adjective");          break;
              case 'v': wordList.put(word, "adverb");             break;
              case 'C': wordList.put(word, "conjunction");        break;
              case 'P': wordList.put(word, "preposition");        break;
              case 'r': wordList.put(word, "pronoun");            break;
              case 'D': wordList.put(word, "definite article");   break;
              case 'I': wordList.put(word, "indefinite article"); break;
              case 'o': wordList.put(word, "nominative");         break;
            }
          }
        }
      } catch (Exception e) {
        e.printStackTrace();
      }
    }

    public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      String token;
      int len = 0;
      try {
        while (itr.hasMoreTokens()) {
          token = itr.nextToken().trim().toLowerCase();
          len = token.length();
          // Emit (length, part of speech) for known words of length >= 5.
          if (wordList.containsKey(token) && len >= 5) {
            word.set(wordList.get(token));
            context.write(new IntWritable(len), word);
          }
        }
      } catch (Exception e) {
        e.printStackTrace();
      }
    }
  }
}

The program above returns zero map output records. I need (length, part of speech) as the map output. Please tell me where I went wrong.
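One way to narrow down a silent-zero-output bug like this is to sanity-check the `setup()` parsing logic outside Hadoop. The sketch below mirrors the switch statement from the mapper in a standalone class; the sample line `abandon×Vt` and the `tagName` helper are made up for illustration, not part of the original program:

```java
import java.util.HashMap;
import java.util.Map;

public class PosParse {
    // Mirrors the switch in setup(): Moby POS tag character -> name.
    static String tagName(char ch) {
        switch (ch) {
            case 'N': return "noun";
            case 'p': return "plural";
            case 'V': case 't': case 'i': return "verb";
            case 'A': return "adjective";
            case 'v': return "adverb";
            case 'C': return "conjunction";
            case 'P': return "preposition";
            case 'r': return "pronoun";
            case 'D': return "definite article";
            case 'I': return "indefinite article";
            case 'o': return "nominative";
            default:  return null;
        }
    }

    public static void main(String[] args) {
        Map<String, String> wordList = new HashMap<>();
        String line = "abandon×Vt";  // hypothetical entry: word, '×', tag characters
        int sep = line.indexOf('×');
        if (sep < 0) {
            // In the real mapper this case throws inside substring() and is
            // swallowed by the catch block, leaving wordList empty.
            System.out.println("delimiter not found -- check file encoding");
            return;
        }
        String word = line.substring(0, sep);
        String type = line.substring(sep + 1);
        for (int i = 0; i < type.length(); i++) {
            String name = tagName(type.charAt(i));
            if (name != null) wordList.put(word, name);  // last tag wins, as in the original
        }
        System.out.println(wordList);
    }
}
```

If this prints the expected entry for a line copied from your actual file, the parsing is fine and the problem lies in how the file's bytes reach the reader.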


mlmc2os5 (answer 1)

Your program is fine. I ran it once on some samples from the file you mentioned in the comments, and it gave me the expected output. But when I ran the same program on the whole file, it gave me an error because of a symbol problem: it could not read the "×" delimiter from the input file in a format Java supports. So I copied all the data from that file into a new file (stack.txt), ran the program again, and it produced output.
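The "symbol problem" is reproducible outside Hadoop. The delimiter "×" (U+00D7) is a non-ASCII character, so if the file's bytes are decoded with the wrong charset, the character never appears in the decoded string: `indexOf` returns -1, `substring(0, -1)` throws, and the `catch` block in `setup()` swallows the exception, leaving `wordList` empty and the mapper with nothing to emit. A minimal demonstration, assuming a UTF-8-encoded line read back with a Latin-1 decoder (the sample line is made up):

```java
import java.nio.charset.StandardCharsets;

public class DelimiterCheck {
    public static void main(String[] args) {
        String line = "abandon×V";  // hypothetical entry: word, '×', tag characters

        // Encode as UTF-8, then decode with the wrong charset (Latin-1):
        // the two UTF-8 bytes of '×' (0xC3 0x97) come back as two other characters.
        byte[] utf8Bytes = line.getBytes(StandardCharsets.UTF_8);
        String misdecoded = new String(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println(line.indexOf('×'));        // 7: delimiter found
        System.out.println(misdecoded.indexOf('×'));  // -1: delimiter lost
    }
}
```

Passing an explicit charset to the reader, e.g. `new InputStreamReader(fs.open(pt), StandardCharsets.UTF_8)` with whatever encoding the file actually uses, or checking the result of `indexOf` before calling `substring`, turns this silent failure into a visible one.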
I used the same mapper as given in your question; this is the reducer I used:

// Ignore the logic -- it was just for testing
public void reduce(IntWritable key, Iterable<Text> values, Context context)
    throws IOException, InterruptedException {
  Text t = new Text();
  for (Text i : values) {
    t = i;  // keep only the last part-of-speech value seen for this length
  }
  context.write(t, key);
}

This is the result I got:

verb    5
verb    6
noun    7
noun    8
noun    9
adjective       10
noun    11
noun    12
noun    13
adjective       14
adjective       15
adjective       16
noun    17
noun    18
adjective       19

Let me know if you are still facing any problems.
