Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable

bttbmeg0  posted 2021-06-01  in  Hadoop
Follow (0) | Answers (1) | Views (317)

I am trying to write a MapReduce job in Java; my files are below.
Mapper class (bmapper):

public class bmapper extends Mapper<LongWritable,Text,Text,NullWritable>{
    private String txt=new String();
    public void mapper(LongWritable key,Text value,Context context) 
        throws IOException, InterruptedException{

        String str =value.toString(); 
        int index1 = str.indexOf("TABLE OF CONTENTS");
        int index2 = str.indexOf("</table>");
        int index3 = str.indexOf("MANAGEMENT'S DISCUSSION AND ANALYSIS");

        if(index1 == -1)    
        {   txt ="nil";
        }
        else
        {
           if(index1<index3 && index2>index3)
           {
               int index4 = index3+ 109;
              int pageno =str.charAt(index4);
              String[] pages =str.split("<page>");
             txt = pages[pageno+1];
           }
           else
           {
               txt ="nil";

           }
        }

        context.write(new Text(txt), NullWritable.get());
    } 

}

Reducer class (breducer):

public class breducer extends Reducer<Text,NullWritable,Text,NullWritable>{

    public void reducer(Text key,NullWritable value,Context context) throws IOException,InterruptedException{

        context.write(key, value);

    }

}

Driver class (bdriver):

public class bdriver {

    public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
        Configuration conf = new Configuration();
        Job job = new Job(conf);
        job.setJobName("black coffer");
        job.setJarByClass(bdriver.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(NullWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        job.setReducerClass(breducer.class);
        job.setMapperClass(bmapper.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.setInputPaths(job, new Path[]{new Path(args[0])});
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
  }
}

I get the following error:

[training@localhost ~]$ hadoop jar blackcoffer.jar com.test.bdriver /page1.txt /MROUT4
18/03/16 04:38:56 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
18/03/16 04:38:57 INFO input.FileInputFormat: Total input paths to process : 1
18/03/16 04:38:57 WARN snappy.LoadSnappy: Snappy native library is available
18/03/16 04:38:57 INFO util.NativeCodeLoader: Loaded the native-hadoop library
18/03/16 04:38:57 INFO snappy.LoadSnappy: Snappy native library loaded
18/03/16 04:38:57 INFO mapred.JobClient: Running job: job_201803151041_0007
18/03/16 04:38:58 INFO mapred.JobClient:  map 0% reduce 0%
18/03/16 04:39:03 INFO mapred.JobClient: Task Id : attempt_201803151041_0007_m_000000_0, Status : FAILED
java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:871)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:574)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)

I think it is unable to find my Mapper and Reducer classes. I have written the code in the classes above, but it seems to be using the default Mapper and Reducer classes.

nukf8bse (answer 1):

Your input/output types appear to be compatible with the job configuration.
Adding the problem details and solution here (per the discussion in the comments, the OP confirmed the issue is resolved).
According to the Javadoc, the Reducer's reduce method has the following signature:

protected void reduce(KEYIN key,
          Iterable<VALUEIN> values,
          org.apache.hadoop.mapreduce.Reducer.Context context)
               throws IOException,
                      InterruptedException

Accordingly, your reducer should be:

public class breducer extends Reducer<Text,NullWritable,Text,NullWritable>{
    @Override
    public void reduce(Text key, Iterable<NullWritable> values, Context context) throws IOException, InterruptedException {
        // Your logic
    }
}

The problem is that your map() and reduce() methods were never actually overridden. Because the method names (mapper/reducer) and, in the reducer's case, the parameter types did not match the base-class methods, you were merely overloading with similar names, so Hadoop fell back to the default identity Mapper.map(), which emits the (LongWritable, Text) input pair unchanged and causes the type mismatch.
The problem was caught after adding the @Override annotation on the map() and reduce() functions. Although it is not mandatory, as a best practice always add the @Override annotation to your implemented map() and reduce() methods; the compiler will then reject any method that does not actually override one from the superclass.
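The override-vs-overload distinction can be shown without Hadoop at all. This is a minimal standalone sketch (the class and method names here are invented for illustration, not part of the Hadoop API): a subclass method with a different name does not replace the base-class method, so callers holding a base-class reference still get the default behavior, exactly as the framework did with the misnamed mapper method above.

```java
// Illustration of the bug: naming the method "mapper" instead of "map"
// adds a new method rather than overriding, so the default still runs.
class BaseMapper {
    // Stands in for the framework's default (identity) implementation.
    public String map(String value) { return "identity:" + value; }
}

class BrokenMapper extends BaseMapper {
    // Different name: this is a NEW method, not an override of map().
    public String mapper(String value) { return "custom:" + value; }
}

class FixedMapper extends BaseMapper {
    @Override  // compiler verifies this really overrides BaseMapper.map()
    public String map(String value) { return "custom:" + value; }
}

public class OverrideDemo {
    public static void main(String[] args) {
        BaseMapper broken = new BrokenMapper();
        BaseMapper fixed = new FixedMapper();
        // The framework only ever calls map() through the base type:
        System.out.println(broken.map("x")); // default runs: identity:x
        System.out.println(fixed.map("x"));  // override runs: custom:x
    }
}
```

Putting @Override on BrokenMapper.mapper() would turn this silent logic error into a compile-time error, which is why the annotation is worth adding even though it is optional.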
