我试着用java编写一个mapreduce代码,下面是我的文件。
Map器类(bmapper):
public class bmapper extends Mapper<LongWritable,Text,Text,NullWritable>{
private String txt=new String();
public void mapper(LongWritable key,Text value,Context context)
throws IOException, InterruptedException{
String str =value.toString();
int index1 = str.indexOf("TABLE OF CONTENTS");
int index2 = str.indexOf("</table>");
int index3 = str.indexOf("MANAGEMENT'S DISCUSSION AND ANALYSIS");
if(index1 == -1)
{ txt ="nil";
}
else
{
if(index1<index3 && index2>index3)
{
int index4 = index3+ 109;
int pageno =str.charAt(index4);
String[] pages =str.split("<page>");
txt = pages[pageno+1];
}
else
{
txt ="nil";
}
}
context.write(new Text(txt), NullWritable.get());
}
}
减速器等级(减速器):
public class breducer extends Reducer<Text,NullWritable,Text,NullWritable>{
public void reducer(Text key,NullWritable value,Context context) throws IOException,InterruptedException{
context.write(key, value);
}
}
驾驶员等级(bdriver):
public class bdriver {
public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
Configuration conf = new Configuration();
Job job = new Job(conf);
job.setJobName("black coffer");
job.setJarByClass(bdriver.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(NullWritable.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(NullWritable.class);
job.setReducerClass(breducer.class);
job.setMapperClass(bmapper.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
FileInputFormat.setInputPaths(job, new Path[]{new Path(args[0])});
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.waitForCompletion(true);
}
}
`
我有以下错误。
[training@localhost ~]$ hadoop jar blackcoffer.jar com.test.bdriver /page1.txt /MROUT4
18/03/16 04:38:56 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
18/03/16 04:38:57 INFO input.FileInputFormat: Total input paths to process : 1
18/03/16 04:38:57 WARN snappy.LoadSnappy: Snappy native library is available
18/03/16 04:38:57 INFO util.NativeCodeLoader: Loaded the native-hadoop library
18/03/16 04:38:57 INFO snappy.LoadSnappy: Snappy native library loaded
18/03/16 04:38:57 INFO mapred.JobClient: Running job: job_201803151041_0007
18/03/16 04:38:58 INFO mapred.JobClient: map 0% reduce 0%
18/03/16 04:39:03 INFO mapred.JobClient: Task Id : attempt_201803151041_0007_m_000000_0, Status : FAILED
java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:871)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:574)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
我认为它无法找到Map器和还原器类。我已经在主类中编写了代码,它是默认的Map器和还原器类
1条答案
按热度按时间nukf8bse1#
您的输入/输出类型似乎与作业配置兼容。
在此处添加问题细节和解决方案(根据评论中的讨论,op确认问题已解决)。
根据javadoc,reducer的reduce方法具有以下签名
根据它,减速机应该是
问题是因为
map()
以及reduce()
方法,这些方法实际上没有overriden
. 只是overloading
相同的方法名。这个问题是在
@Override
上的注解map()
以及reduce()
功能。虽然它不是强制性的,但作为一种最佳实践,总是添加@Override
已实现的注解map()
以及reduce()
方法。