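I am trying to build and run the following GitHub project: https://github.com/digitalpebble/behemoth/tree/master/uima
I get the following error: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text
The code sets the following output key and value classes, where BehemothDocument is a custom class: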
job.setInputFormat(SequenceFileInputFormat.class);
job.setOutputFormat(SequenceFileOutputFormat.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(BehemothDocument.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(BehemothDocument.class);
The mapper class is declared as follows:
public class UIMAMapper extends MapReduceBase implements
Mapper<Text, BehemothDocument, Text, BehemothDocument> {
The map function is as follows:
public void map(Text id, BehemothDocument behemoth,
OutputCollector<Text, BehemothDocument> output, Reporter reporter)
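I have seen several answers on Stack Overflow for this error that suggest changing the mapper's key and value types, which I do not want to do. I want to know how to use the custom class.
Please help. The stack trace is below: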
java.lang.Exception: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text
at UIMAPackage.UIMAMapper.map(UIMAMapper.java:35)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
1 Answer
fquxozlt #1
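The exception shows that the records reaching the mapper have LongWritable keys. Use LongWritable as the mapper's input key type instead of Text. That should work.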
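To illustrate why the cast fails inside map() rather than at compile time, here is a minimal sketch in plain Java. It does not use Hadoop: the nested LongWritable, Text, and Mapper types are hypothetical stand-ins for the real classes, and run() only imitates how a runner hands keys to a mapper through an erased generic reference. It assumes the input really delivers LongWritable keys, as the stack trace suggests.

```java
import java.util.ArrayList;
import java.util.List;

public class KeyTypeMismatchDemo {
    // Hypothetical stand-in for org.apache.hadoop.io.LongWritable.
    public static final class LongWritable {
        final long value;
        public LongWritable(long value) { this.value = value; }
    }

    // Hypothetical stand-in for org.apache.hadoop.io.Text.
    public static final class Text {
        final String value;
        public Text(String value) { this.value = value; }
    }

    // Simplified stand-in for the old-API Mapper interface (input side only).
    public interface Mapper<K, V> {
        void map(K key, V value, List<String> output);
    }

    // Mirrors the question's mapper: it declares Text keys.
    public static final class TextKeyMapper implements Mapper<Text, String> {
        @Override
        public void map(Text key, String value, List<String> output) {
            output.add(key.value + "=" + value);
        }
    }

    // Mirrors the answer's suggestion: it declares LongWritable keys.
    public static final class LongKeyMapper implements Mapper<LongWritable, String> {
        @Override
        public void map(LongWritable key, String value, List<String> output) {
            output.add(key.value + "=" + value);
        }
    }

    // The mapper is called through a raw reference, so the key type is only
    // checked at run time, when the compiler-generated bridge method casts it.
    @SuppressWarnings({"unchecked", "rawtypes"})
    public static String run(Mapper mapper, Object key, String value) {
        List<String> output = new ArrayList<>();
        try {
            mapper.map(key, value, output);
            return output.get(0);
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        LongWritable offset = new LongWritable(0L);
        System.out.println(run(new TextKeyMapper(), offset, "doc"));               // ClassCastException
        System.out.println(run(new LongKeyMapper(), offset, "doc"));               // 0=doc
        System.out.println(run(new TextKeyMapper(), new Text("id-1"), "doc"));     // id-1=doc
    }
}
```

The third call shows the other way out: the original Text signature works as long as the input records actually carry Text keys. With SequenceFileInputFormat the key and value classes come from the sequence file itself, so it may be worth checking which classes the input file was written with before changing the mapper.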