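I am using Hadoop 1.2.1 and running a map-only job that essentially maps log entries into a MySQL table. One of the extracted fields is the IP address, which is sometimes longer than the column in the table, and an IOException occurs. Despite the try-catch clause in the map function, I am unable to catch and handle it. The code is as follows: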
    import java.io.IOException;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class LogEntriesMapper extends
            Mapper<Object, Text, LogEntry, NullDBWritable> {

        private static Pattern p1 = Pattern
                .compile([…]);
        private final static NullDBWritable nullDB = new NullDBWritable();
        // Reused for every record to avoid allocating a new object per line.
        private LogEntry logEntry = new LogEntry();

        @Override
        protected void setup(Context context) throws IOException,
                InterruptedException {
            super.setup(context);
            p1 = Pattern.compile([…]);
        }

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String entry = value.toString();
            Matcher matcher = p1.matcher(entry);
            if (matcher.find()) {
                String date = ...
                String ip = ...
                // [extracting fields]
                logEntry.setDate(date);
                logEntry.setIp(ip);
                logEntry.setClient(client);
                logEntry.setSession(session);
                logEntry.setReal_time(real_time);
                try {
                    context.write(logEntry, nullDB);
                } catch (IOException e) {
                    System.out.println("Failed to save entry: " + logEntry);
                    System.out.println(e.getMessage());
                }
            }
        }
    }
And the syslog:
2014-01-07 15:38:08,908 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-01-07 15:38:09,233 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2014-01-07 15:38:09,330 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : null
2014-01-07 15:38:09,354 INFO org.apache.hadoop.mapred.MapTask: Processing split: hdfs://master:9000/logs/20130718.txt:0+49101
2014-01-07 15:38:09,672 INFO com.hadoop.compression.lzo.GPLNativeCodeLoader: Loaded native gpl library
2014-01-07 15:38:09,675 INFO com.hadoop.compression.lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev fbd3aa777e0ad06bce75c6aff8c91c7c68eb596b]
2014-01-07 15:38:09,788 WARN org.apache.hadoop.mapreduce.lib.db.DBOutputFormat: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after connection closed.
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
    at com.mysql.jdbc.Util.getInstance(Util.java:386)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1015)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:989)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:975)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:920)
    at com.mysql.jdbc.ConnectionImpl.throwConnectionClosedException(ConnectionImpl.java:1304)
    at com.mysql.jdbc.ConnectionImpl.checkClosed(ConnectionImpl.java:1296)
    at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:5028)
    at org.apache.hadoop.mapreduce.lib.db.DBOutputFormat$DBRecordWriter.close(DBOutputFormat.java:98)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:650)
    at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:1793)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:779)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
2014-01-07 15:38:09,789 INFO org.apache.hadoop.mapred.MapTask: Ignoring exception during close for org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector@6dac133e
java.io.IOException: No operations allowed after statement closed.
    at org.apache.hadoop.mapreduce.lib.db.DBOutputFormat$DBRecordWriter.close(DBOutputFormat.java:103)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:650)
    at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:1793)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:779)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
2014-01-07 15:38:09,823 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2014-01-07 15:38:09,855 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:mateuszmurawski cause:java.io.IOException: Data truncation: Data too long for column 'ip' at row 1
2014-01-07 15:38:09,855 WARN org.apache.hadoop.mapred.Child: Error running child
java.io.IOException: Data truncation: Data too long for column 'ip' at row 1
    at org.apache.hadoop.mapreduce.lib.db.DBOutputFormat$DBRecordWriter.close(DBOutputFormat.java:103)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:650)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
2014-01-07 15:38:09,861 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
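One thing the stack trace suggests: the "Data too long" IOException is raised from DBOutputFormat$DBRecordWriter.close(), which the framework calls from runNewMapper after the mapper has finished, so it never passes through the try-catch around context.write in map(). A possible workaround is to check the field length before writing. The following is only a minimal sketch of such a check in plain Java, without the Hadoop classes. The class name IpColumnGuard, the method fitsColumn and the width MAX_IP_LENGTH = 15 are hypothetical and would need to match the real definition of the `ip` column.

```java
// Hypothetical helper: decides whether a value fits a fixed-width column
// before it is handed to the record writer.
final class IpColumnGuard {

    // Assumed column width; replace with the actual length of the `ip` column.
    static final int MAX_IP_LENGTH = 15;

    // Returns true only for non-null values no longer than maxLength characters.
    static boolean fitsColumn(String value, int maxLength) {
        return value != null && value.length() <= maxLength;
    }

    public static void main(String[] args) {
        String[] samples = {
            "192.168.0.1",
            "2001:0db8:85a3:0000:0000:8a2e:0370:7334"
        };
        for (String ip : samples) {
            if (fitsColumn(ip, MAX_IP_LENGTH)) {
                // In the mapper this is where context.write(logEntry, nullDB) would go.
                System.out.println("write: " + ip);
            } else {
                // In the mapper the entry could be logged, counted, or truncated here.
                System.out.println("skip: " + ip);
            }
        }
    }
}
```

In the mapper, the same condition could wrap the context.write call so that oversized values never reach the batch that is executed on close.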