我正在将s3中的一个日志文件加载到运行在emr上的配置单元中,但是在查看数据时,我会将所有空值都取回。。。
我创建了如下表:
create external table coglogs (
HostID string,
ProcessID string,
Time string,
TimeZoneOffset string,
SessionID string,
RequestID string,
SubRequestID string,
StepID string,
Thread string,
Component string,
BuildNumber string,
Level string,
Logger string,
Operation string,
ObjectType string,
ObjectPath string,
Status string,
Message string,
LogData string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex" = "([\d+]\S+[\d+])\t(\d+)\t([\d+]\S+[\d+] [\d+]\S+[\d+])\t(-[\d+])\t([a-zA-Z0-9_\S]*)\t([a-zA-Z0-9_\S]*)\t([a-zA-Z0-9_\S]*)\t([a-zA-Z0-9_\S]*)\t([a-zA-Z0-9_\S]*)\t([a-zA-Z_\S]*)\t([0-9]*)\t([0-9]*)\t([a-zA-Z_\S]*)\t([a-zA-Z_\S]*)\t([a-zA-Z_\S ]*)\t([a-zA-Z_\S ]*)\t([a-zA-Z_\S ]*)\t([a-zA-Z_\S ]*)\t([a-zA-Z_\S ]*)",
"output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s %10$s %11$s %12$s %13$s %14$s %15$s %16$s %17$s %18$s %19$s"
)
location 's3n://infinilog/cogserver_hive.log';
然后加载数据:
load data inpath 's3n://infinilog/cogserver.log' into table coglogs;
我检查了regex rubular.com,它似乎是正确的,但为什么我没有得到任何数据回Hive表?
下面是日志文件中的一行示例:
101.196.242.160:9300 11329 2016-01-27 18:35:14.132 -5 http-9300-297 caf 2047 2 Audit.dispatcher.caf Request Warning secure error - found userCapabilities
101.196.242.160:9300 11329 2016-01-27 18:35:14.195 -5 F3820773ADD59754DEAFAFDFA0D2C3F2CDDF715483446C99B16BFA8040D0DDA4 9ss8Csyswh28w8sGC4hhvvMy2hw9d48Cd42y92y8 http-9300-299 CM 6102 3 Audit.Other.cms.CM QUERY REPORT /Public Folders/1A Reporting/PT/Reports/Summary Dashboard Success
编辑:
所以表中有数据(但我看到很多行的所有列值都是“none”),但是当我运行这样的操作时:
select count(*)
from coglogs
where status = 'Failure'
或者任何疑问,我都会回复:
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:172)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:166)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:42)
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
... 17 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:153)
... 22 more
Caused by: java.lang.NoSuchFieldError: stringTypeInfo
at org.apache.hadoop.hive.contrib.serde2.RegexSerDe.initialize(RegexSerDe.java:115)
at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
at org.apache.hadoop.hive.ql.plan.PartitionDesc.getDeserializer(PartitionDesc.java:138)
at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:297)
at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:333)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:122)
... 22 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
1条答案
按热度按时间mm9b1k5b1#
在hive中,您必须在regex中使用“\”而不是“\”