在尝试运行配置单元查询以将数据插入配置单元外部表时,我遇到了一个问题。进程在reduce处失败。
诊断控制台消息如下:
Task with the most failures(4):
-----
Task ID:
task_201709171147_0059_r_000005
URL:
http://localhost:50030/taskdetails.jsp?jobid=job_201709171147_0059&tipid=task_201709171147_0059_r_000005
-----
Diagnostic Messages for this Task:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hive-gms/hive_2017-09-17_15-05-39_454_6420114297268603231/_task_tmp.-mr-10007/_tmp.000005_3 could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:270)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:522)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hive-gms/hive_2017-09-17_15-05-39_454_6420114297268603231/_task_tmp.-mr-10007/_tmp.000005_3 could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:620)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1061)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.flush(GroupByOperator.java:1027)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:866)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:751)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:819)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.createForwardJoinObject(CommonJoinOperator.java:392)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:715)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:697)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:697)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:697)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:697)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:697)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:851)
at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:266)
at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:200)
... 7 more
Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hive-gms/hive_2017-09-17_15-05-39_454_6420114297268603231/_task_tmp.-mr-10007/_tmp.000005_3 could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
at org.apache.hadoop.ipc.Client.call(Client.java:1113)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at $Proxy2.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
at $Proxy2.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3720)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3580)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2783)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3023)
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 60 Reduce: 16 Cumulative CPU: 1488.73 sec HDFS Read: 15611147485 HDFS Write: 6697774964 SUCCESS
Job 1: Map: 41 Reduce: 11 Cumulative CPU: 2231.76 sec HDFS Read: 10853652873 HDFS Write: 8893984114 SUCCESS
Job 2: Map: 31 Reduce: 10 Cumulative CPU: 2426.38 sec HDFS Read: 9165529191 HDFS Write: 8927047843 SUCCESS
Job 3: Map: 33 Reduce: 10 Cumulative CPU: 2469.67 sec HDFS Read: 9198570523 HDFS Write: 9003617537 SUCCESS
Job 4: Map: 33 Reduce: 10 Cumulative CPU: 2519.58 sec HDFS Read: 9275146392 HDFS Write: 9032825573 SUCCESS
Job 5: Map: 33 Reduce: 10 Cumulative CPU: 2607.76 sec HDFS Read: 9304360085 HDFS Write: 9107047528 SUCCESS
Job 6: Map: 34 Reduce: 10 Cumulative CPU: 2624.55 sec HDFS Read: 9378577538 HDFS Write: 15269385865 FAIL
Total MapReduce CPU Time Spent: 0 days 4 hours 32 minutes 48 seconds 430 msec
来自jobtracker web ui的日志消息如下:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hive-gms/hive_2017-09-17_15-05-39_454_6420114297268603231/_task_tmp.-mr-10007/_tmp.000004_0 could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:270)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:522)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hive-gms/hive_2017-09-17_15-05-39_454_6420114297268603231/_task_tmp.-mr-10007/_tmp.000004_0 could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:620)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1061)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:885)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:753)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:819)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.createForwardJoinObject(CommonJoinOperator.java:392)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:715)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:697)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:697)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:697)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:697)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:697)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:851)
at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:266)
at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:200)
... 7 more
Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hive-gms/hive_2017-09-17_15-05-39_454_6420114297268603231/_task_tmp.-mr-10007/_tmp.000004_0 could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
at org.apache.hadoop.ipc.Client.call(Client.java:1113)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at $Proxy2.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
at $Proxy2.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3720)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3580)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2783)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3023)
Last 4KB
Last 8KB
All
attempt_201709171147_0059_r_000005_0 task_201709171147_0059_r_000005 gmdatal01-p1.ilcb.tcprod.local FAILED
我已经为配置单元设置了动态分区:
set hive.exec.dynamic.partitions = true
set hive.exec.dynamic.partitions.mode = nonstrict
我还格式化了namenode,并再次尝试,但错误不会消失。
1条答案
按热度按时间fhg3lkii1#
问题的根本原因似乎与空间有关。我使用了一个连接几个大表的查询,由于数据是倾斜的,hive开始创建大的临时表,因为磁盘上没有空间。我将配置文件恢复为默认值,并压缩中间结果(配置文件)以获得结果。