如何在配置单元中对视图运行查询?

pqwbnv8z  于 2021-05-31  发布在  Hadoop
关注(0)|答案(1)|浏览(435)

我们在视图上运行一个简单的select(其中包含大量数据),我们得到了“gc开销超过限制,内存不足错误。我们希望运行此查询,以便在此视图上运行的报表可以工作。它在泰兹身上运行。
查询运行4个多小时后失败。有没有什么方法可以运行这个查询,比如一些设置选项?
查询

select * from inc_cts.v_report_pub_view;

错误消息-

TaskAttempt 0 failed, info=
» Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Reduce operator initialization failed
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
  at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
  at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Reduce operator initialization failed
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:204)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
  ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
  at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:182)
  ... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
  at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
  ... 20 more
Caused by: java.lang.OutOfMemoryError: Java heap space
  at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.listRealloc(FlatRowContainer.java:259)
  at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.add(FlatRowContainer.java:86)
  at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:133)
  at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
  at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
  at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
  ... 4 more

TaskAttempt 1 killed
TaskAttempt 2 killed
TaskAttempt 3 failed, info=
» Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Reduce operator initialization failed
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
  at or
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
  at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:182)
  ... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
  at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
  ... 20 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
  at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.listRealloc(FlatRowContainer.java:259)
  at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.add(FlatRowContainer.java:86)
  at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:133)
  at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
  at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
  at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
  ... 4 more

TaskAttempt 4 killed
TaskAttempt 5 failed, info=
» Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Reduce operator initialization failed
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
  at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
  at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Reduce operator initialization failed
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:204)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
  ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
  at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:182)
  ... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
  at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
  ... 20 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
  at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.listRealloc(FlatRowContainer.java:259)
  at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.add(FlatRowContainer.java:86)
  at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:133)
  at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
  at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
  at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
  ... 4 more

TaskAttempt 6 failed, info=
» Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Reduce operator initialization failed
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
  at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
  at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Reduce operator initialization failed
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:204)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
  ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
  at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:182)
  ... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
  at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
  ... 20 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
  at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.listRealloc(FlatRowContainer.java:259)
  at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.add(FlatRowContainer.java:86)
  at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:133)
  at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
  at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
  at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
  ... 4 more
bmp9r5qi

bmp9r5qi1#

根据日志,例外是 OutOfMemoryError: GC overhead limit exceeded 在mapjoin hashtableloader中。
检查当前设置并相应增加:

set hive.tez.container.size=4096MB; 
set hive.auto.convert.join.noconditionaltask.size=1370MB --recommended one third of container size

尝试使用内存优化哈希表:

set hive.mapjoin.optimized.hashtable=true;
set hive.mapjoin.optimized.hashtable.wbsize=10485760; --Default Value (10 * 1024 * 1024)
--Optimized hashtable uses a chain of buffers to store data. This is one buffer size.

最后,如果没有任何帮助,可以关闭mapjoin:

set hive.auto.convert.join=false;

相关问题