Hive Runtime Error: Map local work exhausted memory

vngu2lb8  posted on 2021-06-26  in Hive

I am trying to join two ORC tables in Hive, but I get an error. Here is the query:

select t1.num as num,
       t1.product as Product,
       t2.value as OldValue,
       t1.value as NewValue
from test_new t1
LEFT OUTER JOIN test_old t2
  ON t1.num=t2.num and t1.product=t2.product
where t2.value is NULL and t1.value is not NULL or t1.value<>t2.value;

Error:

2017-05-29 11:19:27,157 INFO  [main]: mr.ExecDriver (SessionState.java:printInfo(911)) - Execution log at: /tmp/alex/kaliamoorthya_20170529111919_6621dd64-7a5e-4411-abda-b28fddab8bdc.log
2017-05-29 11:19:27,320 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(118)) - <PERFLOG method=deserializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
2017-05-29 11:19:27,321 INFO  [main]: exec.Utilities (Utilities.java:deserializePlan(953)) - Deserializing MapredLocalWork via kryo
2017-05-29 11:19:27,462 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(158)) - </PERFLOG method=deserializePlan start=1496056767320 end=1496056767462 duration=142 from=org.apache.hadoop.hive.ql.exec.Utilities>
2017-05-29 11:19:27,472 INFO  [main]: mr.MapredLocalTask (SessionState.java:printInfo(911)) - 2017-05-29 11:19:27   Starting to launch local task to process map join;  maximum memory = 1908932608
2017-05-29 11:19:27,549 INFO  [main]: mr.MapredLocalTask (MapredLocalTask.java:initializeOperators(441)) - fetchoperator for t2 created
2017-05-29 11:19:27,550 INFO  [main]: exec.TableScanOperator (Operator.java:initialize(346)) - Initializing Self TS[0]
2017-05-29 11:19:27,550 INFO  [main]: exec.TableScanOperator (Operator.java:initializeChildren(419)) - Operator 0 TS initialized
2017-05-29 11:19:27,550 INFO  [main]: exec.TableScanOperator (Operator.java:initializeChildren(423)) - Initializing children of 0 TS
2017-05-29 11:19:27,550 INFO  [main]: exec.HashTableSinkOperator (Operator.java:initialize(458)) - Initializing child 1 HASHTABLESINK
2017-05-29 11:19:27,550 INFO  [main]: exec.HashTableSinkOperator (Operator.java:initialize(346)) - Initializing Self HASHTABLESINK[1]
2017-05-29 11:19:27,551 INFO  [main]: mapjoin.MapJoinMemoryExhaustionHandler (MapJoinMemoryExhaustionHandler.java:<init>(61)) - JVM Max Heap Size: 1908932608
2017-05-29 11:19:27,582 INFO  [main]: persistence.HashMapWrapper (HashMapWrapper.java:calculateTableSize(94)) - Key count from statistics is -1; setting map size to 100000
2017-05-29 11:19:27,582 INFO  [main]: exec.HashTableSinkOperator (Operator.java:initialize(394)) - Initialization Done 1 HASHTABLESINK
2017-05-29 11:19:27,582 INFO  [main]: exec.TableScanOperator (Operator.java:initialize(394)) - Initialization Done 0 TS
2017-05-29 11:19:27,582 INFO  [main]: mr.MapredLocalTask (MapredLocalTask.java:initializeOperators(461)) - fetchoperator for t2 initialized
2017-05-29 11:19:28,059 INFO  [main]: Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1174)) - mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
2017-05-29 11:19:28,062 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(118)) - <PERFLOG method=OrcGetSplits from=org.apache.hadoop.hive.ql.io.orc.ReaderImpl>
2017-05-29 11:19:28,098 INFO  [main]: orc.OrcInputFormat (OrcInputFormat.java:generateSplitsInfo(961)) - FooterCacheHitRatio: 0/4
2017-05-29 11:19:28,098 INFO  [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(158)) - </PERFLOG method=OrcGetSplits start=1496056768062 end=1496056768098 duration=36 from=org.apache.hadoop.hive.ql.io.orc.ReaderImpl>
2017-05-29 11:19:28,209 INFO  [main]: orc.OrcRawRecordMerger (OrcRawRecordMerger.java:<init>(430)) - min key = null, max key = null
2017-05-29 11:19:28,209 INFO  [main]: orc.ReaderImpl (ReaderImpl.java:rowsOptions(526)) - Reading ORC rows from hdfs://nameservice1/user/hive/warehouse/alex_tmp.db/test_old/000000_0 with {include: [true, true, true, true], offset: 0, length: 9223372036854775807}
2017-05-29 11:19:28,646 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:28   Processing rows:    200000  Hashtable size: 199999  Memory usage:   130784248   percentage: 0.069
2017-05-29 11:19:28,708 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:28   Processing rows:    300000  Hashtable size: 299999  Memory usage:   159462144   percentage: 0.084
2017-05-29 11:19:28,784 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:28   Processing rows:    400000  Hashtable size: 399999  Memory usage:   207258624   percentage: 0.109
2017-05-29 11:19:28,843 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:28   Processing rows:    500000  Hashtable size: 499999  Memory usage:   235936520   percentage: 0.124
2017-05-29 11:19:28,903 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:28   Processing rows:    600000  Hashtable size: 599999  Memory usage:   274173712   percentage: 0.144
2017-05-29 11:19:28,965 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:28   Processing rows:    700000  Hashtable size: 699999  Memory usage:   312410896   percentage: 0.164
2017-05-29 11:19:29,059 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:29   Processing rows:    800000  Hashtable size: 799999  Memory usage:   359036720   percentage: 0.188
2017-05-29 11:19:29,126 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:29   Processing rows:    900000  Hashtable size: 899999  Memory usage:   397273912   percentage: 0.208
2017-05-29 11:19:29,196 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:29   Processing rows:    1000000 Hashtable size: 999999  Memory usage:   425951800   percentage: 0.223
2017-05-29 11:19:29,263 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:29   Processing rows:    1100000 Hashtable size: 1099999 Memory usage:   464188992   percentage: 0.243
2017-05-29 11:19:29,333 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:29   Processing rows:    1200000 Hashtable size: 1199999 Memory usage:   502426176   percentage: 0.263
2017-05-29 11:19:29,401 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:29   Processing rows:    1300000 Hashtable size: 1299999 Memory usage:   540663360   percentage: 0.283
2017-05-29 11:19:32,752 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:32   Processing rows:    1400000 Hashtable size: 1399999 Memory usage:   485809696   percentage: 0.254
2017-05-29 11:19:32,817 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:32   Processing rows:    1500000 Hashtable size: 1499999 Memory usage:   524582216   percentage: 0.275
2017-05-29 11:19:32,937 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:32   Processing rows:    1600000 Hashtable size: 1599999 Memory usage:   580131976   percentage: 0.304
2017-05-29 11:19:32,998 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:32   Processing rows:    1700000 Hashtable size: 1699999 Memory usage:   618904496   percentage: 0.324
2017-05-29 11:19:33,061 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:33   Processing rows:    1800000 Hashtable size: 1799999 Memory usage:   647983888   percentage: 0.339
2017-05-29 11:19:33,124 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:33   Processing rows:    1900000 Hashtable size: 1899999 Memory usage:   686756400   percentage: 0.36
2017-05-29 11:19:33,188 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:33   Processing rows:    2000000 Hashtable size: 1999999 Memory usage:   725528920   percentage: 0.38
2017-05-29 11:19:33,253 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:33   Processing rows:    2100000 Hashtable size: 2099999 Memory usage:   764301440   percentage: 0.40
2017-05-29 11:19:33,316 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:33   Processing rows:    2200000 Hashtable size: 2199999 Memory usage:   793380824   percentage: 0.416
2017-05-29 11:19:33,380 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:33   Processing rows:    2300000 Hashtable size: 2299999 Memory usage:   832153336   percentage: 0.436
2017-05-29 11:19:33,445 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:33   Processing rows:    2400000 Hashtable size: 2399999 Memory usage:   870925856   percentage: 0.456
2017-05-29 11:19:33,510 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:33   Processing rows:    2500000 Hashtable size: 2499999 Memory usage:   909698376   percentage: 0.477
2017-05-29 11:19:33,574 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:33   Processing rows:    2600000 Hashtable size: 2599999 Memory usage:   938777776   percentage: 0.492
2017-05-29 11:19:38,930 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:38   Processing rows:    2700000 Hashtable size: 2699999 Memory usage:   924140056   percentage: 0.484
2017-05-29 11:19:38,996 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:38   Processing rows:    2800000 Hashtable size: 2799999 Memory usage:   960610440   percentage: 0.503
2017-05-29 11:19:39,063 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:39   Processing rows:    2900000 Hashtable size: 2899999 Memory usage:   997080808   percentage: 0.522
2017-05-29 11:19:39,134 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:39   Processing rows:    3000000 Hashtable size: 2999999 Memory usage:   1033551200  percentage: 0.541
2017-05-29 11:19:39,203 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:39   Processing rows:    3100000 Hashtable size: 3099999 Memory usage:   1070021576  percentage: 0.561
2017-05-29 11:19:39,392 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:39   Processing rows:    3200000 Hashtable size: 3199999 Memory usage:   1140046400  percentage: 0.597
2017-05-29 11:19:39,456 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:39   Processing rows:    3300000 Hashtable size: 3299999 Memory usage:   1176516784  percentage: 0.616
2017-05-29 11:19:39,519 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:39   Processing rows:    3400000 Hashtable size: 3399999 Memory usage:   1212987168  percentage: 0.635
2017-05-29 11:19:39,583 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:39   Processing rows:    3500000 Hashtable size: 3499999 Memory usage:   1249457552  percentage: 0.655
2017-05-29 11:19:39,646 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:39   Processing rows:    3600000 Hashtable size: 3599999 Memory usage:   1285927936  percentage: 0.674
2017-05-29 11:19:39,710 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:39   Processing rows:    3700000 Hashtable size: 3699999 Memory usage:   1322398320  percentage: 0.693
2017-05-29 11:19:39,774 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:39   Processing rows:    3800000 Hashtable size: 3799999 Memory usage:   1358868704  percentage: 0.712
2017-05-29 11:19:39,837 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:39   Processing rows:    3900000 Hashtable size: 3899999 Memory usage:   1395339088  percentage: 0.731
2017-05-29 11:19:39,904 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:39   Processing rows:    4000000 Hashtable size: 3999999 Memory usage:   1431809456  percentage: 0.75
2017-05-29 11:19:39,973 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:39   Processing rows:    4100000 Hashtable size: 4099999 Memory usage:   1468279832  percentage: 0.769
2017-05-29 11:19:40,041 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:40   Processing rows:    4200000 Hashtable size: 4199999 Memory usage:   1504750200  percentage: 0.788
2017-05-29 11:19:40,113 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:40   Processing rows:    4300000 Hashtable size: 4299999 Memory usage:   1538933512  percentage: 0.806
2017-05-29 11:19:48,786 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:48   Processing rows:    4400000 Hashtable size: 4399999 Memory usage:   1496365384  percentage: 0.784
2017-05-29 11:19:48,850 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:48   Processing rows:    4500000 Hashtable size: 4499999 Memory usage:   1532580448  percentage: 0.803
2017-05-29 11:19:48,915 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:48   Processing rows:    4600000 Hashtable size: 4599999 Memory usage:   1568795512  percentage: 0.822
2017-05-29 11:19:48,979 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:48   Processing rows:    4700000 Hashtable size: 4699999 Memory usage:   1605010584  percentage: 0.841
2017-05-29 11:19:49,044 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:49   Processing rows:    4800000 Hashtable size: 4799999 Memory usage:   1641225648  percentage: 0.86
2017-05-29 11:19:49,108 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:49   Processing rows:    4900000 Hashtable size: 4899999 Memory usage:   1677440712  percentage: 0.879
2017-05-29 11:19:49,171 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:49   Processing rows:    5000000 Hashtable size: 4999999 Memory usage:   1713655784  percentage: 0.898
2017-05-29 11:19:49,235 INFO  [main]: exec.HashTableSinkOperator (SessionState.java:printInfo(911)) - 2017-05-29 11:19:49   Processing rows:    5100000 Hashtable size: 5099999 Memory usage:   1749870856  percentage: 0.917
2017-05-29 11:19:49,246 ERROR [main]: mr.MapredLocalTask (MapredLocalTask.java:executeInProcess(354)) - Hive Runtime Error: Map local work exhausted memory
org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2017-05-29 11:19:49    Processing rows:    5100000 Hashtable size: 5099999 Memory usage:   1749870856  percentage: 0.917
    at org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.checkMemoryStatus(MapJoinMemoryExhaustionHandler.java:99)
    at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.processOp(HashTableSinkOperator.java:249)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:409)
    at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:380)
    at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeInProcess(MapredLocalTask.java:346)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:743)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

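I tried setting both the map memory and the reduce memory to 22000, but still no luck. After searching online, I found people suggesting the property hive.auto.convert.join = false to get past the error above, and with that setting my query started running.
I am not sure whether running the query this way costs anything in performance. Will the performance still be the same? Is there another way to solve this problem? Please give me some suggestions for improving the query's performance.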

Answer #1 (lb3vh1jj)

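The first and safest option is to set hive.auto.convert.join to false. This costs some performance, because you no longer benefit from map joins. How large that trade-off is depends entirely on your use case and the size of your data.
Another option is hive.auto.convert.join.noconditionaltask.size, which, according to https://cwiki.apache.org/confluence/display/hive/languagemanual+joinoptimization, lets the user control what size of table can fit in memory. Finding a suitable threshold can be a challenge.
P.S. Keep in mind that for hive.auto.convert.join.noconditionaltask.size to take effect, hive.auto.convert.join.noconditionaltask needs to be true (it is true by default).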
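
As a rough sketch of the two options described above, the session-level settings might look like the following. The size value is only illustrative (an assumption, not a tuned recommendation), and the EXPLAIN statement simply reuses the query from the question to check which join strategy Hive picks.

```sql
-- Use either Option 1 or Option 2 in a session, not both.

-- Option 1: turn off automatic map-join conversion, so the join
-- runs as a regular shuffle join instead of building a local hash table.
SET hive.auto.convert.join=false;

-- Option 2 (instead of Option 1): keep map-join conversion enabled,
-- but control how large a table may be before it is treated as "small".
SET hive.auto.convert.join=true;
SET hive.auto.convert.join.noconditionaltask=true;
-- Size is in bytes; 10000000 is an illustrative value, adjust for your data.
SET hive.auto.convert.join.noconditionaltask.size=10000000;

-- Inspect the plan: a "Map Join Operator" in the output means the
-- join is still being converted to a map join under the current settings.
EXPLAIN
select t1.num as num, t1.product as Product, t2.value as OldValue, t1.value as NewValue
from test_new t1
LEFT OUTER JOIN test_old t2
  ON t1.num=t2.num and t1.product=t2.product
where t2.value is NULL and t1.value is not NULL or t1.value<>t2.value;
```

Note that in the log from the question the hash table for t2 had grown to about 1.7 GB by 5.1 million rows. The in-memory hash table is probably much larger than the ORC files on disk, so a threshold based on table size may need to be set conservatively.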
