Container is running beyond physical memory limits

s5a0g9ez  asked on 2021-06-01  in Hadoop

I have a MapReduce job that processes 1.4 TB of data. While running it, I get the error below.
The number of input splits is 6444. Before starting the job, I set the following configuration:

conf.set("mapreduce.map.memory.mb", "8192");
conf.set("mapreduce.reduce.memory.mb", "8192");
conf.set("mapreduce.map.java.opts.max.heap", "8192");
conf.set("mapreduce.map.java.opts", "-Xmx8192m");
conf.set("mapreduce.reduce.java.opts", "-Xmx8192m");
conf.set("mapreduce.job.heap.memory-mb.ratio", "0.8");
conf.set("mapreduce.task.timeout", "21600000");

The error:

2018-05-18 00:50:36,595 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1524473936587_2969_m_004719_3: Container [pid=11510,containerID=container_1524473936587_2969_01_004894] is running beyond physical memory limits. Current usage: 8.1 GB of 8 GB physical memory used; 8.8 GB of 16.8 GB virtual memory used. Killing container.
    Dump of the process-tree for container_1524473936587_2969_01_004894 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 11560 11510 11510 11510 (java) 14960 2833 9460879360 2133706 /usr/lib/jvm/java-7-oracle-cloudera/bin/java
-Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx8192m -Djava.io.tmpdir=/sdk/7/yarn/nm/usercache/administrator/appcache/application_1524473936587_2969/container_1524473936587_2969_01_004894/tmp
-Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1524473936587_2969/container_1524473936587_2969_01_004894
-Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 10.106.79.75 41869 attempt_1524473936587_2969_m_004719_3 4894 
        |- 11510 11508 11510 11510 (bash) 0 0 11497472 679 /bin/bash -c /usr/lib/jvm/java-7-oracle-cloudera/bin/java
-Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  -Xmx8192m -Djava.io.tmpdir=/sdk/7/yarn/nm/usercache/administrator/appcache/application_1524473936587_2969/container_1524473936587_2969_01_004894/tmp
-Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1524473936587_2969/container_1524473936587_2969_01_004894
-Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 10.106.79.75 41869 attempt_1524473936587_2969_m_004719_3 4894 1>/var/log/hadoop-yarn/container/application_1524473936587_2969/container_1524473936587_2969_01_004894/stdout 2>/var/log/hadoop-yarn/container/application_1524473936587_2969/container_1524473936587_2969_01_004894/stderr

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
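(For reference, RSSMEM_USAGE in the process-tree dump above is reported in pages; assuming the usual 4 KiB page size, 2133706 pages × 4096 bytes ≈ 8.14 GiB, which matches the "8.1 GB of 8 GB physical memory used" that triggered the kill.)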

Any help would be greatly appreciated!

wa7juj8i #1

Setting mapreduce.map.memory.mb sets the physical memory size of the container the mapper runs in (mapreduce.reduce.memory.mb does the same for the reducer's container).
Make sure to adjust the heap values as well. In newer versions of YARN/MRv2 this can be adjusted automatically with the mapreduce.job.heap.memory-mb.ratio setting. It defaults to 0.8, so 80% of the container size is allocated as heap. Otherwise, adjust it manually with the mapreduce.map.java.opts.max.heap and mapreduce.reduce.java.opts.max.heap settings; a consistent configuration is sketched below.
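A minimal sketch of what a consistent configuration could look like, with the -Xmx heap capped at roughly 80% of the container (8192 MB × 0.8 ≈ 6553 MB) so the JVM's off-heap overhead still fits under the YARN physical-memory limit; the exact values are illustrative, not prescriptive:

// Container sizes requested from YARN; the NodeManager kills the task if its RSS exceeds these
conf.set("mapreduce.map.memory.mb", "8192");
conf.set("mapreduce.reduce.memory.mb", "8192");
// Heap at ~80% of the container leaves headroom for metaspace, thread stacks, and direct buffers
conf.set("mapreduce.map.java.opts", "-Xmx6553m");
conf.set("mapreduce.reduce.java.opts", "-Xmx6553m");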
BTW, I believe 1 GB is the default, which is quite low. I recommend reading the link below. It gives a good understanding of YARN and MR memory settings, how they relate to each other, and how to choose some baseline settings based on the size of the cluster nodes (disk, memory, and cores).
Reference: http://community.cloudera.com/t5/cloudera-manager-installation/error-is-running-beyond-physical-memory-limits/td-p/55173

1yjd4xko #2

Try setting the memory allocation limits:

SET yarn.scheduler.maximum-allocation-mb=16384;
SET yarn.scheduler.minimum-allocation-mb=8192;
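Note that these two properties take integer megabyte values and are cluster-wide scheduler limits read by the ResourceManager, so on most clusters they belong in yarn-site.xml (or your cluster manager's equivalent) rather than a per-job SET. A sketch with the same illustrative values:

<!-- yarn-site.xml: scheduler allocation bounds, in MB (illustrative values) -->
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>8192</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>16384</value>
</property>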

You can look up other YARN settings here: https://www.ibm.com/support/knowledgecenter/stxkqy_bda_shr/bl1bda_tuneyarn.htm
