Apache Hadoop 2.6: Java heap space error

sczxawaw, posted on 2021-06-02 in Hadoop

I am getting the following error:

15/04/27 09:28:04 INFO mapred.LocalJobRunner: map task executor complete.
15/04/27 09:28:04 WARN mapred.LocalJobRunner: job_local1576000334_0001
java.lang.Exception: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:983)
    at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:401)
    at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:81)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:695)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
15/04/27 09:28:05 INFO mapreduce.Job: Job job_local1576000334_0001 failed    with state FAILED due to: NA
15/04/27 09:28:05 INFO mapreduce.Job: Counters: 0
15/04/27 09:28:05 INFO terasort.TeraSort: done

I am using Apache Hadoop 2.6 with the following configuration.
MapReduce configuration (mapred-site.xml):

<configuration>

<property>
<name>mapred.job.tracker</name>
<value>n1:54311</value>
</property>

<property>
<name>mapreduce.local.dir</name>
<value>/home/hadoop/hadoop/maptlogs</value>
</property>

<property>
<name>mapreduce.map.tasks</name>
<value>32</value>
</property>

<property>
<name>mapreduce.reduce.tasks</name>
<value>10</value>
</property>

<property>
<name>mapred.child.java.opts</name>
<value>-Xmx1024m</value>
</property>

<property>
<name>mapreduce.task.io.sort.mb</name>
<value>256</value>
<description>Added 04/27 @ 10:09am for testing</description>
</property>

</configuration>

and yarn-site.xml:

<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>n1:8025</value>
</property>

<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>n1:8030</value>
</property>

<property>
<name>yarn.resourcemanager.address</name>
<value>n1:8050</value>
</property>

<property>
<name>yarn.nodemanager.disk-health-checker.enable</name>
<value>false</value>
</property>

<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>4096</value>
<description>Minimum limit of memory to allocate to each container request at the Resource Manager.</description>
</property>

<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>8192</value>
<description>Maximum limit of memory to allocate to each container request at the Resource Manager.</description>
</property>

<property>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
<description>The minimum allocation for every container request at the RM, in terms of virtual CPU cores. Requests lower than this won't take effect, and the specified value will get allocated the minimum.</description>
</property>

<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>2</value>
<description>The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this won't take effect, and will get capped to this value.</description>
</property>

<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>96000</value>
<description>Physical memory, in MB, to be made available to running containers</description>
</property>

<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>32</value>
<description>Number of CPU cores that can be allocated for containers.</description>
</property>

I also added a Linux 90-nproc.conf file, as follows:


* soft    nproc     20000

root       soft    nproc     unlimited

* soft    nofile    20000
* hard    nofile    20000

root       soft    nofile    20000
root       hard    nofile    20000

However, I still get a Java heap space error when running TeraSort.
I have no problems with TeraGen.
The environment is:
Red Hat 6.6
Kernel 3.18
11 machines
1 NameNode
10 DataNodes
Apache Hadoop 2.6

ttisahbt  #1

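The memory limits specified in mapred-site.xml must be lower than the memory settings in yarn-site.xml, which in turn are calculated from the available system resources. I use a script to collect the system details and generate my core-site.xml, mapred-site.xml, hdfs-site.xml, and yarn-site.xml configuration.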
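
The answer does not include the script itself. Purely as an illustration of the idea, the sketch below derives a mutually consistent set of memory values from a node's total RAM and core count. The function name `derive_settings`, the 8 GB reserved for the operating system, the 80% heap ratio, and the sort-buffer rule are all assumptions made for this example, not the author's method; only the property names are taken from Hadoop 2.x.

```python
def derive_settings(total_mem_mb, cores, reserved_mb=8192, heap_percent=80):
    """Return Hadoop memory properties sized from node RAM and core count.

    All sizing rules here are illustrative assumptions, not official guidance.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if total_mem_mb - reserved_mb < 1024:
        raise ValueError("not enough memory left for YARN containers")

    # Memory handed to YARN after reserving some for the OS and daemons.
    node_mb = total_mem_mb - reserved_mb
    # One map container per core, rounded down to a multiple of 512 MB.
    per_core_mb = node_mb // cores // 512 * 512
    map_mb = min(node_mb, max(1024, per_core_mb))
    # Reducers get twice the map container, capped at the node total.
    reduce_mb = min(node_mb, 2 * map_mb)
    # The JVM heap stays below the container size to leave non-heap headroom.
    map_heap = map_mb * heap_percent // 100
    reduce_heap = reduce_mb * heap_percent // 100
    # The sort buffer lives inside the map heap, so keep it well below it.
    sort_mb = min(map_heap // 4, 1024)

    return {
        "yarn.nodemanager.resource.memory-mb": node_mb,
        "yarn.scheduler.minimum-allocation-mb": map_mb,
        "yarn.scheduler.maximum-allocation-mb": node_mb,
        "mapreduce.map.memory.mb": map_mb,
        "mapreduce.reduce.memory.mb": reduce_mb,
        "mapreduce.map.java.opts": f"-Xmx{map_heap}m",
        "mapreduce.reduce.java.opts": f"-Xmx{reduce_heap}m",
        "mapreduce.task.io.sort.mb": sort_mb,
    }


if __name__ == "__main__":
    # Hypothetical node: 96 GB of RAM and 32 cores.
    for name, value in derive_settings(98304, 32).items():
        print(f"<property><name>{name}</name><value>{value}</value></property>")
```

The point of generating every value from the same inputs is that the MapReduce sizes cannot drift above the YARN limits when the hardware changes.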
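Note: MapReduce runs on top of YARN, so always keep the MapReduce memory settings below the limits in yarn-site.xml. With the automatic configuration script for Apache Hadoop 2.6, I can now run a MapReduce job on 6 machines with a 1 TB TeraGen data set in 4 minutes 57 seconds.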
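
The ordering described in the note can be checked mechanically before a job is submitted. The following is a minimal sketch, assuming the intended chain is sort buffer < JVM heap < container size <= YARN maximum allocation <= NodeManager memory; the helper names `heap_mb` and `check_memory` are hypothetical and not part of Hadoop.

```python
import re


def heap_mb(java_opts):
    """Extract the -Xmx value, in MB, from a JVM options string."""
    match = re.search(r"-Xmx(\d+)([mMgG])", java_opts)
    if match is None:
        raise ValueError(f"no -Xmx value with an m or g suffix in {java_opts!r}")
    value = int(match.group(1))
    return value * 1024 if match.group(2).lower() == "g" else value


def check_memory(sort_mb, java_opts, container_mb, max_allocation_mb, node_mb):
    """Return a list of human-readable problems; an empty list means consistent."""
    heap = heap_mb(java_opts)
    problems = []
    if sort_mb >= heap:
        problems.append(f"sort buffer ({sort_mb} MB) does not fit in heap ({heap} MB)")
    if heap >= container_mb:
        problems.append(f"heap ({heap} MB) is not below container ({container_mb} MB)")
    if container_mb > max_allocation_mb:
        problems.append(
            f"container ({container_mb} MB) exceeds YARN maximum ({max_allocation_mb} MB)"
        )
    if max_allocation_mb > node_mb:
        problems.append(
            f"YARN maximum ({max_allocation_mb} MB) exceeds node memory ({node_mb} MB)"
        )
    return problems


if __name__ == "__main__":
    # Values from the question, plus an assumed 4096 MB map container.
    print(check_memory(256, "-Xmx1024m", 4096, 8192, 96000))
    # A heap as large as its container is reported as a problem.
    print(check_memory(256, "-Xmx4g", 4096, 8192, 96000))
```

This check covers only jobs that run in YARN containers. The stack trace in the question passes through LocalJobRunner, which suggests the job may have run in local mode; in that case the tasks share the client JVM's heap and the container settings would not apply.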
I was very surprised by Apache Hadoop's performance.
