On cdh3 (hadoop-0.20.2-cdh3u1) I also hit the error "Split metadata size exceeded 10000000". In my case the job has two inputs: inp1 (size = 1 GB) and inp2 (size = 7 MB).
When I use mapred.max.split.size=256mb, it throws the following error:
Job initialization failed: java.io.IOException: Split metadata size exceeded 10000000. Aborting job job_201412112225_1046114
    at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:48)
    at org.apache.hadoop.mapred.JobInProgress.createSplits(JobInProgress.java:814)
    at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:708)
    at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:4016)
    at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
When I change mapred.max.split.size to 8 MB, the job runs successfully, but it launches far too many mappers.
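To see why the smaller split size inflates the mapper count, here is a rough back-of-the-envelope sketch (an assumption: one map task per split, ignoring block boundaries and the small second input):

```python
def estimated_mappers(input_bytes, split_bytes):
    """Rough estimate of map task count: ceiling of input size / split size."""
    return -(-input_bytes // split_bytes)  # ceiling division

# 1 GB input with 8 MB splits vs. 256 MB splits
print(estimated_mappers(1 * 1024**3, 8 * 1024**2))    # -> 128 mappers
print(estimated_mappers(1 * 1024**3, 256 * 1024**2))  # -> 4 mappers
```

So dropping the split size from 256 MB to 8 MB multiplies the mapper count roughly 32x for the 1 GB input, which matches the "too many mappers" symptom above.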
The same job with the same configuration runs fine on cdh4.6.
Any hints/suggestions on resolving this?
1 Answer
For Cloudera, setting "mapreduce.jobtracker.split.metainfo.maxsize" to -1 does the trick. Alternatively, you may need to set "mapreduce.job.split.metainfo.maxsize" to -1, as appropriate. See https://hadoop.apache.org/docs/r2.4.1/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
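As a sketch, the property can go in mapred-site.xml on the JobTracker (assuming the older property name used by this CDH3 version; -1 disables the limit entirely, so the JobTracker will read split metadata of any size):

```xml
<!-- mapred-site.xml: disable the split metadata size limit (-1 = unlimited).
     Property name varies by version; older releases use
     mapreduce.jobtracker.split.metainfo.maxsize, newer ones
     mapreduce.job.split.metainfo.maxsize. -->
<property>
  <name>mapreduce.jobtracker.split.metainfo.maxsize</name>
  <value>-1</value>
</property>
```

Alternatively, a per-job override can be passed on the command line via the generic options parser, e.g. `hadoop jar myjob.jar -D mapreduce.job.split.metainfo.maxsize=-1 ...` (here `myjob.jar` is a placeholder). Note that disabling the limit trades safety for convenience: the limit exists to keep a job with a huge number of splits from overloading the JobTracker.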