Hadoop MapReduce IndexOutOfBoundsException

d6kp6zgx asked on 2021-06-02 in Hadoop
Follow (0) | Answers (2) | Views (472)

My program runs fine on smaller inputs, but when I increase the input size, line 210 (context.nextKeyValue();) seems to throw an IndexOutOfBoundsException. Below is the Mapper's setup method. I call nextKeyValue() once there because the first line of every file is a header; splitting of the input files is set to false because of that header. Could this be memory-related? How can I solve it?
Also, the error message below is shown 68 times, even though I have set the max map attempts to 3. By the way, there are 55 splits in total. Shouldn't it be shown 55 times, or 55*3, or just 3? How does this work?

@Override
protected void setup(Context context) throws IOException, InterruptedException
{
    Configuration conf = context.getConfiguration();
    DupleSplit fileSplit = (DupleSplit) context.getInputSplit();
    //first line is header. Indicates the first digit of the solution.
    context.nextKeyValue(); // <---- LINE 210
    URI[] uris = context.getCacheFiles();

    int num_of_colors = Integer.parseInt(conf.get("num_of_colors"));
    int order = fileSplit.get_order();
    int first_digit = Integer.parseInt(context.getCurrentValue().toString());

    //perm_path = conf.get(Integer.toString(num_of_colors - order - 1));
    int offset = Integer.parseInt(conf.get(Integer.toString(num_of_colors - order - 1)));
    uri = uris[offset]; // uri and perm_name are instance fields declared elsewhere in the class
    Path perm_path = new Path(uri.getPath());
    perm_name = perm_path.getName().toString();

    String pair_variables = "";
    for (int i = 1; i <= num_of_colors; i++)
        pair_variables += "X_" + i + "_" + (num_of_colors - order) + "\t";
    for (int i = 1; i < num_of_colors; i++)
        pair_variables += "X_" + i + "_" + (num_of_colors - order - first_digit) + "\t";
    pair_variables += "X_" + num_of_colors + "_" + (num_of_colors - order - first_digit);
    context.write(new Text(pair_variables), null);
}
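
For context, disabling splitting as described above is usually done by overriding isSplitable() in the input format. A minimal sketch, assuming a TextInputFormat-based format (the actual format that produces DupleSplit is custom and not shown here, so the class below is hypothetical):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

// Hypothetical input format: keeps each file in a single split, so every
// mapper sees the header line as its first record.
public class WholeFileTextInputFormat extends TextInputFormat {
    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        return false;
    }
}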

The error log is as follows:

Error: java.lang.IndexOutOfBoundsException
at java.nio.Buffer.checkBounds(Buffer.java:559)
at java.nio.ByteBuffer.get(ByteBuffer.java:668)
at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:279)
at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:168)
at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:775)
at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:831)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:891)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
at java.io.DataInputStream.read(DataInputStream.java:149)
at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.fillBuffer(UncompressedSplitLineReader.java:59)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.readLine(UncompressedSplitLineReader.java:91)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.skipUtfByteOrderMark(LineRecordReader.java:144)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:184)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at produce_data_hdfs$input_mapper.setup(produce_data_hdfs.java:210)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

Answer 1 (by 5jdjgkvh)

I have never seen this method being called before, and it seems you don't even need it, since you don't store its result in any variable.
Why not just skip the first key/value pair inside the map() method instead? You can do that easily by keeping a counter, initializing it to 0 in setup() and incrementing it at the beginning of map(); then, when the counter equals 1, skip the map computation:

private int counter;

@Override
protected void setup(Context context) throws IOException, InterruptedException {
    counter = 0;
    // ... the rest of your existing setup code
}

@Override
protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    if (++counter == 1) {
        return; // first record is the file header, so skip it
    }
    // ... your existing map code goes here
}
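
Note that this only works reliably because splitting is disabled in your job, so each mapper reads a whole file starting at its header line; with splitting enabled, only the mapper that processes a file's first split would ever see the header.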

The error message is probably shown 68 times because it is shown once for each map task that could run concurrently (as many as there are map slots available in your cluster), and those tasks are then re-executed (twice each) until some of them fail, which fails the whole job (there is a threshold on how many tasks may fail before the entire job is declared failed).
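
For reference, both knobs involved here can be set on the job's Configuration. A minimal sketch, assuming Hadoop 2.x property names (mapreduce.map.maxattempts is the number of attempts per task; mapreduce.map.failures.maxpercent is the share of map tasks allowed to fail before the job itself fails); the job name is only an example taken from the stack trace above:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class JobSetup {
    public static Job createJob() throws IOException {
        Configuration conf = new Configuration();
        // Each map task gets at most this many attempts before it counts as failed (default 4).
        conf.setInt("mapreduce.map.maxattempts", 3);
        // Percentage of map tasks that may fail before the whole job is declared failed (default 0).
        conf.setInt("mapreduce.map.failures.maxpercent", 0);
        return Job.getInstance(conf, "produce_data_hdfs");
    }
}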


Answer 2 (by j8yoct9x)

I know this is a few years late, but for anyone seeing this: Hadoop 2.6 had an unsafe long-to-int conversion that caused this IndexOutOfBoundsException in many cases (note that the stack trace above goes through UncompressedSplitLineReader.fillBuffer, the code path that fix covers). I believe the patch was released with version 2.7.3. You can read about it at https://issues.apache.org/jira/browse/mapreduce-6635 (MAPREDUCE-6635). I hope this helps anyone running into this problem.
