Has anyone overridden the Mapper run(context) method in their implementation?

Posted by guz6ccqo on 2021-06-01 in Hadoop


https://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/mapper.html#method.summary
The run(context) method of org.apache.hadoop.mapreduce.Mapper

a). Expert users can override this method for more complete control over the execution of the Mapper.

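What is the default behavior of the current run(context) method?
If I override run(context), what kind of special control do I get, according to the documentation?
Has anyone overridden this method in their implementation?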

hadoop Mapper

Source: https://stackoverflow.com/questions/44637869/is-anyone-overridden-mapper-runcontext-method-in-your-implementations

1 Answer


mnowg1ta1#

What is the default behavior of the current run(context) method?
The default implementation is visible in the Apache Hadoop source code for the Mapper class:

/**
 * Expert users can override this method for more complete control over the
 * execution of the Mapper.
 * @param context
 * @throws IOException
 */
public void run(Context context) throws IOException, InterruptedException {
  setup(context);
  try {
    while (context.nextKeyValue()) {
      map(context.getCurrentKey(), context.getCurrentValue(), context);
    }
  } finally {
    cleanup(context);
  }
}

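To summarize:
Call setup for one-time initialization.
Iterate over all key-value pairs in the input.
Pass each key and value to the map method implementation.
Call cleanup for one-time teardown.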
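If I override run(context), what kind of special control do I get, according to the documentation?
The default implementation always follows one specific execution sequence in a single thread. Overriding it is rare, but doing so can open up possibilities for highly specialized implementations, such as a different threading model or an attempt to coalesce redundant key ranges.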
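To make the kind of control concrete, here is a minimal sketch of an override that stops reading input after a fixed number of records. It deliberately does not depend on Hadoop: SketchContext, SketchMapper, LimitedMapper and RunOverrideSketch are hypothetical stand-ins invented for this illustration, and only the setup/map/cleanup/run lifecycle is modeled on the default run(context) quoted above. In a real job the same loop would presumably live in a subclass of org.apache.hadoop.mapreduce.Mapper and use context.nextKeyValue().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for Mapper.Context: feeds records and collects output.
class SketchContext {
    private final Iterator<String> input;
    private String current;
    final List<String> output = new ArrayList<>();

    SketchContext(List<String> records) {
        this.input = records.iterator();
    }

    // Advances to the next record, playing the role of Context.nextKeyValue().
    boolean nextValue() {
        if (!input.hasNext()) {
            return false;
        }
        current = input.next();
        return true;
    }

    String getCurrentValue() {
        return current;
    }

    void write(String value) {
        output.add(value);
    }
}

// Hypothetical stand-in for Mapper with the same lifecycle methods.
class SketchMapper {
    protected void setup(SketchContext context) { context.write("setup"); }
    protected void map(String value, SketchContext context) { context.write(value.toUpperCase()); }
    protected void cleanup(SketchContext context) { context.write("cleanup"); }

    // Same shape as the default run(context): setup, loop over input, cleanup.
    public void run(SketchContext context) {
        setup(context);
        try {
            while (context.nextValue()) {
                map(context.getCurrentValue(), context);
            }
        } finally {
            cleanup(context);
        }
    }
}

// Overrides run to process at most `limit` records, e.g. for sampling.
class LimitedMapper extends SketchMapper {
    private final int limit;

    LimitedMapper(int limit) {
        this.limit = limit;
    }

    @Override
    public void run(SketchContext context) {
        setup(context);
        try {
            int seen = 0;
            // Check the limit first so no extra record is pulled from the input.
            while (seen < limit && context.nextValue()) {
                map(context.getCurrentValue(), context);
                seen++;
            }
        } finally {
            cleanup(context); // still runs exactly once, as in the default
        }
    }
}

class RunOverrideSketch {
    static List<String> runWith(SketchMapper mapper, List<String> records) {
        SketchContext context = new SketchContext(records);
        mapper.run(context);
        return context.output;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", "b", "c");
        System.out.println(runWith(new SketchMapper(), records));  // [setup, A, B, C, cleanup]
        System.out.println(runWith(new LimitedMapper(2), records)); // [setup, A, B, cleanup]
    }
}
```

The override keeps the setup/try/finally skeleton of the default so that cleanup is still guaranteed to run; only the loop condition changes.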
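Has anyone overridden this method in their implementation?
In the Apache Hadoop codebase, there are two overrides:
ChainMapper allows multiple Mapper class implementations to be chained and executed within a single map task. Its run override sets up an object representing the chain and passes each input key/value pair through that chain of mappers.
MultithreadedMapper allows multi-threaded execution of another Mapper class. That Mapper class must be thread-safe. Its run override starts multiple threads that iterate over the input key-value pairs and pass them to the underlying Mapper.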
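The multi-threaded idea can be sketched in plain Java as follows. This is not the source of MultithreadedMapper: ThreadedRunSketch and its run method are hypothetical names for this illustration, and the sketch only shows the general shape of several worker threads draining one shared input and calling a map function that must therefore be thread-safe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch, not Hadoop code: several threads share one input source.
class ThreadedRunSketch {

    // Drains `records` with `threads` workers, applying `mapFn` to each record.
    // `mapFn` must be thread-safe because the workers call it concurrently.
    static List<String> run(List<String> records, int threads, Function<String, String> mapFn) {
        Iterator<String> input = records.iterator();
        List<String> output = Collections.synchronizedList(new ArrayList<>());

        Runnable worker = () -> {
            while (true) {
                String record;
                // Only one thread at a time may advance the shared input.
                synchronized (input) {
                    if (!input.hasNext()) {
                        return;
                    }
                    record = input.next();
                }
                // The map call itself runs outside the lock, in parallel.
                output.add(mapFn.apply(record));
            }
        };

        List<Thread> pool = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(worker);
            pool.add(t);
            t.start();
        }
        for (Thread t : pool) {
            try {
                t.join(); // wait until every worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> result = run(Arrays.asList("a", "b", "c", "d"), 2, String::toUpperCase);
        Collections.sort(result); // output order depends on thread scheduling
        System.out.println(result);
    }
}
```

Because the order in which workers finish is not deterministic, the example sorts the output before printing; a mapper run this way cannot rely on seeing records in input order.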

