Here are two exciting and significant additions to the Hadoop framework:
• HDFS Federation: provides a name service that is both scalable and reliable.
• YARN (Yet Another Resource Negotiator): splits the two major functions of the JobTracker, resource management and job life-cycle management, into separate components.
Here are some articles on the topic:
http://blog.cloudera.com/blog/2012/02/mapreduce-2-0-in-hadoop-0-23/
http://hortonworks.com/blog/introducing-apache-hadoop-yarn/
Hadoop 1.x is all about MapReduce, meaning you can run only MapReduce jobs, whereas YARN is more general than MR, and it should be possible to run other computing models, such as BSP, alongside MR. Prior to YARN, MR, BSP and other models each required a separate cluster. Now they can coexist in a single cluster, which leads to higher cluster utilization. A number of applications have already been ported to YARN.
In the current system, the JobTracker views the cluster as composed of nodes (managed by individual TaskTrackers) with distinct map slots and reduce slots, which are not fungible. Utilization issues occur because map slots might be ‘full’ while reduce slots are empty (and vice versa). Fixing this was necessary so that the entire cluster could be used to its full capacity.
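To make the slot problem concrete, here is a rough sketch that compares utilization when slots are typed (the MR1 model described above) with a pool where any slot can run any task. The class and function names are made up for illustration, and the pooled model is a simplification: YARN actually hands out containers sized by resource requests rather than generic slots.

```python
from dataclasses import dataclass


@dataclass
class SlotNode:
    """A Hadoop 1.x-style node with fixed, non-interchangeable slots (hypothetical model)."""
    map_slots: int
    reduce_slots: int


def slot_utilization(nodes, pending_maps, pending_reduces):
    """Fraction of all slots in use when map and reduce slots are not fungible."""
    total_map = sum(n.map_slots for n in nodes)
    total_reduce = sum(n.reduce_slots for n in nodes)
    # Map tasks can only fill map slots, reduce tasks only reduce slots.
    used = min(pending_maps, total_map) + min(pending_reduces, total_reduce)
    return used / (total_map + total_reduce)


def pooled_utilization(nodes, pending_maps, pending_reduces):
    """Fraction in use if every slot could run either kind of task.

    Simplifying assumption: one generic slot per task, ignoring memory/CPU sizing.
    """
    total = sum(n.map_slots + n.reduce_slots for n in nodes)
    used = min(pending_maps + pending_reduces, total)
    return used / total


if __name__ == "__main__":
    cluster = [SlotNode(map_slots=4, reduce_slots=2) for _ in range(10)]
    # Map-heavy phase: 100 map tasks waiting, no reduce tasks yet.
    print(slot_utilization(cluster, 100, 0))
    print(pooled_utilization(cluster, 100, 0))
```

In the map-heavy phase of this toy cluster, the typed model leaves all 20 reduce slots idle even though map tasks are still queued, while the pooled model can keep every slot busy.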
Also, YARN makes it possible to run different versions of Hadoop in the same cluster, which is not possible with legacy MR, and this makes maintenance easier.
2 answers
dgiusagp #1
I think the main difficulty with MR1 is that algorithms requiring globally shared state are hard to implement.
r55awzrz #2
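A key problem in Hadoop 1.x was providing a highly available NameNode. HDFS Federation not only provides an HA name service but also allows the workload to be distributed, because NameNodes can now scale horizontally.
YARN provides a logical separation of responsibilities for negotiating and executing jobs in a Hadoop cluster: it is a general-purpose resource management framework that can be used for more than just MapReduce jobs.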
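As a rough illustration of the horizontal scaling idea, the sketch below routes each path to one of several independent NameNodes by longest matching prefix, similar in spirit to a client-side mount table. The mount points, NameNode names and the `resolve_namenode` helper are all hypothetical and are not part of any Hadoop API.

```python
# Hypothetical mount table: path prefix -> NameNode that owns that namespace.
MOUNT_TABLE = {
    "/user": "namenode-1",
    "/data": "namenode-2",
    "/tmp": "namenode-3",
}


def resolve_namenode(path, mount_table=MOUNT_TABLE):
    """Return the NameNode owning the longest mount prefix that contains path."""
    best = None
    for prefix in mount_table:
        # Match the mount point itself or anything below it, but not
        # sibling directories that merely share leading characters.
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        raise KeyError(f"no namespace mounted for {path}")
    return mount_table[best]


if __name__ == "__main__":
    print(resolve_namenode("/user/alice/report.csv"))
    print(resolve_namenode("/data/logs/2012/02"))
```

Because each NameNode serves only its own slice of the namespace, metadata load is spread across them, and adding a mount point backed by a new NameNode adds capacity.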