Hadoop cluster stuck at reduce > copy >

uxh89sit · posted 2021-06-03 in Hadoop

So far I have tried the solutions from here (1) and here (2). While those solutions do get the MapReduce job to run, it appears to execute only on the name node, since the output I get looks like (3).
Basically, I am running a two-node cluster with a MapReduce algorithm of my own design. The MapReduce JAR executes perfectly on a single-node cluster, which makes me think something is wrong with my Hadoop multi-node configuration. To set up the multi-node cluster, I followed the tutorial here.
To report what actually goes wrong: when I execute the program (after checking that the NameNode, TaskTrackers, JobTracker and DataNodes are running on their respective nodes), it stalls in the terminal at this line:

INFO mapred.JobClient: map 100% reduce 0%

Looking at the logs of the task, I see

copy failed: attempt... from slave-node

followed by a SocketTimeoutException.
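A minimal sketch of that daemon check, using jps and assuming a Hadoop 1.x layout like the tutorial's:

# on the master: expect NameNode, SecondaryNameNode, JobTracker,
# plus DataNode and TaskTracker if the master also acts as a slave
jps

# on the slave: expect DataNode and TaskTracker
jps
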
Looking at the logs on my slave node (DataNode), execution stops at this line:

TaskTracker: attempt... 0.0% reduce > copy >

As the solutions in links 1 and 2 suggest, removing entries from the etc/hosts file does lead to a successful run, but then I end up with entries like those in link 4 in the slave node (DataNode) log, for example:
INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_201201301055_0381
WARN org.apache.hadoop.mapred.TaskTracker: Unknown job job_201201301055_0381 being deleted.

As a new Hadoop user I find this suspicious, though it may well be normal. To me it looks as if something is pointing at an incorrect IP address in the hosts file, and by removing that IP address I merely stop execution on the slave node while processing continues on the name node (which is not helpful at all).
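One way to sanity-check the name resolution suspected here, a sketch assuming the master/slave hostnames from the hosts files below, is to run on each node:

# what this machine believes its own hostname is
hostname

# how the cluster names (and the local hostname) resolve via /etc/hosts
getent hosts master slave $(hostname)

# whether the nodes can actually reach each other by name
ping -c 1 master
ping -c 1 slave
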
To summarize:
Is this output expected?
Is there a way to see, after a run, what was executed on which node? (A sketch of one way to check this follows below.)
Can anyone see what I am doing wrong?
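(For question 2, a hedged sketch of what can be inspected on Hadoop 1.x, assuming the default ports and log locations:)

# list job IDs known to the JobTracker
hadoop job -list all

# per-attempt logs live on the node that actually ran the attempt
ls $HADOOP_HOME/logs/userlogs/

# the JobTracker web UI at http://master:50030 also shows which
# TaskTracker each map/reduce attempt was assigned to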

Edit: added the hosts and configuration files for each node

Master: etc/hosts

127.0.0.1       localhost
127.0.1.1       joseph-Dell-System-XPS-L702X

# The following lines are for hadoop master/slave setup

192.168.1.87    master
192.168.1.74    slave

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

Slave: etc/hosts

127.0.0.1       localhost
127.0.1.1       joseph-Home # this line was incorrect, it was set as 7.0.1.1

# the following lines are for hadoop mutli-node cluster setup

192.168.1.87    master
192.168.1.74    slave

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

Master: core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hduser/tmp</value>
    <description>A base for other temporary directories.</description>
</property>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://master:54310</value>
        <description>The name of the default file system. A URI whose
        scheme and authority determine the FileSystem implementation. The
        uri’s scheme determines the config property (fs.SCHEME.impl) naming
        the FileSystem implementation class. The uri’s authority is used to
        determine the host, port, etc. for a filesystem.</description>
    </property>
</configuration>

Slave: core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hduser/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>

    <property>
        <name>fs.default.name</name>
        <value>hdfs://master:54310</value>
        <description>The name of the default file system. A URI whose
        scheme and authority determine the FileSystem implementation. The
        uri’s scheme determines the config property (fs.SCHEME.impl) naming
        the FileSystem implementation class. The uri’s authority is used to
        determine the host, port, etc. for a filesystem.</description>
    </property>

</configuration>
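
Since both nodes point fs.default.name at hdfs://master:54310, a quick connectivity check from the slave can rule out a blocked NameNode port. A sketch, assuming netcat is installed:

# run on the slave: is the NameNode RPC port reachable?
nc -zv master 54310

# and can the slave talk to HDFS at all?
hadoop fs -ls /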

Master: hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
        <description>Default block replication.
        The actual number of replications can be specified when the file is created.
        The default is used if replication is not specified in create time.
        </description>
    </property>
</configuration>

Slave: hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
        <description>Default block replication.
        The actual number of replications can be specified when the file is created.
        The default is used if replication is not specified in create time.
        </description>
    </property>
</configuration>
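
With dfs.replication set to 2, both DataNodes have to be registered with the NameNode. A sketch of how to confirm that with the standard Hadoop 1.x admin report:

# run on the master: the report should list 2 live datanodes,
# one at 192.168.1.87 and one at 192.168.1.74
hadoop dfsadmin -report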

Master: mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>master:54311</value>
        <description>The host and port that the MapReduce job tracker runs
        at. If “local”, then jobs are run in-process as a single map
        and reduce task.
        </description>
    </property>
</configuration>

Slave: mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

    <property>
        <name>mapred.job.tracker</name>
        <value>master:54311</value>
        <description>The host and port that the MapReduce job tracker runs
        at. If “local”, then jobs are run in-process as a single map
        and reduce task.
        </description>
    </property>

</configuration>
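
Because the job dies in reduce > copy, where reducers fetch map output over HTTP from the TaskTracker on the other node, it is worth confirming that both TaskTrackers registered with the JobTracker and that the shuffle port is reachable in both directions. A sketch, assuming the Hadoop 1.x default TaskTracker HTTP port 50060:

# run on the master: both trackers should be listed
hadoop job -list-active-trackers

# run on each node against the other one
nc -zv master 50060
nc -zv slave 50060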
kmpatx3s 1#

I ran into this problem today as well. In my case, the disk on one node in the cluster was full, so Hadoop could not write its log files to the local disk. The problem was solved by deleting some unused files from that local disk. Hope it helps.
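A quick way to check for that condition, a sketch assuming the logs and the hadoop.tmp.dir from the question (/home/hduser/tmp) sit on the local filesystem:

# look for any filesystem at or close to 100% use
df -h

# the usual offenders are old logs and task temp data
du -sh $HADOOP_HOME/logs /home/hduser/tmp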

k4aesqcs 2#

The error was in etc/hosts.
During the failing runs, the slave etc/hosts file looked like this:

127.0.0.1       localhost
7.0.1.1       joseph-Home # THIS LINE IS INCORRECT, IT SHOULD BE 127.0.1.1

# the following lines are for hadoop mutli-node cluster setup

192.168.1.87    master
192.168.1.74    slave

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

As you may have spotted, the IP address of this machine, 'joseph-Home', was configured incorrectly: it was set to 7.0.1.1 when it should have been 127.0.1.1. Changing line 2 of the slave etc/hosts file to 127.0.1.1 joseph-Home fixed the problem, and my logs now show up on the slave node as expected.
The new etc/hosts file:

127.0.0.1       localhost
127.0.1.1       joseph-Home # corrected, was 7.0.1.1

# the following lines are for hadoop mutli-node cluster setup

192.168.1.87    master
192.168.1.74    slave

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
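
After correcting the hosts entry, the daemons on both nodes need to pick up the change, so the cluster has to be restarted from the master. A sketch, assuming the standard Hadoop 1.x control scripts under $HADOOP_HOME/bin:

# run on the master
$HADOOP_HOME/bin/stop-all.sh
$HADOOP_HOME/bin/start-all.sh

# then confirm the names resolve as intended on the slave:
# joseph-Home -> 127.0.1.1, slave -> 192.168.1.74
getent hosts joseph-Home slave
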
n6lpvg4x 3#

The solution that worked in my tests was to add the following to hadoop-env.sh and restart all the Hadoop cluster services.

hadoop-env.sh:

export HADOOP_CLIENT_OPTS="-Xmx2048m $HADOOP_CLIENT_OPTS"
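hadoop-env.sh is sourced by bin/hadoop, so after the restart client-side commands start their JVM with the extra heap. A sketch, where the jar name, driver class and paths are placeholders:

# this client JVM now runs with -Xmx2048m
hadoop jar my-mapreduce-job.jar MyDriver /input /output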
