Amazon Web Services - WordCount job hangs in Hadoop: compiles, is submitted, gets accepted, and never terminates

gupuwyp2, posted 2021-06-02 in Hadoop

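I have successfully configured a Hadoop cluster on AWS EC2, at least to the point where running the jps command on each type of node produces the following output: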

6544 ResourceManager
4305 JobHistoryServer
7004 Jps
6252 NameNode

Similarly:

2753 NodeManager
2614 DataNode
3051 Jps
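
The jps check above can be scripted so that it is easy to repeat on every node. The following is a minimal sketch, not part of the original post; the function name missing_daemons is hypothetical, and the sample input is the master-node listing from the question.

```shell
# Hypothetical helper: report which expected daemons are absent from jps output.
# Usage on a live node: jps | missing_daemons NameNode ResourceManager
missing_daemons() {
    local jps_output daemon
    jps_output=$(cat)                      # read the jps listing from stdin
    for daemon in "$@"; do
        # jps prints "<pid> <MainClass>"; compare the class name in column 2
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Sample run against the master-node listing from the question.
missing_daemons NameNode ResourceManager SecondaryNameNode <<'EOF'
6544 ResourceManager
4305 JobHistoryServer
7004 Jps
6252 NameNode
EOF
```

With this input the sketch prints SecondaryNameNode, since that name does not appear in the listing. Whether that daemon is expected on the master depends on the setup.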

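Following the standard Apache tutorial for creating a WordCount program, I completed all the prerequisite steps and compiled the Java class and the .jar as described there.
However, when I run the program with the following command: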

$HADOOP_HOME/bin/hadoop jar wc.jar WordCount /user/wordcount /user/output2

the job hangs, and I get the following output on my console:

The admin web UI shows the following information:

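Perhaps this has something to do with my YARN setup?
When creating this environment, I basically followed this tutorial.
Here is how I have set up my configuration files. yarn-site.xml: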

<configuration>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>128</value>
        <description>Minimum limit of memory to allocate to each container request at the Resource Manager.</description>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>2048</value>
        <description>Maximum limit of memory to allocate to each container request at the Resource Manager.</description>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-vcores</name>
        <value>1</value>
        <description>The minimum allocation for every container request at the RM, in terms of virtual CPU cores. Requests lower than this won't take effect, and the specified value will get allocated the minimum.</description>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-vcores</name>
        <value>2</value>
        <description>The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this won't take effect, and will get capped to this value.</description>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>4096</value>
        <description>Physical memory, in MB, to be made available to running containers</description>
    </property>
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>4</value>
        <description>Number of CPU cores that can be allocated for containers.</description>
    </property>
</configuration>
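
A job that stays in the ACCEPTED state is often waiting for a container for its ApplicationMaster, so a quick sanity check of the memory settings can help rule that out. The sketch below is not from the original post. It assumes the ApplicationMaster requests 1536 MB (the stock default of yarn.app.mapreduce.am.resource.mb) and that the scheduler rounds requests up to a multiple of the minimum allocation, which is how the Capacity Scheduler typically behaves; other schedulers may differ.

```shell
# Round a memory request (MB) up to the next multiple of the minimum allocation.
round_up_mb() {
    local request=$1 min=$2
    echo $(( (request + min - 1) / min * min ))
}

# Values from the yarn-site.xml above.
min_alloc=128
max_alloc=2048
node_mem=4096
# Assumption: the ApplicationMaster asks for 1536 MB (the stock default of
# yarn.app.mapreduce.am.resource.mb); adjust if mapred-site.xml overrides it.
am_request=1536

am_container=$(round_up_mb "$am_request" "$min_alloc")
echo "AM container size: ${am_container} MB"
if [ "$am_container" -le "$max_alloc" ] && [ "$am_container" -le "$node_mem" ]; then
    echo "AM fits: up to $(( node_mem / am_container )) such containers per NodeManager"
else
    echo "AM request exceeds the configured limits"
fi
```

Under these assumptions the request fits within the limits, so the memory settings alone would not explain the hang. It is then worth checking whether any NodeManager has actually registered with the ResourceManager.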
mapred-site.xml:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

# adding support for jre

export PATH=$PATH:$JAVA_HOME/jre/bin
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export CLASSPATH=$CLASSPATH:/usr/local/hadoop/lib/*:.

# trying to get datanode to work :/

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

export HADOOP_OPTS="$HADOOP_OPTS -Djava.security.egd=file:/dev/../dev/urandom"

Answer #1 (ukdjmx9f):

Make sure to delete everything in these folders:

/usr/local/hadoop_work/hdfs/namenode/
/usr/local/hadoop_work/hdfs/datanode
/usr/local/hadoop_work/hdfs/namesecondary

Usually it is enough to run rm -rf current/ in each of them.
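
The cleanup step can be sketched as a small helper. This is an illustration rather than the answerer's own script: the function name clean_hdfs_dirs is hypothetical, and the default base path is taken from the folders listed above. Removing these directories discards the existing HDFS metadata and blocks, so the NameNode normally has to be reformatted afterwards.

```shell
# Hypothetical helper: remove the "current" subdirectory from each HDFS storage
# directory under the given base (default taken from the paths listed above).
clean_hdfs_dirs() {
    local base=${1:-/usr/local/hadoop_work/hdfs} dir
    for dir in namenode datanode namesecondary; do
        if [ -d "$base/$dir/current" ]; then
            rm -rf "$base/$dir/current"
            echo "removed $base/$dir/current"
        fi
    done
}

# Typical sequence (left commented out because it destroys existing HDFS data):
#   stop-yarn.sh && stop-dfs.sh      # stop the daemons first
#   clean_hdfs_dirs                  # run on every node
#   hdfs namenode -format            # on the master only
#   start-dfs.sh && start-yarn.sh
```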
Then configure things accordingly.
yarn-site.xml

<configuration>
  <property>
     <name>yarn.nodemanager.aux-services</name>
     <value>mapreduce_shuffle</value>
  </property>
  <property>
     <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
     <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
</configuration>

It turns out that setting yarn.resourcemanager.hostname is important; this had me stuck for a while :/
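
A likely reason this property matters: its default value is 0.0.0.0, so NodeManagers on other machines cannot locate the ResourceManager, never register, and the cluster has no capacity to start the job. One way to check is to count the registered nodes with yarn node -list. The helper below is a sketch; the name registered_nodes is hypothetical, and it assumes the listing contains a line of the form "Total Nodes:<n>".

```shell
# Hypothetical helper: extract the node count from `yarn node -list` output.
# Assumes the listing contains a line of the form "Total Nodes:<n>".
registered_nodes() {
    sed -n 's/^Total Nodes:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage on the master:
#   n=$(yarn node -list 2>/dev/null | registered_nodes)
#   [ "${n:-0}" -gt 0 ] || echo "no NodeManagers registered; check yarn.resourcemanager.hostname"
```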
core-site.xml

<configuration>
  <property>
     <name>fs.defaultFS</name>
     <value>hdfs://master:9000</value>
  </property>
</configuration>

mapred-site.xml

<configuration>
  <property>
     <name>mapreduce.framework.name</name>
     <value>yarn</value>
  </property>
</configuration>

hdfs-site.xml

<configuration>
  <property>
     <name>dfs.replication</name>
     <value>1</value>
  </property>
  <property>
     <name>dfs.namenode.name.dir</name>
     <value>file:/usr/local/hadoop_work/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file:/usr/local/hadoop_work/hdfs/namesecondary</value>
  </property>
  <property>
     <name>dfs.datanode.data.dir</name>
     <value>file:/usr/local/hadoop_work/hdfs/datanode</value>
  </property>
  <property>
    <name>dfs.secondary.http.address</name>
    <value>172.31.46.85:50090</value>
  </property>
</configuration>

/etc/hosts

666.13.46.70  master
666.13.35.80  slave1
666.13.43.131 slave2
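
Because the configuration files refer to the master by name, every node needs to resolve these names consistently. The following sketch checks a hosts-format file for each name; the function name hosts_lookup is hypothetical, and the host names are the ones from the file above.

```shell
# Hypothetical helper: print the address that a hosts-format file maps to a name.
hosts_lookup() {
    local name=$1 file=${2:-/etc/hosts}
    awk -v n="$name" '
        /^[[:space:]]*#/ { next }                 # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
    ' "$file"
}

# Usage on each node (names taken from the hosts file above):
#   for h in master slave1 slave2; do
#       [ -n "$(hosts_lookup "$h")" ] || echo "$h is missing from /etc/hosts"
#   done
```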

Basically, this is what you want to see:

Running the command...
From a very simple tutorial:

hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount /input /output

For this example:

$HADOOP_HOME/bin/hadoop jar wc.jar WordCount /input /output
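
The full run, including staging the input and clearing a stale output directory, could look like the sketch below. It is an illustration, not part of the original answer: the function name run_wordcount and the local file name sample.txt are placeholders, and the commands assume hdfs and hadoop are on the PATH.

```shell
# Hypothetical wrapper around the commands above; the local file name and the
# HDFS paths are placeholders.
run_wordcount() {
    local jar=$1 main_class=$2 local_file=$3 in_dir=${4:-/input} out_dir=${5:-/output}
    hdfs dfs -mkdir -p "$in_dir" || return 1
    hdfs dfs -put -f "$local_file" "$in_dir" || return 1
    # MapReduce refuses to start if the output directory already exists.
    hdfs dfs -rm -r -f "$out_dir" || return 1
    hadoop jar "$jar" "$main_class" "$in_dir" "$out_dir" || return 1
    # Reducer output is written to part-r-* files.
    hdfs dfs -cat "$out_dir/part-r-*"
}

# Usage:
#   run_wordcount wc.jar WordCount sample.txt
# While a job is pending, another terminal can list applications stuck in ACCEPTED:
#   yarn application -list -appStates ACCEPTED
```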
