Cannot stat '/user/hadoop/logs/datanode cluster

qvtsj1bj posted on 2021-05-27 in Hadoop

I am trying to run a multi-step job in which one of the steps is a script that uses PySpark/Apache Spark. I have a 4-node compute cluster with the Slurm job scheduler, and I would like to know how to run them together. Currently I have Spark on all nodes (the head node acts as the "master" and the remaining 3 compute nodes as "slaves") and Hadoop (the head node acts as the NameNode and Secondary NameNode, and the remaining 3 compute nodes as DataNodes). However, when I start Hadoop on the head node with start-all.sh, I only see one DataNode, and when I try to start it I get an error:

localhost: mv: cannot stat '/user/hadoop/logs/datanode-cluster-n1.out.4': No such file or directory
localhost: mv: cannot stat '/user/hadoop/logs/datanode-cluster-n1.out.3': No such file or directory
localhost: mv: cannot stat '/user/hadoop/logs/datanode-cluster-n1.out.2': No such file or directory
localhost: mv: cannot stat '/user/hadoop/logs/datanode-cluster-n1.out.1': No such file or directory
localhost: mv: cannot stat '/user/hadoop/logs/datanode-cluster-n1.out': No such file or directory

However, these files exist and appear to be readable and writable. Spark starts fine, and the 3 worker nodes can be launched from the head node. Because of the error above, the job fails with the same error when I submit it to Slurm. I would appreciate any advice on this problem and on my pipeline architecture.
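The mv messages themselves usually come from the Hadoop daemon scripts rotating the previous .out log files on restart (datanode-cluster-n1.out becomes .out.1 and so on), and they are typically harmless when some of the rotated copies do not exist yet. The more telling question is whether the DataNode processes actually start and register with the NameNode. A minimal diagnostic sketch, assuming the Hadoop binaries are on the PATH and passwordless SSH to the compute nodes (which start-all.sh requires anyway):

# On the head node: ask the NameNode which DataNodes are live.
hdfs dfsadmin -report | grep -E 'Live datanodes|^Name:'

# On each compute node: is a DataNode JVM actually running?
for node in cluster-n1 cluster-n2 cluster-n3; do
    echo "== $node =="
    ssh "$node" 'jps | grep -i datanode || echo "no DataNode process"'
done

# If a DataNode died at startup, its .log file (not the rotated .out
# files) usually contains the reason.
ssh cluster-n1 'tail -n 50 /user/hadoop/logs/*datanode*.log'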
Edit 1: Hadoop configuration files
core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://cluster-hn:9000</value>
  </property>
</configuration>
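Note that fs.default.name has been deprecated since Hadoop 2.x in favor of fs.defaultFS; the old key still works but triggers a deprecation warning. To confirm which NameNode URI the daemons actually resolve, a quick check (assuming the hdfs CLI is on the PATH):

hdfs getconf -confKey fs.defaultFS   # effective filesystem URI
hdfs getconf -namenodes              # NameNode host(s) as Hadoop sees them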

hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <!-- "dfs.permission" in the original is not a recognized key;
         the current name is dfs.permissions.enabled -->
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/s1/snagaraj/hadoop/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/s1/snagaraj/hadoop/dataNode</value>
  </property>
  <property>
    <name>dfs.https.port</name>
    <value>50470</value>
    <description>The https port where namenode binds</description>
  </property>
  <property>
    <name>dfs.socket.timeout</name>
    <value>0</value>
  </property>
</configuration>
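Since exactly one DataNode shows up, two things are worth verifying on every compute node: that dfs.datanode.data.dir and the log directory exist and are writable by the user running the daemons, and that the clusterID in each DataNode's VERSION file matches the NameNode's (a mismatch after reformatting the NameNode is a classic reason for DataNodes silently failing to join). A hedged check using the paths from the configuration above, assuming passwordless SSH:

# Run on the head node.
for node in cluster-n1 cluster-n2 cluster-n3; do
    echo "== $node =="
    ssh "$node" 'ls -ld /s1/snagaraj/hadoop/dataNode /user/hadoop/logs'
    ssh "$node" 'grep clusterID /s1/snagaraj/hadoop/dataNode/current/VERSION'
done
# The clusterID values above must match the NameNode's:
grep clusterID /s1/snagaraj/hadoop/name/current/VERSION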

workers file

localhost
cluster-n1
cluster-n2
cluster-n3
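
Two notes on the architecture question. First, the localhost entry makes the head node run a DataNode as well, which may be the single DataNode that does appear while the compute-node daemons fail to start. Second, a common way to tie the PySpark step into Slurm is a batch script that calls spark-submit against the standalone master on the head node. A sketch, assuming the master listens on Spark's standalone default port 7077 on cluster-hn and that step2.py is a placeholder name for the PySpark script:

#!/usr/bin/env bash
#SBATCH --job-name=pyspark-step
#SBATCH --nodes=1
#SBATCH --time=01:00:00

# Submit the PySpark step to the already-running standalone cluster;
# the executor sizing is hypothetical and should be tuned to the nodes.
# HDFS paths (hdfs://cluster-hn:9000/...) then resolve via the NameNode.
spark-submit \
    --master spark://cluster-hn:7077 \
    --deploy-mode client \
    --executor-memory 4G \
    step2.py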

No answers yet.

