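I am new to Ceph. I have a 5-node cluster (Ubuntu 14.04) with Hadoop (1.1.1) and Ceph (v0.87) installed, and I want to run some experiments with Hadoop on top of CephFS. With the normal Hadoop setup the wordcount example ran fine, and the Ceph cluster health is OK. But when I change the Hadoop configuration as described in the "Using Hadoop with CephFS" documentation (http://ceph.com/docs/master/cephfs/hadoop/), I get the following error (I have mounted CephFS with the kernel driver at /mnt/mycephfs):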
ceph@admin-node:/usr/local/hadoop-1.1.1$ bin/hadoop jar hadoop-examples-1.1.jar wordcount /mnt/mycephfs/wc-input /mnt/mycephfs/wc-output-425
15/04/14 20:47:00 INFO util.NativeCodeLoader: Loaded the native-hadoop library
15/04/14 20:47:00 INFO input.FileInputFormat: Total input paths to process : 1
15/04/14 20:47:00 WARN snappy.LoadSnappy: Snappy native library not loaded
15/04/14 20:47:01 INFO mapred.JobClient: Running job: job_201504142046_0001
15/04/14 20:47:02 INFO mapred.JobClient: map 0% reduce 0%
15/04/14 20:47:03 INFO mapred.JobClient: Task Id : attempt_201504142046_0001_m_000021_0, Status : FAILED
Error initializing attempt_201504142046_0001_m_000021_0:
java.io.FileNotFoundException: File file:/app/hadoop/tmp/mapred/system/job_201504142046_0001/jobToken does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4445)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1272)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1213)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2568)
at java.lang.Thread.run(Thread.java:745)
15/04/14 20:47:03 WARN mapred.JobClient: Error reading task outputhttp://node2:50060/tasklog?plaintext=true&attemptid=attempt_201504142046_0001_m_000021_0&filter=stdout
15/04/14 20:47:03 WARN mapred.JobClient: Error reading task outputhttp://node2:50060/tasklog?plaintext=true&attemptid=attempt_201504142046_0001_m_000021_0&filter=stderr
15/04/14 20:47:03 INFO mapred.JobClient: Task Id : attempt_201504142046_0001_r_000002_0, Status : FAILED
Error initializing attempt_201504142046_0001_r_000002_0:
java.io.FileNotFoundException: File file:/app/hadoop/tmp/mapred/system/job_201504142046_0001/jobToken does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4445)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1272)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1213)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2568)
at java.lang.Thread.run(Thread.java:745)
15/04/14 20:47:03 WARN mapred.JobClient: Error reading task outputhttp://node3:50060/tasklog?plaintext=true&attemptid=attempt_201504142046_0001_m_000021_1&filter=stdout
15/04/14 20:47:03 WARN mapred.JobClient: Error reading task outputhttp://node3:50060/tasklog?plaintext=true&attemptid=attempt_201504142046_0001_m_000021_1&filter=stderr
15/04/14 20:47:04 INFO mapred.JobClient: Task Id : attempt_201504142046_0001_r_000002_1, Status : FAILED
Error initializing attempt_201504142046_0001_r_000002_1:
java.io.FileNotFoundException: File file:/app/hadoop/tmp/mapred/system/job_201504142046_0001/jobToken does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4445)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1272)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1213)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2568)
at java.lang.Thread.run(Thread.java:745)
.....................
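To make the failure pattern easier to see, here is a minimal Python sketch (not part of the original setup) that pulls the failed attempt IDs and the missing path out of an excerpt of the log above. The helper name `summarize` and the two regular expressions are illustrative assumptions. It only shows that every failure points at the same `jobToken` path and that the path carries the `file:` scheme rather than `ceph:`.

```python
import re
from urllib.parse import urlparse

# Excerpt copied from the job output above (stack frames omitted).
LOG = """\
15/04/14 20:47:03 INFO mapred.JobClient: Task Id : attempt_201504142046_0001_m_000021_0, Status : FAILED
Error initializing attempt_201504142046_0001_m_000021_0:
java.io.FileNotFoundException: File file:/app/hadoop/tmp/mapred/system/job_201504142046_0001/jobToken does not exist.
15/04/14 20:47:03 INFO mapred.JobClient: Task Id : attempt_201504142046_0001_r_000002_0, Status : FAILED
Error initializing attempt_201504142046_0001_r_000002_0:
java.io.FileNotFoundException: File file:/app/hadoop/tmp/mapred/system/job_201504142046_0001/jobToken does not exist.
"""

# Hypothetical patterns written for this log format only.
FAILED = re.compile(r"Task Id : (attempt_\w+), Status : FAILED")
MISSING = re.compile(r"FileNotFoundException: File (\S+) does not exist")


def summarize(log):
    """Return the failed attempt IDs and the set of (scheme, path) pairs reported missing."""
    attempts = FAILED.findall(log)
    missing = set()
    for uri in MISSING.findall(log):
        parsed = urlparse(uri)
        missing.add((parsed.scheme, parsed.path))
    return attempts, missing


attempts, missing = summarize(LOG)
print(attempts)
print(missing)
```

In this excerpt both the map and the reduce attempt fail on the same file, and the scheme is `file`, which suggests the TaskTracker is looking for the job token on the local filesystem.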
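Using CephFS instead of HDFS requires only the MapReduce daemons, so only the JobTracker and the TaskTrackers are running on the nodes (1 JobTracker, 4 TaskTrackers). My Hadoop core-site.xml file is below. (Removing hadoop.tmp.dir, as suggested in another question, does not solve the problem.)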
<configuration>
<property>
<name>fs.defaultFS</name>
<value>ceph://10.242.144.225:6789/</value>
</property>
<property>
<name>ceph.root.dir</name>
<value>/mnt/mycephfs</value>
</property>
<property>
<name>ceph.conf.file</name>
<value>/etc/ceph/ceph.conf</value>
</property>
<property>
<name>ceph.data.pools</name>
<value>data</value>
</property>
<property>
<name>fs.AbstractFileSystem.ceph.impl</name>
<value>org.apache.hadoop.fs.ceph.CephFs</value>
</property>
<property>
<name>fs.ceph.impl</name>
<value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
</property>
</configuration>
My mapred-site.xml is:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>10.242.144.212:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. Provide the ip address of your master node. The port number must be 54311 or 8021.
</description>
</property>
<property>
<name>fs.defaultFS</name>
<value>ceph://10.242.144.225:6789/</value>
</property>
</configuration>
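One thing that may be worth checking is which property name carries the default filesystem. The sketch below is a minimal, hedged example: it parses an abridged copy of the core-site.xml posted above and reports the value found under two candidate keys. That Hadoop 1.x reads `fs.default.name` rather than `fs.defaultFS` is an assumption here, not something the post confirms. The helper names `load_properties` and `default_fs_settings` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the core-site.xml posted above.
CORE_SITE = """\
<configuration>
<property>
<name>fs.defaultFS</name>
<value>ceph://10.242.144.225:6789/</value>
</property>
<property>
<name>ceph.root.dir</name>
<value>/mnt/mycephfs</value>
</property>
<property>
<name>fs.ceph.impl</name>
<value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
</property>
</configuration>
"""

# Assumption: older Hadoop releases read the first key, newer ones the second.
DEFAULT_FS_KEYS = ("fs.default.name", "fs.defaultFS")


def load_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def default_fs_settings(props):
    """Return the value (or None) stored under each candidate default-FS key."""
    return {key: props.get(key) for key in DEFAULT_FS_KEYS}


props = load_properties(CORE_SITE)
print(default_fs_settings(props))
```

If the key the running Hadoop version actually reads comes back as `None`, the daemons would fall back to their built-in default filesystem, which would be consistent with the `file:` paths in the error. This is only a diagnostic aid, not a confirmed fix.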
Please let me know where I am going wrong. Any help with this is much appreciated.