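I am trying to run Gridmix on Hadoop 2.6.1.
I can run MapReduce jobs on YARN, and I can run Rumen to extract the trace for the simulation, but I cannot complete the last step and run Gridmix.
When I check the logs, I find this error: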
{"org.apache.hadoop.mapreduce.jobhistory.TaskAttemptUnsuccessfulCompletion"
{"taskid":"task_1449829849459_0042_m_000000","taskType":"MAP","attemptId":"attempt_1449829849459_0042_m_000000_0","finishTime":1449841158377,"hostname":"simo","port":56154,"rackname":"/default-rack","status":"FAILED",
"error":"Error:
java.lang.ClassNotFoundException:
org.apache.hadoop.tools.rumen.ResourceUsageMetrics
\n\tat java.net.URLClassLoader$1.run(URLClassLoader.java:366)
\n\tat java.net.URLClassLoader$1.run(URLClassLoader.java:355)
\n\tat java.security.AccessController.doPrivileged(Native Method)
\n\tat java.net.URLClassLoader.findClass(URLClassLoader.java:354)
\n\tat java.lang.ClassLoader.loadClass(ClassLoader.java:425)\n\tat sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
\n\tat java.lang.ClassLoader.loadClass(ClassLoader.java:358)
\n\tat java.lang.Class.getDeclaredConstructors0(Native Method)
\n\tat java.lang.Class.privateGetDeclaredConstructors(Class.java:2595)
\n\tat java.lang.Class.getConstructor0(Class.java:2895)
\n\tat java.lang.Class.getDeclaredConstructor(Class.java:2066)
\n\tat org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:125)
\n\tat org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:66)
\n\tat org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
\n\tat org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:372)
\n\tat org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:751)
\n\tat org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
\n\tat org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
\n\tat java.security.AccessController.doPrivileged(Native Method)
\n\tat javax.security.auth.Subject.doAs(Subject.java:415)
\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
\n\tat org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
\n",
"counters":
{"org.apache.hadoop.mapreduce.jobhistory.JhCounters":
{"name":"COUNTERS",
"groups":
[{"name":"org.apache.hadoop.mapreduce.TaskCounter",
"displayName":"Map-Reduce Framework",
"counts":[{"name":"CPU_MILLISECONDS","displayName":"CPU time spent (ms)","value":0},
{"name":"PHYSICAL_MEMORY_BYTES","displayName":"Physical memory (bytes) snapshot","value":0},{"name":"VIRTUAL_MEMORY_BYTES","displayName":"Virtual memory (bytes) snapshot","value":0}]}]}},"clockSplits":[286,287,287,287,287,287,286,287,287,287,287,287],"cpuUsages":[0,0,0,0,0,0,0,0,0,0,0,0],"vMemKbytes":[0,0,0,0,0,0,0,0,0,0,0,0],"physMemKbytes":[0,0,0,0,0,0,0,0,0,0,0,0]}}}
This error looks strange to me, and I do not really understand it.
I run Gridmix with the following command line: bin/hadoop jar share/hadoop/tools/lib/hadoop-gridmix-2.6.1.jar iopath trace.json
and bin/hadoop classpath
gives me:
$ bin/hadoop classpath
/home/simo/hadoop-2.6.1/conf:
/home/simo/hadoop-2.6.1/share/hadoop/common/lib/*:
/home/simo/hadoop-2.6.1/share/hadoop/common/*:
/home/simo/hadoop-2.6.1/share/hadoop/hdfs:
/home/simo/hadoop-2.6.1/share/hadoop/hdfs/lib/*:
/home/simo/hadoop-2.6.1/share/hadoop/hdfs/*:
/home/simo/hadoop-2.6.1/share/hadoop/yarn/lib/*:
/home/simo/hadoop-2.6.1/share/hadoop/yarn/*:
/home/simo/hadoop-2.6.1/share/hadoop/mapreduce/lib/*:
/home/simo/hadoop-2.6.1/share/hadoop/mapreduce/*:
/home/simo/hadoop-2.6.1/share/hadoop/tools/lib/: <- here is hadoop-rumen-2.6.1.jar
/home/simo/hadoop-2.6.1/share/hadoop/tools/lib/*:
/usr/lib/jvm/java-7-openjdk-amd64/lib/:
/usr/lib/jvm/java-1.7.0-openjdk-amd64/lib/tools.jar:
So HADOOP_CLASSPATH should already include the Rumen jar.
1 Answer
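The exception comes from the task, not from the command that launches the Gridmix run. The output of hadoop classpath is therefore not what matters here: it describes the classpath of the client, not the classpath of the task containers running on the cluster. You need to get the Rumen jar onto the task classpath.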
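If you have administrative rights on the cluster, you can add the Rumen jar (and its dependencies) to the cluster installation and make sure that yarn.application.classpath picks it up.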
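As a rough illustration of the first option: yarn.application.classpath is a comma-separated list of entries, so picking up the Rumen jar means adding an entry that covers it. The sketch below only computes the new property value; it does not edit yarn-site.xml. The helper name append_classpath_entry and the CURRENT value are made up for the example, and the tools/lib path is taken from the classpath listing in the question, assuming the same path exists on every node.

```shell
#!/bin/sh
# Hypothetical helper: append an entry to a comma-separated classpath value
# (the format of yarn.application.classpath) unless it is already listed.
# Limitation: it expects entries separated by plain commas, without spaces.
append_classpath_entry() {
  current="$1"
  entry="$2"
  case ",$current," in
    *",$entry,"*) printf '%s\n' "$current" ;;                 # already present
    *)            printf '%s\n' "${current:+$current,}$entry" ;;
  esac
}

# Example value only; read the real one from your own yarn-site.xml.
CURRENT='$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/*'
# Directory that holds hadoop-rumen-2.6.1.jar in the question (assumed identical on all nodes).
TOOLS_ENTRY='/home/simo/hadoop-2.6.1/share/hadoop/tools/lib/*'

NEW_VALUE=$(append_classpath_entry "$CURRENT" "$TOOLS_ENTRY")
printf '%s\n' "$NEW_VALUE"
```

The printed string is what would go into the value of the yarn.application.classpath property. Running the helper a second time on its own output leaves the value unchanged.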
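Alternatively, you can supply it as a libjar (the -libjars generic option) so that it is shipped to the cluster together with your job. You need to make sure that all required dependencies end up on the task classpath as well.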
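For the libjar route, a minimal sketch of the command line might look like the following. It assumes the jar locations shown in the question and that Gridmix accepts Hadoop's generic options, which have to come before the tool's own arguments (iopath and trace.json here). The script only prints the command so it can be checked before running it; gridmix_cmd is a name invented for this example.

```shell
#!/bin/sh
# Install location taken from the classpath listing in the question (assumption).
HADOOP_HOME="${HADOOP_HOME:-/home/simo/hadoop-2.6.1}"
TOOLS_LIB="$HADOOP_HOME/share/hadoop/tools/lib"
RUMEN_JAR="$TOOLS_LIB/hadoop-rumen-2.6.1.jar"
GRIDMIX_JAR="$TOOLS_LIB/hadoop-gridmix-2.6.1.jar"

# Hypothetical helper: build the Gridmix command with the Rumen jar passed
# via -libjars, placed before the positional <iopath> and <trace> arguments.
gridmix_cmd() {
  printf '%s' "$HADOOP_HOME/bin/hadoop jar $GRIDMIX_JAR -libjars $RUMEN_JAR $1 $2"
}

# Print the command instead of executing it.
CMD=$(gridmix_cmd iopath trace.json)
printf '%s\n' "$CMD"
```

If Rumen needs further jars at task time, -libjars takes a comma-separated list, so they can be appended to the same option.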