K-Means Iteration failed processing output/clusters-2

r7s23pms  posted on 2021-06-04  in Hadoop

I have only been learning Hadoop for a few days. When I run the Mahout example code on Hadoop, I get the following error:

Exception in thread "main" java.lang.InterruptedException: K-Means Iteration failed processing output/clusters-2
    at org.apache.mahout.clustering.kmeans.KMeansDriver.runIteration(KMeansDriver.java:363)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClustersMR(KMeansDriver.java:310)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClusters(KMeansDriver.java:237)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:152)
    at mia.chapter09.kmeansesample.main(kmeansesample.java:85)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Code snippet:

Path path = new Path("testdata/clusters/part-00000");
SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf,
    path, Text.class, Cluster.class);

for (int i = 0; i < k; i++) {
  Vector vec = vectors.get(i);
  Cluster cluster = new Cluster(vec, i, new EuclideanDistanceMeasure());
  writer.append(new Text(cluster.getIdentifier()), cluster);
}
writer.close();

KMeansDriver.run(conf, new Path("testdata/points"), new Path("testdata/clusters"),
  new Path("output"), new EuclideanDistanceMeasure(), 0.001, 10,
  true, false);

SequenceFile.Reader reader = new SequenceFile.Reader(fs,
    new Path("output/" + Cluster.CLUSTERED_POINTS_DIR
             + "/part-m-00000"), conf);

IntWritable key = new IntWritable();
WeightedVectorWritable value = new WeightedVectorWritable();
while (reader.next(key, value)) {
  System.out.println(value.toString() + " belongs to cluster "
                     + key.toString());
}
reader.close();

nx7onnlm  #1

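It would help if you specified which Mahout version you are using, along with other details such as whether you are on Hadoop 2.x or 1.x.
If you are using Mahout 0.7 or earlier, I would suggest switching to Mahout 0.9.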
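One way to collect the version details requested above is to read them from the jars on the classpath. The following is a minimal sketch, not part of the original answer. It assumes the Mahout and Hadoop jars carry an Implementation-Version entry in their manifests, which may not hold for every build. The class VersionProbe and the method versionOf are hypothetical names introduced for illustration; the two probed class names are taken from the stack trace in the question.

```java
public class VersionProbe {

    // Looks up a class by name without initializing it and returns the
    // Implementation-Version recorded in its jar manifest. Returns a
    // placeholder string when the class is absent or no version is recorded.
    public static String versionOf(String className) {
        try {
            Class<?> cls = Class.forName(className, false,
                    VersionProbe.class.getClassLoader());
            Package pkg = cls.getPackage();
            String version = (pkg == null) ? null : pkg.getImplementationVersion();
            return (version != null) ? version : "unknown (no version in manifest)";
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Class names below come from the stack trace in the question.
        System.out.println("Mahout: "
                + versionOf("org.apache.mahout.clustering.kmeans.KMeansDriver"));
        System.out.println("Hadoop: "
                + versionOf("org.apache.hadoop.util.RunJar"));
        System.out.println("Java:   " + System.getProperty("java.version"));
    }
}
```

Running this with the same classpath as the failing job (for example, packaged into the job jar and launched the same way) should print whichever versions are actually being loaded. If a line reports "unknown", the jar file name on the classpath is usually the next place to look.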
