Hadoop kNN join algorithm stuck at map 100% reduce 0%

Posted by lo8azlld on 2021-05-30 in Hadoop

15/06/11 10:31:51 INFO mapreduce.Job:  map 100% reduce 0%
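I am trying to run the open-source knn-join-mapreduce H-BRJ algorithm on Hadoop 2.6.0, installed on my laptop (OS X) as a single-node cluster in pseudo-distributed mode (the source is available at http://www.cs.utah.edu/~lifeifei/knnj/). The algorithm consists of two MapReduce phases, where the second phase uses the output files of the first phase as its input. The map and reduce of the first phase succeed; I can also look at the output files, and everything seems correct. However, when I run the second phase, the job reports that it completed successfully even though it never reduces, and I believe it never even enters the reduce phase.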
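To make the role of the second phase concrete, here is a minimal pure-Python sketch of what a merge step of this kind typically does: group the per-block candidate neighbors from phase 1 by record and keep the k closest. This is an illustration under assumptions, not the code from the H-BRJ distribution. The line format (`record_id neighbor_id distance`) and the function names `phase2_map`, `phase2_reduce`, and `run_phase2` are hypothetical.

```python
import heapq
from collections import defaultdict

def phase2_map(line):
    """Parse one phase-1 output line (hypothetical format: 'rid nid dist')."""
    rid, nid, dist = line.split()
    return rid, (float(dist), nid)

def phase2_reduce(rid, candidates, k):
    """Keep the k candidates with the smallest distance for one record."""
    return rid, heapq.nsmallest(k, candidates)

def run_phase2(lines, k):
    """Simulate map, shuffle (group by key), and reduce in a single process."""
    groups = defaultdict(list)  # stands in for the shuffle step
    for line in lines:
        rid, candidate = phase2_map(line)
        groups[rid].append(candidate)
    return dict(phase2_reduce(rid, cands, k) for rid, cands in groups.items())

# Made-up candidate lists, as if two blocks had each reported neighbors for r1.
phase1_output = [
    "r1 s7 2.5",
    "r1 s2 0.5",
    "r2 s9 1.0",
    "r1 s4 1.5",
]
result = run_phase2(phase1_output, k=2)
print(result)
```

If the reduce step never runs, the output of such a job is just the map output written straight to the output directory, so the candidate lists are never merged.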
Below is what is printed when I run phase 2 (I have included everything in the hope that it is useful):

2015-06-11 10:31:47.526 java[3918:305930] Unable to load realm info from SCDynamicStore
15/06/11 10:31:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/06/11 10:31:49 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
15/06/11 10:31:49 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
15/06/11 10:31:49 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
15/06/11 10:31:49 INFO mapred.FileInputFormat: Total input paths to process : 64
15/06/11 10:31:49 INFO mapreduce.JobSubmitter: number of splits:64
15/06/11 10:31:50 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1089761712_0001
15/06/11 10:31:50 INFO mapred.LocalJobRunner: OutputCommitter set in config null
15/06/11 10:31:50 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
15/06/11 10:31:50 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter
15/06/11 10:31:50 INFO mapreduce.Job: Running job: job_local1089761712_0001
15/06/11 10:31:50 INFO mapred.LocalJobRunner: Waiting for map tasks
15/06/11 10:31:50 INFO mapred.LocalJobRunner: Starting task: attempt_local1089761712_0001_m_000000_0
15/06/11 10:31:50 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
15/06/11 10:31:50 INFO mapred.Task:  Using ResourceCalculatorProcessTree : null
15/06/11 10:31:50 INFO mapred.MapTask: Processing split: hdfs://localhost:9000/user/sasha/hbrj/output/part-00042:0+872
15/06/11 10:31:50 INFO mapred.MapTask: numReduceTasks: 0
15/06/11 10:31:50 INFO mapred.LocalJobRunner: 
15/06/11 10:31:50 INFO mapred.Task: Task:attempt_local1089761712_0001_m_000000_0 is done. And is in the process of committing
15/06/11 10:31:50 INFO mapred.LocalJobRunner: 
15/06/11 10:31:50 INFO mapred.Task: Task attempt_local1089761712_0001_m_000000_0 is allowed to commit now
15/06/11 10:31:50 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1089761712_0001_m_000000_0' to hdfs://localhost:9000/user/sasha/hbrj/output2/_temporary/0/task_local1089761712_0001_m_000000
15/06/11 10:31:50 INFO mapred.LocalJobRunner: hdfs://localhost:9000/user/sasha/hbrj/output/part-00042:0+872
15/06/11 10:31:50 INFO mapred.Task: Task 'attempt_local1089761712_0001_m_000000_0' done.

It continues in this way until...

15/06/11 10:31:51 INFO mapred.LocalJobRunner: Finishing task: attempt_local1089761712_0001_m_000012_0
15/06/11 10:31:51 INFO mapred.LocalJobRunner: Starting task: attempt_local1089761712_0001_m_000013_0
15/06/11 10:31:51 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
15/06/11 10:31:51 INFO mapred.Task:  Using ResourceCalculatorProcessTree : null
15/06/11 10:31:51 INFO mapred.MapTask: Processing split: hdfs://localhost:9000/user/sasha/hbrj/output/part-00015:0+646
15/06/11 10:31:51 INFO mapred.MapTask: numReduceTasks: 0
15/06/11 10:31:51 INFO mapreduce.Job: Job job_local1089761712_0001 running in uber mode : false
15/06/11 10:31:51 INFO mapreduce.Job:  map 100% reduce 0%
15/06/11 10:31:51 INFO mapred.LocalJobRunner: 
15/06/11 10:31:51 INFO mapred.Task: Task:attempt_local1089761712_0001_m_000013_0 is done. And is in the process of committing
15/06/11 10:31:51 INFO mapred.LocalJobRunner: 
15/06/11 10:31:51 INFO mapred.Task: Task attempt_local1089761712_0001_m_000013_0 is allowed to commit now
15/06/11 10:31:51 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1089761712_0001_m_000013_0' to hdfs://localhost:9000/user/sasha/hbrj/output2/_temporary/0/task_local1089761712_0001_m_000013
15/06/11 10:31:51 INFO mapred.LocalJobRunner: hdfs://localhost:9000/user/sasha/hbrj/output/part-00015:0+646
15/06/11 10:31:51 INFO mapred.Task: Task 'attempt_local1089761712_0001_m_000013_0' done.
15/06/11 10:31:51 INFO mapred.LocalJobRunner: Finishing task: attempt_local1089761712_0001_m_000013_0
15/06/11 10:31:51 INFO mapred.LocalJobRunner: Starting task: attempt_local1089761712_0001_m_000014_0

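The "Starting task ... Finishing task" sequence repeats in the manner shown below (this is the last task), and then the log says the job completed successfully: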

15/06/11 10:31:53 INFO mapred.MapTask: numReduceTasks: 0
15/06/11 10:31:53 INFO mapred.LocalJobRunner: 
15/06/11 10:31:53 INFO mapred.Task: Task:attempt_local1089761712_0001_m_000063_0 is done. And is in the process of committing
15/06/11 10:31:53 INFO mapred.LocalJobRunner: 
15/06/11 10:31:53 INFO mapred.Task: Task attempt_local1089761712_0001_m_000063_0 is allowed to commit now
15/06/11 10:31:53 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1089761712_0001_m_000063_0' to hdfs://localhost:9000/user/sasha/hbrj/output2/_temporary/0/task_local1089761712_0001_m_000063
15/06/11 10:31:53 INFO mapred.LocalJobRunner: hdfs://localhost:9000/user/sasha/hbrj/output/part-00004:0+178
15/06/11 10:31:53 INFO mapred.Task: Task 'attempt_local1089761712_0001_m_000063_0' done.
15/06/11 10:31:53 INFO mapred.LocalJobRunner: Finishing task: attempt_local1089761712_0001_m_000063_0
15/06/11 10:31:53 INFO mapred.LocalJobRunner: map task executor complete.
15/06/11 10:31:54 INFO mapreduce.Job: Job job_local1089761712_0001 completed successfully
15/06/11 10:31:54 INFO mapreduce.Job: Counters: 20
    File System Counters
        FILE: Number of bytes read=96487226
        FILE: Number of bytes written=106993472
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=1157797
        HDFS: Number of bytes written=884212
        HDFS: Number of read operations=8576
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=4224
    Map-Reduce Framework
        Map input records=793
        Map output records=793
        Input split bytes=6848
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=42
        Total committed heap usage (bytes)=12124160000
    File Input Format Counters 
        Bytes Read=28599
    File Output Format Counters 
        Bytes Written=21848
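Three details in the output above are worth checking mechanically: every map task logs `numReduceTasks: 0`, the job ID starts with `job_local` (the local job runner rather than YARN), and the counter list contains no reduce-side counters such as `Reduce input records`. A reduce task count of 0 normally means the job was configured as map-only, in which case map output is committed directly to the output directory. The following is a minimal Python sketch, assuming the console output has been saved as text; the function name `summarize_run` is hypothetical.

```python
import re

def summarize_run(log_text):
    """Extract a few diagnostic facts from Hadoop console output."""
    reduce_counts = {int(n) for n in re.findall(r"numReduceTasks:\s*(\d+)", log_text)}
    job_ids = set(re.findall(r"\bjob_[A-Za-z0-9]+_\d+", log_text))
    return {
        # Distinct reduce task counts reported by the map tasks.
        "reduce_task_counts": sorted(reduce_counts),
        # IDs of the form job_local... come from the local job runner.
        "local_runner": any(j.startswith("job_local") for j in job_ids),
        # Reduce-side counters appear only if a reduce phase actually ran.
        "has_reduce_counters": "Reduce input records" in log_text,
    }

# A few lines copied from the log in this post.
sample = """15/06/11 10:31:50 INFO mapreduce.Job: Running job: job_local1089761712_0001
15/06/11 10:31:50 INFO mapred.MapTask: numReduceTasks: 0
        Map input records=793
        Map output records=793
"""
summary = summarize_run(sample)
print(summary)
```

If the reported count is 0, the place to look is wherever the phase 2 driver sets the number of reduce tasks (for example from a command-line argument or a configuration property), since a reducer class is never instantiated for a map-only job.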

What I have done so far:

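I found a similar question here: "Hadoop WordCount example stuck at map 100% reduce 0%", and followed some of the advice given there. In particular:
At one point I configured YARN so that I could go to localhost:8088 and monitor the job. All the mappers worked fine with no failures, and the job ended abruptly after the last mapper succeeded, that is, no reducer ever started. The mappers showed 100% and reduce showed 0%.
Running this command: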

cat /path/to/logs/*.log | grep ERROR

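returned nothing.
I can see the output of the mapper phase, and I believe the problem does not lie there.
I tried debugging by putting print statements into the reducer's configure method and into the reduce method itself. When I re-ran the job, none of them were printed.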
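The `grep ERROR` check above only matches lines logged at ERROR level, so warnings and stack traces written without that level would be missed. As a broader variant, here is a minimal Python sketch that flags WARN, ERROR, and FATAL lines as well as anything that looks like a Java exception name. The function name `suspicious_lines` is hypothetical, and the third sample line is made up for illustration.

```python
import re

LEVEL = re.compile(r"\b(WARN|ERROR|FATAL)\b")
EXCEPTION = re.compile(r"\b[\w.]+(Exception|Error)\b")

def suspicious_lines(lines):
    """Return log lines with a warning/error level or an exception class name."""
    return [ln for ln in lines if LEVEL.search(ln) or EXCEPTION.search(ln)]

lines = [
    "15/06/11 10:31:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable",
    "15/06/11 10:31:50 INFO mapred.MapTask: numReduceTasks: 0",
    "java.io.IOException: hypothetical failure",  # made-up line, not from the real log
]
flagged = suspicious_lines(lines)
for ln in flagged:
    print(ln)
```

To run it over real files, read each log file into a list of lines and pass that list to the function. An empty result would support the observation that nothing actually failed and the job simply had no reduce phase to run.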
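Additional note: since the algorithm I am using has been published and should work, I suspect the problem may be that the code is 3 years old and was written for Hadoop 0.20.2, but I suppose I should not be too sure about that.
I know this is a very specific problem, but I hope someone can point me in the right direction. I am happy to include anything else you think would be useful. Thank you very much for your help!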
