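My MapReduce jobs run very slowly on Hadoop. Some jobs get stuck in the reduce task, and increasing the task timeout does not help.
The input data is small; the number of splits is 2.
Memory is available ('free -m' shows usage below 4 GB on every node); each node has 32 GB.
YarnChild consumes 100% of a CPU (checked with "top").
The node characteristics are as follows: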
lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
CPU(s): 8
Thread(s) per core: 1
Core(s) per socket: 4
CPU socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 26
Stepping: 5
CPU MHz: 2260.925
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
Below is a snapshot of a slave node while the job is running:
free -m
total used free shared buffers cached
Mem: 32235 1646 30588 0 19 721
-/+ buffers/cache: 905 31329
Swap: 3813 0 3813
top
top - 12:23:57 up 1:25, 1 user, load average: 0.15, 0.11, 0.03
Tasks: 151 total, 1 running, 150 sleeping, 0 stopped, 0 zombie
Cpu(s): 13.3%us, 0.1%sy, 0.0%ni, 86.6%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 33008756k total, 1686948k used, 31321808k free, 20020k buffers
Swap: 3905528k total, 0k used, 3905528k free, 739040k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2032 root 20 0 521m 220m 14m S 107 0.7 54:55.85 java
1564 root 20 0 1457m 136m 14m S 1 0.4 0:43.53 java
1 root 20 0 8356 816 684 S 0 0.0 0:02.63 init
2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
3 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/0
4 root 20 0 0 0 0 S 0 0.0 0:00.02 ksoftirqd/0
5 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/0
6 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/1
7 root 20 0 0 0 0 S 0 0.0 0:00.00 ksoftirqd/1
8 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/1
9 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/2
10 root 20 0 0 0 0 S 0 0.0 0:00.01 ksoftirqd/2
11 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/2
12 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/3
13 root 20 0 0 0 0 S 0 0.0 0:00.00 ksoftirqd/3
14 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/3
15 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/4
jps
2032 YarnChild
1807 MRAppMaster
1564 NodeManager
2312 Jps
1469 DataNode
Job status
Job: job_1395829029033_0003
Job File: master:9000/tmp/hadoop-yarn/staging/root/.staging/job_1395829029033_0003/job.xml
Job Tracking URL : /suno-40.sophia.grid5000.fr:80...29029033_0003/
Uber job : false
Number of maps: 2
Number of reduces: 1
map() completion: 1.0
reduce() completion: 0.6666668
Job state: RUNNING
retired: false
reason for failure:
Counters: 42
File System Counters
FILE: Number of bytes read=249437801
FILE: Number of bytes written=742321915
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=7591656
HDFS: Number of bytes written=0
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=1
Job Counters
Launched map tasks=2
Launched reduce tasks=1
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=640596
Map-Reduce Framework
Map input records=87199
Map output records=6291068
Map output bytes=234057781
Map output materialized bytes=246718039
Input split bytes=216
Combine input records=0
Combine output records=0
Reduce input groups=1
Reduce shuffle bytes=246718039
Reduce input records=54768
Reduce output records=0
Spilled Records=12582136
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=71456
CPU time spent (ms)=904320
Physical memory (bytes) snapshot=719138816
Virtual memory (bytes) snapshot=1675440128
Total committed heap usage (bytes)=628359168
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=7591440
File Output Format Counters
Bytes Written=0
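Some rough arithmetic on the counters above may help frame the problem. The following is a minimal sketch, not a diagnosis: the values are copied from the status output, the variable and function names are hypothetical, and `reduce_phase` assumes the usual convention that the copy, sort and reduce phases each account for one third of the reported reduce() completion.

```python
# Counter values copied from the job status output above.
counters = {
    "map_input_records": 87199,
    "map_output_records": 6291068,
    "hdfs_bytes_read": 7591656,
    "map_output_bytes": 234057781,
    "reduce_input_groups": 1,
    "reduce_input_records": 54768,
}


def reduce_phase(progress):
    """Map a reduce() completion fraction to a phase name.

    Assumption: copy, sort and reduce each contribute one third
    of the reported progress.
    """
    if progress < 1 / 3:
        return "copy"
    if progress < 2 / 3:
        return "sort"
    return "reduce"


# How many records each map input record produces on average.
records_per_input = counters["map_output_records"] / counters["map_input_records"]

# How much the data grows between HDFS input and map output.
byte_expansion = counters["map_output_bytes"] / counters["hdfs_bytes_read"]

# Fraction of the map output the reducer has consumed so far.
reduce_consumed = counters["reduce_input_records"] / counters["map_output_records"]

print(f"phase at 0.6666668: {reduce_phase(0.6666668)}")
print(f"map output records per input record: {records_per_input:.1f}")
print(f"map output bytes per HDFS input byte: {byte_expansion:.1f}")
print(f"share of map output consumed by reduce: {reduce_consumed:.2%}")
```

Under these assumptions the job is past shuffle and merge and is inside the reduce function itself, with a single key group ("Reduce input groups=1") and only a small share of the map output consumed so far.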
Job status a while later:
Job: job_1395829029033_0003
Job File: master:9000/tmp/hadoop-yarn/staging/root/.staging/job_1395829029033_0003/job.xml
Job Tracking URL : ...8088/proxy/application_1395829029033_0003/
Uber job : false
Number of maps: 2
Number of reduces: 1
map() completion: 1.0
reduce() completion: 0.6666668
Job state: RUNNING
retired: false
reason for failure:
Counters: 42
File System Counters
FILE: Number of bytes read=249437801
FILE: Number of bytes written=742321915
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=7591656
HDFS: Number of bytes written=0
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=1
Job Counters
Launched map tasks=2
Launched reduce tasks=1
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=640596
Map-Reduce Framework
Map input records=87199
Map output records=6291068
Map output bytes=234057781
Map output materialized bytes=246718039
Input split bytes=216
Combine input records=0
Combine output records=0
Reduce input groups=1
Reduce shuffle bytes=246718039
Reduce input records=54768
Reduce output records=0
Spilled Records=12582136
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=71456
CPU time spent (ms)=904320
Physical memory (bytes) snapshot=719138816
Virtual memory (bytes) snapshot=1675440128
Total committed heap usage (bytes)=628359168
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=7591440
File Output Format Counters
Bytes Written=0
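The job makes no further progress once reduce reaches 67%. Any help is appreciated; I have not been able to find a way to improve the performance.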
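To check whether anything moves between two status snapshots like the ones above, the counter lines can be compared mechanically. This is a minimal sketch assuming each snapshot was saved as plain text; `parse_counters` and `changed_counters` are hypothetical helper names, not part of Hadoop, and the sample text reuses three counter lines from the post.

```python
def parse_counters(text):
    """Collect 'name=value' lines from a job status dump into a dict."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition("=")
        # Keep only lines of the form '<name>=<integer>'.
        if sep and value.strip().isdigit():
            counters[name.strip()] = int(value)
    return counters


def changed_counters(before, after):
    """Return {name: (old, new)} for every counter whose value differs."""
    names = sorted(set(before) | set(after))
    return {
        name: (before.get(name), after.get(name))
        for name in names
        if before.get(name) != after.get(name)
    }


# Three counter lines taken from the first snapshot in the post.
first = """Reduce input records=54768
Reduce output records=0
CPU time spent (ms)=904320"""

# The later snapshot in the post shows the same values for these lines.
later = first

diff = changed_counters(parse_counters(first), parse_counters(later))
stalled = not diff
print("no counter changed" if stalled else diff)
```

An empty result for two snapshots taken minutes apart is consistent with the stall described here, although the counters may only be refreshed when the task reports progress.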