I am trying to implement a simple group-by in MapReduce.
My input file is as follows:
7369,SMITH,CLERK,800,20
7499,ALLEN,SALESMAN,1600,30
7521,WARD,SALESMAN,1250,30
7566,JONES,MANAGER,2975,20
7654,MARTIN,SALESMAN,1250,30
7698,BLAKE,MANAGER,2850,30
7782,CLARK,MANAGER,2450,10
7788,SCOTT,ANALYST,3000,20
7839,KING,PRESIDENT,5000,10
7844,TURNER,SALESMAN,1500,30
7876,ADAMS,CLERK,1100,20
7900,JAMES,CLERK,950,30
7902,FORD,ANALYST,3000,20
7934,MILLER,CLERK,1300,10
My Mapper class:
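import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class Groupmapper extends Mapper<Object,Text,IntWritable,IntWritable> {
    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException{
        String line = value.toString();
        String[] parts=line.split(",");
        // Field 4 is the salary, field 5 is the department number.
        String token1=parts[3];
        String token2=parts[4];
        int deptno=Integer.parseInt(token2);
        int sal=Integer.parseInt(token1);
        // Emit (deptno, sal) so that salaries are grouped by department.
        context.write(new IntWritable(deptno),new IntWritable(sal));
    }
}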
Reducer class:
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Reducer;

public class Groupreducer extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {
    IntWritable result=new IntWritable();
    public void Reduce(IntWritable key,Iterable<IntWritable> values, Context context) throws IOException, InterruptedException{
        int sum=0;
        for(IntWritable val:values){
            sum+=val.get();
        }
        result.set(sum);
        context.write(key,result);
    }
}
Driver class:
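import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Group {
    public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
        Configuration conf=new Configuration();
        Job job=Job.getInstance(conf,"Group");
        job.setJarByClass(Group.class);
        job.setMapperClass(Groupmapper.class);
        job.setCombinerClass(Groupreducer.class);
        job.setReducerClass(Groupreducer.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(IntWritable.class);
        // args[0] is the input path, args[1] is the output path.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}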
The expected output is:
10 8750
20 10875
30 9400
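As a sanity check on these totals, here is a minimal plain-Java sketch (no Hadoop involved) that sums field 4 per value of field 5 over the sample records, the same way the mapper parses them. The class name GroupSumCheck and the method sumByDept are made up for this illustration and are not part of the job.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the MapReduce job: it only re-checks the expected totals.
class GroupSumCheck {
    // The sample records from the input file above.
    static final List<String> SAMPLE = Arrays.asList(
        "7369,SMITH,CLERK,800,20",
        "7499,ALLEN,SALESMAN,1600,30",
        "7521,WARD,SALESMAN,1250,30",
        "7566,JONES,MANAGER,2975,20",
        "7654,MARTIN,SALESMAN,1250,30",
        "7698,BLAKE,MANAGER,2850,30",
        "7782,CLARK,MANAGER,2450,10",
        "7788,SCOTT,ANALYST,3000,20",
        "7839,KING,PRESIDENT,5000,10",
        "7844,TURNER,SALESMAN,1500,30",
        "7876,ADAMS,CLERK,1100,20",
        "7900,JAMES,CLERK,950,30",
        "7902,FORD,ANALYST,3000,20",
        "7934,MILLER,CLERK,1300,10");

    // Sums the salary (field 4) per department number (field 5), sorted by department.
    static Map<Integer, Integer> sumByDept(List<String> lines) {
        Map<Integer, Integer> totals = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split(",");
            int sal = Integer.parseInt(parts[3]);
            int deptno = Integer.parseInt(parts[4]);
            totals.merge(deptno, sal, Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        for (Map.Entry<Integer, Integer> e : sumByDept(SAMPLE).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Running it prints the three lines listed above, so the expected totals are consistent with the input.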
But the job prints the output below. It does not aggregate the values; it behaves like an identity reducer.
10 1300
10 5000
10 2450
20 1100
20 3000
20 800
20 2975
20 3000
30 1500
30 1600
30 2850
30 1250
30 1250
30 950
The reducer is not working correctly.
1 answer
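It looks like your reduce logic is never being called, so the next debugging step is to look more closely at the reducer.
If you add
@Override
to the Reduce method (as you already do on the map method), the compiler will report a "Method does not override method from its superclass" error.
This means Hadoop never calls your method and falls back to the default identity implementation of reduce, which writes every value out unchanged. The problem is that you have:
public void Reduce(IntWritable key,Iterable<IntWritable> values, Context context)
It should be:
public void reduce(IntWritable key,Iterable<IntWritable> values, Context context)
The only difference is that the method name must start with a lowercase r. Java method names are case-sensitive, so Reduce is a new, unrelated method rather than an override.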
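To see the effect in isolation, here is a minimal sketch that runs without Hadoop. BaseReducer is a hypothetical stand-in for Hadoop's Reducer (its default reduce passes each value through, and its run method plays the role of the framework); all class names here are invented for the illustration. It assumes the same three salaries as department 10 in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OverrideDemo {
    // Uses the reducer whose method is spelled Reduce (capital R).
    static List<String> misnamed() {
        return new MisnamedReducer().run(10, Arrays.asList(2450, 5000, 1300));
    }

    // Uses the reducer that really overrides reduce.
    static List<String> fixed() {
        return new SummingReducer().run(10, Arrays.asList(2450, 5000, 1300));
    }

    public static void main(String[] args) {
        System.out.println(misnamed()); // [10 2450, 10 5000, 10 1300]
        System.out.println(fixed());    // [10 8750]
    }
}

// Hypothetical stand-in for Hadoop's Reducer, not the real class.
class BaseReducer {
    // Default behaviour: identity, one output record per input value.
    protected void reduce(int key, List<Integer> values, List<String> out) {
        for (int v : values) {
            out.add(key + " " + v);
        }
    }

    // Plays the role of the framework: it only ever calls reduce (lowercase).
    final List<String> run(int key, List<Integer> values) {
        List<String> out = new ArrayList<>();
        reduce(key, values, out);
        return out;
    }
}

class MisnamedReducer extends BaseReducer {
    // Capital R: a brand-new method that run() never calls.
    // Adding @Override here would turn the mistake into a compile error.
    public void Reduce(int key, List<Integer> values, List<String> out) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        out.add(key + " " + sum);
    }
}

class SummingReducer extends BaseReducer {
    @Override
    protected void reduce(int key, List<Integer> values, List<String> out) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        out.add(key + " " + sum);
    }
}
```

The misnamed version compiles cleanly and fails silently, which is why putting @Override on every method you intend to override is a cheap safeguard.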