I am using the Pig script below to compute a running total (Pig local mode):
Register /home/ec2-user/pig*/bin/piggybank-0.12.0.jar ;
define Sum org.apache.pig.piggybank.evaluation.int.Sum();
define Over org.apache.pig.piggybank.evaluation.Over();
define Stitch org.apache.pig.piggybank.evaluation.Stitch();
A = load '/home/ec2-user/staff_data.csv' using PigStorage(',') as (id:int, name:chararray, salary:int, department:chararray);
B = group A by department;
C = foreach B {
C1 = order A by salary;
generate flatten(Stitch(C1, Over(C1.department, 'Sum(C1.salary)')));
};
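However, I get the following error:
Unknown aggregate Sum(C1.salary)
Any ideas?
Edit:
I figured out the answer myself. The fix was to pass the salary column to Over and name the aggregate as 'sum(int)' instead of 'Sum(C1.salary)'; the separate Sum define is no longer needed. Here it is: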
Register /home/ec2-user/pig*/bin/piggybank-0.12.0.jar ;
define Over org.apache.pig.piggybank.evaluation.Over();
define Stitch org.apache.pig.piggybank.evaluation.Stitch();
A = load '/home/ec2-user/staff_data.csv' using PigStorage(',') as (id:int, name:chararray, salary:int, department:chararray);
B = group A by department;
C = foreach B {
C1 = order A by salary;
generate flatten(Stitch(C1, Over(C1.salary, 'sum(int)')));
};
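To check the output of the script above, it can help to have a small reference for what a per-department running total should look like. The following is only a rough Python sketch of the intended result, not of how Over or Stitch work internally. The sample rows and the `running_totals` helper are hypothetical, and it assumes the total accumulates from the first row of each department up to the current row, in salary order.

```python
from itertools import accumulate, groupby
from operator import itemgetter

# Hypothetical sample rows shaped like staff_data.csv:
# (id, name, salary, department)
rows = [
    (1, "ann", 300, "eng"),
    (2, "bob", 100, "eng"),
    (3, "cat", 200, "ops"),
    (4, "dan", 200, "eng"),
]

def running_totals(rows):
    """Append a per-department running salary total to each row."""
    out = []
    # groupby needs its input sorted by the grouping key (department).
    by_dept = sorted(rows, key=itemgetter(3))
    for _, group in groupby(by_dept, key=itemgetter(3)):
        # Mirrors "order A by salary" inside the nested foreach.
        ordered = sorted(group, key=itemgetter(2))
        # Assumed window: from the first row of the group to the current row.
        totals = accumulate(r[2] for r in ordered)
        out.extend(r + (t,) for r, t in zip(ordered, totals))
    return out

result = running_totals(rows)
for row in result:
    print(row)
```

Each output row is the original row with the running total appended as the last field, which is the same shape Stitch produces when it joins C1 with the Over result.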