How do I access each element of a date in Pig Latin?

f2uvfpb9  posted on 2021-06-02  in Hadoop

Query:

records = LOAD 'input' using PigStorage(' ') as (id:int, name:chararray, desination:chararray, date:chararray, salary: long);

Sample input:

(10102,neha,developer,14/02/13,32000)
(10103,deva,admin,02/02/14,40000)
(10102,neha,developer,01/01/14,45000)
(10245,sasi,developer,01/01/14,20000)
(10109,surya,manager,01/02/2014,56000)
(10102,neha,developer,01/02/2014,45000)
(10245,sasi,developer,02/01/2014,25000)

I want to filter the data above by the year of the date (not by the whole date).

mccptt67

Check whether this works for you.

records = LOAD '/home/abhijit/Downloads/movies.txt' using PigStorage(',') as (id:int, name:chararray, desination:chararray, date:chararray, salary:int);

-- parse the date string (dd/MM/yyyy) into a DateTime
todate_data = foreach records generate name,desination,ToDate(date,'dd/MM/yyyy') as (date_time:DateTime );

-- extract the year from the DateTime
getyear_data = foreach todate_data generate name,desination,GetYear(date_time);

-- the year is the third field, so its positional reference is $2
groupByYear = group getyear_data by $2;

dump groupByYear;

The final output is:

(2013,{(neha,developer,2013)})
(2014,{(deva,admin,2014),(neha,developer,2014),(sasi,developer,2014),(surya,manager,2014),(neha,developer,2014),(sasi,developer,2014)})
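
The answer above groups by year, while the question asks about filtering and about reaching each part of the date. A minimal sketch of that, assuming comma-delimited input and four-digit years in dd/MM/yyyy form, might look like the following. The path 'input' comes from the question; the aliases parsed, parts, records_2014 and the field names date_year, date_month, date_day are made up for illustration. The sample rows with two-digit years (such as 14/02/13) would likely need their own format string, for example 'dd/MM/yy', so treat this as a starting point rather than a drop-in solution.

```pig
-- Assumption: comma-delimited file, dates written as dd/MM/yyyy.
records = LOAD 'input' USING PigStorage(',')
          AS (id:int, name:chararray, desination:chararray, date:chararray, salary:long);

-- Convert the chararray into a DateTime so the Get* built-ins can be applied.
parsed = FOREACH records GENERATE
             id, name, desination, salary,
             ToDate(date, 'dd/MM/yyyy') AS date_time;

-- Pull out the individual elements of the date.
parts = FOREACH parsed GENERATE
            id, name, desination, salary,
            GetYear(date_time)  AS date_year,
            GetMonth(date_time) AS date_month,
            GetDay(date_time)   AS date_day;

-- Keep only the rows whose year is 2014.
records_2014 = FILTER parts BY date_year == 2014;

DUMP records_2014;
```

If only the filter is needed, the intermediate relation can be skipped by filtering directly on the expression, e.g. FILTER parsed BY GetYear(date_time) == 2014.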
