Below is the output of a Pig script.
{(001,Kumar,Jayasuriya,1123456754,Matara), (001,Kumar,Sangakkara,112722892,Kandy),(001,Rajiv,Reddy,9848022337,Hyderabad)}
{(002,siddarth,Battacharya,9848022338,Kolkata)}
{(003,Rajesh,Khanna,9848022339,Delhi)}
{(004,Preethi,Agarwal,9848022330,Pune)}
{(005,Trupthi,Mohanthy,9848022336,Bhuwaneshwar)}
{(006,Archana,Mishra,9848022335,Chennai)}
{(007,Kumar,Dharmasena,758922419,Colombo)}
{(008,Mahela,Jayawerdana,765557103,Colombo)}
Below is the script I used to generate it.
students = LOAD 'student_data10.txt' USING PigStorage(',') as (id:chararray,fname:chararray,lname:chararray,tp:chararray,city:chararray);
group_students = GROUP students by (id);
group_students2 = FOREACH group_students GENERATE $1;
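I want to remove the id field from the tuples inside each Pig bag, i.e. "001", "002", ... "007" should not appear in the output. Below is a sample of the output I need.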
{(Kumar,Jayasuriya,1123456754,Matara), (Kumar,Sangakkara,112722892,Kandy),(Rajiv,Reddy,9848022337,Hyderabad)}
{(siddarth,Battacharya,9848022338,Kolkata)}
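I know I can achieve this by writing table.columnname for every column I need in the output bag. But I need to get this output without mentioning the column names. How can I do that? Any help would be greatly appreciated.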
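
For reference, the name-based approach mentioned above might look like the sketch below. It reuses the `students` and `group_students` aliases from the script; `named_cols`, `by_position`, and `no_id` are hypothetical alias names introduced only for illustration. The second statement is an assumption that has not been run against this data: it relies on Pig's project-range syntax (`$1 ..`, available in newer Pig releases) inside a nested FOREACH to keep every field from position 1 onward without naming any column.

```pig
-- Same LOAD and GROUP as in the script above.
students = LOAD 'student_data10.txt' USING PigStorage(',')
    AS (id:chararray, fname:chararray, lname:chararray, tp:chararray, city:chararray);
group_students = GROUP students BY id;

-- Name-based version: project each wanted column of the bag explicitly.
named_cols = FOREACH group_students GENERATE students.(fname, lname, tp, city);

-- Possible name-free version (assumption, untested here):
-- the project-range "$1 .." keeps field $1 through the last field,
-- so only the first field (id) is dropped from each tuple in the bag.
by_position = FOREACH group_students {
    no_id = FOREACH students GENERATE $1 ..;
    GENERATE no_id;
};
```

If the positional version works in your Pig release, it should not need changing when more columns are added after id, since it refers only to the position of the field being dropped.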