Pig: DUMP after JOIN throws ERROR 1066: Unable to open iterator for alias C

Posted by rlcwz9us on 2021-05-29 in Hadoop

Here is my requirement.
Input:

0104919 ,08476,48528,2016,2016-08-29

00104919 ,08476,48528,2016,2016-09-05

00104919 ,08476,48528,2016,2016-09-12

00104919 ,08476,48528,2017,2016-08-29

The output after the join should be:

2,00104919 ,08476,48528,2016,2016-09-05,2016-09-12

 3,00104919 ,08476,48528,2016,2016-09-12,2016-08-29

Here is my code:

TABL = LOAD '/TABL/part-r-00000' using PigStorage('~') AS (a,b,c,d,e,f);
pre_Q1 = FOREACH TABL generate a,b,c,d,e;
DIST = DISTINCT pre_Q1;
ORDR = ORDER DIST BY *;
Q1 = rank ORDR;
Q2 = FOREACH Q1 GENERATE rank_ORDR + 1 AS rank_Q2, a, b, c, d, e;
Q_join = join Q2 by (rank_Q2, a, b, c, d), Q1 by (rank_ORDR, a, b, c, d);
C = limit Q_join 100;
dump C;

I get the error below. Can anyone point out what is causing it?

Failed Jobs:
JobId   Alias   Feature Message Outputs
job_1474127474437_528208        C,Q2,Q_join     HASH_JOIN       Message: Job failed!

Input(s):
Successfully read 5235587 records (1516199217 bytes) from: "/TABL/part-r-00000"

Output(s):

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1474127474437_528166        ->      job_1474127474437_528185,
job_1474127474437_528185        ->      job_1474127474437_528190,
job_1474127474437_528190        ->      job_1474127474437_528204,
job_1474127474437_528204        ->      job_1474127474437_528206,
job_1474127474437_528206        ->      job_1474127474437_528208,
job_1474127474437_528208        ->      null,
null

2017-01-04 04:02:37,407 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2017-01-04 04:02:37,569 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2017-01-04 04:02:37,729 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2017-01-04 04:02:37,887 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2017-01-04 04:02:37,945 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Some jobs have failed! Stop running all dependent jobs
2017-01-04 04:02:37,945 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias C
Details at logfile: /var/log/gphd/pig/pig.log
siv3szwd #1

Try changing the first line as follows:

TABL = LOAD '/TABL/part-r-00000' using PigStorage(',') AS (a,b,c,d,e,f);

Watch out for the trailing spaces at the end of column a; they might affect the join!
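
A minimal sketch of both suggestions, assuming the file really is comma-delimited (as the sample input in the question suggests) and that the trailing space in column a is not meant to be part of the key. The explicit chararray types, the TRIM call, and the alias peek are illustrative additions that do not appear in the original script:

```pig
-- Assumption: fields are separated by ',' rather than '~'.
TABL = LOAD '/TABL/part-r-00000' USING PigStorage(',')
       AS (a:chararray, b:chararray, c:chararray, d:chararray, e:chararray, f:chararray);

-- Optional sanity check (hypothetical alias): look at a few rows to confirm
-- that each line was split into separate columns.
peek = LIMIT TABL 5;
DUMP peek;

-- TRIM is a Pig built-in that strips leading and trailing whitespace,
-- so '00104919 ' and '00104919' compare as equal join keys.
pre_Q1 = FOREACH TABL GENERATE TRIM(a) AS a, b, c, d, e;
```

The remaining statements of the script (DISTINCT, ORDER, RANK, and the join) can then operate on pre_Q1 as before. If ERROR 1066 still appears, the log file named in the error output is the place to look for the underlying exception.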
