我不能在某些领域使用我的自定义项,但我可以在其他领域使用它。如果我用我的第一个领域, ipAddress
,自定义项按预期工作。但是,如果我把它改成 date
我发现了1066的错误。这是我的剧本。
Pig脚本的工作和调用自定义项。
REGISTER myudfs.jar;
DEFINE HOUR myudfs.HOUR;
A = load 'access_log_Jul95' using PigStorage(' ') as (ip:chararray, dash1:chararray, dash2:chararray, date:chararray, date1:chararray, getRequset:chararray, location:chararray, http:chararray, code:int, port:int);
B = FOREACH A GENERATE HOUR(ip);
dump B;
pig脚本不起作用,并调用udf
REGISTER myudfs.jar;
DEFINE HOUR myudfs.HOUR;
A = load 'access_log_Jul95' using PigStorage(' ') as (ip:chararray, dash1:chararray, dash2:chararray, date:chararray, date1:chararray, getRequset:chararray, location:chararray, http:chararray, code:int, port:int);
B = FOREACH A GENERATE HOUR(date);
dump B;
pig脚本,但不调用udf
REGISTER myudfs.jar;
DEFINE HOUR myudfs.HOUR;
A = load 'access_log_Jul95' using PigStorage(' ') as (ip:chararray, dash1:chararray, dash2:chararray, date:chararray, date1:chararray, getRequset:chararray, location:chararray, http:chararray, code:int, port:int);
B = FOREACH A GENERATE date;
dump B;
样本数据
199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245
java自定义项
package myudfs;
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.util.WrappedIOException;
public class HOUR extends EvalFunc<String>
{
@SuppressWarnings("deprecation")
public String exec(Tuple input) throws IOException {
if (input == null || input.size() == 0)
return " ";
try{
String str = (String)input.get(0);
return str.substring(0, 1);
}catch(Exception e){
throw WrappedIOException.wrap("Caught exception processing input row ", e);
}
}
}
错误
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias B
如果还有什么事,请告诉我。我在本地运行这个错误,并在map reduce上运行。
1条答案
按热度按时间trnvg8h31#
可能
date
有时是空的?在您的自定义项中,对元组有空检查,但对input.get(0)
如果发生这种情况,它将击中你的捕获块,你的自定义项将出错。可能导致了这个错误。。。