I wrote a small script to push data to Elasticsearch:
REGISTER /path/to/elasticsearch-hadoop-1.0.0.jar;
DEFINE ESStorage org.elasticsearch.hadoop.pig.ESStorage('es.resource=sample/');
data = load 'somelog.log' using PigStorage('\n');
B = foreach data generate $0 as id;
STORE B INTO 'sample' USING ESStorage('es.http.timeout = 5m ; es.index.auto.create = false');
When I run the Pig script from the command line, I get the following error:
pig -Dpig.additional.jars=/path/to/elasticsearch-hadoop-1.0.0.jar script.pig
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.elasticsearch.hadoop.pig.ESStorage using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653)
at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1257)
Any suggestions?
2 Answers
jtw3ybtb #1
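Actually, I figured it out myself: the script needs the ESStorage class, which is not part of elasticsearch-hadoop-1.0.0.jar.
So I registered “elasticsearch-hadoop-1.3.0.build-snapshot.jar” instead, which I built from source: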
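Whether a given jar actually ships the class named in ERROR 1070 can be checked by listing the jar's contents. The following is a minimal sketch, assuming `unzip` is installed; the helper names `class_to_entry` and `jar_has_class` are made up for illustration, and the jar path is the placeholder from the question.

```shell
#!/bin/sh
# Hypothetical helper: convert a dotted class name to its entry path inside a jar.
class_to_entry() {
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Hypothetical helper: returns 0 if the jar lists the class, 2 if the jar file
# does not exist, and non-zero otherwise. Assumes unzip is available.
jar_has_class() {
    [ -f "$1" ] || return 2
    unzip -l "$1" 2>/dev/null | grep -qF "$(class_to_entry "$2")"
}

# Placeholder path and class name taken from the question.
JAR=/path/to/elasticsearch-hadoop-1.0.0.jar
CLASS=org.elasticsearch.hadoop.pig.ESStorage

status=0
jar_has_class "$JAR" "$CLASS" || status=$?
case $status in
    0) echo "$CLASS found in $JAR" ;;
    2) echo "$JAR does not exist" ;;
    *) echo "$CLASS not found in $JAR" ;;
esac
```

If the JDK tools are on the PATH, `jar tf` lists the same entries and could be used in place of `unzip -l`.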
git clone https://github.com/elasticsearch/elasticsearch-hadoop
Then I ran Gradle to build the jar and used elasticsearch-hadoop-1.3.0.build-snapshot.jar in the Pig script.
92dk7w1h #2
The REGISTER statement should point to the correct path of the jar, for example:
REGISTER /opt/elasticsearch-hadoop-1.3.0.M1/dist/elasticsearch-hadoop-1.3.0.M1-yarn.jar
Then you can run it with the following: