Pushing data from Pig to Elasticsearch

xvw2m8pv  posted on 2021-06-25  in Pig

I wrote a small script to push data to Elasticsearch:

REGISTER /path/to/elasticsearch-hadoop-1.0.0.jar;
DEFINE ESStorage org.elasticsearch.hadoop.pig.ESStorage('es.resource=sample/');
data = load 'somelog.log' using PigStorage('\n'); 
B = foreach data generate $0 as id;
STORE B INTO 'sample' USING ESStorage('es.http.timeout = 5m ; es.index.auto.create = false');

When I run the Pig script from the command line, I get the following error:

pig -Dpig.additional.jars=/path/to/elasticsearch-hadoop-1.0.0.jar script.pig 

Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.elasticsearch.hadoop.pig.ESStorage using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653)
at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1257)

Any suggestions?

jtw3ybtb  #1

I actually figured this out myself: the script needs the ESStorage class, which is not part of elasticsearch-hadoop-1.0.0.jar.
So I cloned the repository (git clone https://github.com/elasticsearch/elasticsearch-hadoop), ran gradle to build the jar, and registered the resulting elasticsearch-hadoop-1.3.0.BUILD-SNAPSHOT.jar in the Pig script instead.
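
One way to confirm this diagnosis before rebuilding is to list the jar's contents and look for the class named in the error. The following is a minimal sketch, assuming `unzip` is installed; the helper names `class_to_entry` and `jar_has_class` are made up for illustration and are not part of Pig or elasticsearch-hadoop.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Pig or elasticsearch-hadoop.

# Convert a fully qualified class name to its entry path inside a jar,
# e.g. a.b.C becomes a/b/C.class
class_to_entry() {
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Succeed (exit status 0) if the jar's listing contains the class entry.
# $1 = path to the jar, $2 = fully qualified class name
jar_has_class() {
    unzip -l "$1" | grep -q -F "$(class_to_entry "$2")"
}
```

With the placeholder path from the question, this could be used as `jar_has_class /path/to/elasticsearch-hadoop-1.0.0.jar org.elasticsearch.hadoop.pig.ESStorage && echo found || echo missing`. If it prints "missing", Pig's ERROR 1070 is expected for that jar.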

92dk7w1h  #2

REGISTER /path/to/elasticsearch-hadoop-1.0.0.jar;

should point to the actual location of the jar, for example:

REGISTER /opt/elasticsearch-hadoop-1.3.0.M1/dist/elasticsearch-hadoop-1.3.0.M1-yarn.jar

You can then run the script with:

pig -f script.pig
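
To catch a wrong REGISTER path before launching Pig, the script can be scanned for REGISTER statements and each path checked on disk. This is only a rough sketch: `check_registered_jars` is a hypothetical helper, and it assumes each REGISTER statement sits on its own line with an unquoted local path and no glob patterns.

```shell
#!/bin/sh
# Hypothetical helper: report whether each jar named in a REGISTER
# statement of a Pig script exists as a regular file.
# Assumes one REGISTER per line, unquoted local paths, no globs.
# $1 = path to the Pig script
check_registered_jars() {
    sed -n 's/^[[:space:]]*[Rr][Ee][Gg][Ii][Ss][Tt][Ee][Rr][[:space:]]\{1,\}\([^;[:space:]]*\).*/\1/p' "$1" |
    while IFS= read -r jar; do
        if [ -f "$jar" ]; then
            echo "ok: $jar"
        else
            echo "missing: $jar"
        fi
    done
}
```

Running `check_registered_jars script.pig` prints one line per registered jar; any line starting with "missing:" points at a path that needs correcting.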
