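I am trying to write an EMR job that runs Pig and writes its output to DSE, which we will then use to serve the data. Unfortunately, I cannot get Pig to write to DSE, so I have reduced the problem to simply connecting to a DSE node and trying to write to it. Here is what I am doing.
On the Cassandra node: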
cqlsh> CREATE KEYSPACE cql3ks WITH replication =
{'class': 'SimpleStrategy', 'replication_factor': 1 };
cqlsh> USE cql3ks;
cqlsh:cql3ks> CREATE TABLE test (a int PRIMARY KEY, b int);
From my local machine:
export PIG_INITIAL_ADDRESS=<cassandra node IP>
export PIG_RPC_PORT=9160
export PIG_PARTITIONER=org.apache.cassandra.dht.Murmur3Partitioner
pig -x local
grunt> REGISTER /var/lib/cassandra/resources/cassandra/lib/libthrift-0.7.0.jar;
grunt> REGISTER /var/lib/cassandra/resources/cassandra/lib/cassandra-thrift-1.2.13.2.jar;
grunt> REGISTER /var/lib/cassandra/resources/cassandra/lib/cassandra-all-1.2.13.2.jar;
grunt> DEFINE CqlStorage org.apache.cassandra.hadoop.pig.CqlStorage();
grunt> moretestvalues= LOAD 'cql://cql3ks/test/' USING CqlStorage;
grunt> insertformat= FOREACH moretestvalues GENERATE TOTUPLE(TOTUPLE('a',a)),TOTUPLE(b);
grunt> STORE insertformat INTO 'cql://cql3ks/test?output_query=UPDATE+cql3ks.test+set+b+%3D+%3F' USING CqlStorage();
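For reference, the output_query value in the STORE statement is just the URL-encoded form of the CQL statement "UPDATE cql3ks.test set b = ?". The following is a minimal sketch of how that string can be produced; the class name OutputQueryEncoding and the method encode are made up for illustration and are not part of Pig or Cassandra.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper (not part of Pig or Cassandra): URL-encodes a CQL
// statement so it can be passed as the output_query parameter of a cql:// URL.
class OutputQueryEncoding {

    static String encode(String cql) {
        try {
            // URLEncoder turns spaces into '+', '=' into %3D and '?' into %3F
            return URLEncoder.encode(cql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should never happen
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = encode("UPDATE cql3ks.test set b = ?");
        System.out.println("cql://cql3ks/test?output_query=" + encoded);
    }
}
```

Running main prints the same URL that is used in the STORE statement above.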
Running the Pig commands above produces the following error:
2014-02-25 18:50:27,952 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2014-02-25 18:50:28,506 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
2014-02-25 18:50:28,506 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to.
2014-02-25 18:50:28,506 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.apache.cassandra.hadoop.AbstractColumnFamilyOutputFormat.checkOutputSpecs(AbstractColumnFamilyOutputFormat.java:75)
at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:80)
at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:288)
at org.apache.pig.PigServer.compilePp(PigServer.java:1322)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1247)
at org.apache.pig.PigServer.execute(PigServer.java:1239)
at org.apache.pig.PigServer.access$400(PigServer.java:121)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1553)
at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:991)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
at org.apache.pig.Main.run(Main.java:538)
at org.apache.pig.Main.main(Main.java:157)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
1 Answer
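This is a version mismatch. You are probably running Hadoop 2.x, while the Cassandra libraries were built against the Hadoop 1.x API. org.apache.hadoop.mapreduce.JobContext is a class in Hadoop 1.x and an interface in Hadoop 2.x, which is exactly what the IncompatibleClassChangeError is reporting. If that is not the case, check that you are registering the correct jars.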
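A quick way to see which Hadoop API is actually on the classpath is to load JobContext reflectively and check whether it resolves to a class (the 1.x form) or an interface (the 2.x form). The following is only a sketch; ClasspathTypeCheck and describe are hypothetical names, not part of Hadoop, Pig, or Cassandra.

```java
// Hypothetical helper: reports whether a type on the current classpath
// is a class or an interface, or whether it cannot be found at all.
class ClasspathTypeCheck {

    static String describe(String className) {
        try {
            // initialize=false: load the type without running its static initializers
            Class<?> type = Class.forName(className, false,
                    ClasspathTypeCheck.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        // Defaults to the type named in the stack trace
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.mapreduce.JobContext";
        System.out.println(name + " -> " + describe(name));
    }
}
```

For the result to be meaningful, run it with the same classpath that Pig uses. Assuming the hadoop command is on your PATH, something like `java -cp "$(hadoop classpath):." ClasspathTypeCheck` should work. If it reports "interface", the Hadoop jars are 2.x and a Cassandra jar compiled against 1.x will fail as shown in the question.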
The next Cassandra bug-fix release (2.0.6) is expected to be compatible with both APIs, or at least that is what the corresponding issue says.