"ERROR 6000, Output Location Validation Failed" when using the Pig MongoDB Hadoop connector on EMR

Posted by k75qkfdt on 2021-05-30 in Hadoop

I get an "Output Location Validation Failed" exception in my Pig script on EMR. It fails when saving the data back to S3. I am using this simple script to narrow down the problem:

REGISTER /home/hadoop/lib/mongo-java-driver-2.13.0.jar  
REGISTER /home/hadoop/lib/mongo-hadoop-core-1.3.2.jar
REGISTER /home/hadoop/lib/mongo-hadoop-pig-1.3.2.jar

example = LOAD 's3://xxx/example-full.bson'
         USING com.mongodb.hadoop.pig.BSONLoader();

STORE example INTO 's3n://xxx/out/example.bson' USING com.mongodb.hadoop.pig.BSONStorage();

This is the resulting stack trace:

================================================================================
Pig Stack Trace
---------------
ERROR 6000:
<line 8, column 0> Output Location Validation Failed for: 's3://xxx/out/example.bson More info to follow:
Output directory not set.

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias example
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1637)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:577)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1091)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:543)
    at org.apache.pig.Main.main(Main.java:156)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 6000:
<line 8, column 0> Output Location Validation Failed for: 's3://xxx/out/example.bson More info to follow:
Output directory not set.
    at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:95)
    at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
    at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:317)
    at org.apache.pig.PigServer.compilePp(PigServer.java:1382)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1307)
    at org.apache.pig.PigServer.execute(PigServer.java:1299)
    at org.apache.pig.PigServer.access$400(PigServer.java:124)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1632)
    ... 13 more
Caused by: org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set.
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:138)
    at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:80)
    ... 26 more

To set up the mongo connector, I used the following bootstrap script:


#!/bin/sh

wget -P /home/hadoop/lib http://central.maven.org/maven2/org/mongodb/mongo-java-driver/2.13.0/mongo-java-driver-2.13.0.jar
wget -P /home/hadoop/lib https://github.com/mongodb/mongo-hadoop/releases/download/r1.3.2/mongo-hadoop-core-1.3.2.jar
wget -P /home/hadoop/lib https://github.com/mongodb/mongo-hadoop/releases/download/r1.3.2/mongo-hadoop-pig-1.3.2.jar
wget -P /home/hadoop/lib https://github.com/mongodb/mongo-hadoop/releases/download/r1.3.2/mongo-hadoop-hive-1.3.2.jar

cp /home/hadoop/lib/mongo* /home/hadoop/hive/lib
cp /home/hadoop/lib/mongo* /home/hadoop/pig/lib

Answer 1 (mepcadol):

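The error indicates that the output directory does not exist.
The solution, of course, is to create the output directory.
As a quick check, you can also set the output directory to the same location as the input directory. If the directory does exist, it is probably a permissions problem.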
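
One way to try the check suggested above is sketched below. It is only a sketch: the helper name check_output_dir and the HADOOP_CMD override variable are made up for illustration, and the s3n://xxx/out path in the usage note simply reuses the placeholder bucket from the question. It relies on `hadoop fs -test -e`, which exits with status 0 when the path exists.

```shell
#!/bin/sh
# HADOOP_CMD is a hypothetical override hook (not part of Hadoop);
# it defaults to the regular `hadoop` launcher.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# check_output_dir PATH
# Hypothetical helper: prints whether PATH exists on the target filesystem.
check_output_dir() {
    # `hadoop fs -test -e PATH` returns exit status 0 if PATH exists.
    if $HADOOP_CMD fs -test -e "$1"; then
        echo "exists: $1"
    else
        echo "missing: $1"
    fi
}
```

On the cluster this could be called as `check_output_dir s3n://xxx/out`. If it reports the path as missing, `hadoop fs -mkdir -p s3n://xxx/out` would create it; if it exists, `hadoop fs -ls s3n://xxx/` shows the owner and permission bits to compare against the user running Pig.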
