I'm running the Hortonworks Sandbox on Azure and working through the introductory Hadoop tutorial "Lab 3: Pig - Risk Factor Analysis":
http://hortonworks.com/hadoop-tutorial/hello-world-an-introduction-to-hadoop-hcatalog-hive-and-pig/#section_5
After completing all the steps and running this Pig script:
a = LOAD 'geolocation' using org.apache.hive.hcatalog.pig.HCatLoader();
b = filter a by event != 'normal';
c = foreach b generate driverid, event, (int) '1' as occurance;
d = group c by driverid;
e = foreach d generate group as driverid, SUM(c.occurance) as t_occ;
g = LOAD 'drivermileage' using org.apache.hive.hcatalog.pig.HCatLoader();
h = join e by driverid, g by driverid;
final_data = foreach h generate $0 as driverid, $1 as events, $3 as totmiles, (float) $3/$1 as riskfactor;
store final_data into 'riskfactor' using org.apache.hive.hcatalog.pig.HCatStorer();
Clicking "Execute" starts the job, but it fails almost immediately with the following error:
File does not exist: /tmp/.pigjobs/riskfactorpig_14-02-2016-22-29-58/stdout
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1821)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1792)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1705)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:588)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:365)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2133)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2131)
There are no files at all under /tmp/.pigjobs/ in HDFS. So it seems the Pig script needs to create a file there in order to execute, but can't find it. I am running with the "-useHCatalog" argument and "Execute on Tez" checked.
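One way to see what is actually going on under /tmp/.pigjobs is from the sandbox's command line. This is a hypothetical diagnostic sketch (it assumes SSH access to the sandbox and an `hdfs` client on the PATH; the path itself comes straight from the error message):

```shell
# Staging directory named in the error message.
PIG_TMP=/tmp/.pigjobs

if command -v hdfs >/dev/null 2>&1; then
  # Does the staging directory exist at all, and with what mode/owner?
  hdfs dfs -ls -d "$PIG_TMP"
  # Is /tmp itself world-writable (it normally carries mode 1777)?
  hdfs dfs -ls -d /tmp
  # Any leftover per-job subfolders from earlier failed runs?
  hdfs dfs -ls "$PIG_TMP"
else
  echo "hdfs client not found; run these commands on the sandbox"
fi
```

If the first `ls -d` shows a mode without group/other write bits (e.g. `drwxr-xr-x`), the user the Pig job runs as may be unable to create its stdout/stderr files there, which would match the "File does not exist" failure.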
I'm not sure whether this is a permissions error or an Azure issue, but it is very frustrating, having only just started the tutorials, that the "sandbox" isn't set up to get through the first few lessons without making countless configuration adjustments. Any help is greatly appreciated!
1 Answer
Through some additional trial and error, I learned that by right-clicking the /tmp/.pigjobs folder I could set permissions on that folder in HDFS. By default, two of the three "write" options were unchecked. Turning on all of the permission options allowed the files needed to execute the job to be saved and read.
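For anyone who prefers the command line over right-clicking in the file browser, the same fix can be sketched with the HDFS CLI. This is an assumption-laden equivalent, not the tutorial's official steps: it assumes shell access to the sandbox, and mode 777 mirrors "all write options checked" from the GUI fix above:

```shell
# Staging directory from the error message.
PIG_TMP=/tmp/.pigjobs

if command -v hdfs >/dev/null 2>&1; then
  # Create the staging directory if a failed run never made it.
  hdfs dfs -mkdir -p "$PIG_TMP"
  # Open all permission bits recursively, matching the GUI change.
  hdfs dfs -chmod -R 777 "$PIG_TMP"
  # Verify the new mode (expect drwxrwxrwx).
  hdfs dfs -ls -d "$PIG_TMP"
else
  echo "hdfs client not found; run these commands on the sandbox"
fi
```

Note that 777 is the bluntest possible setting; on anything other than a throwaway sandbox you would instead make sure the directory is owned by (or group-writable for) the user the Pig job runs as.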