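My Pig script, which processes temperature data, fails with an error. I am posting the code and the error below to make the problem easier to understand.
The error is reported at line 38, column 15. I tried removing dryTemp, but that only produced a different error.
Code: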
--Load files into relations
month1 = LOAD 'hdfs:/data/big/data/weather/weather/201201hourly.txt' USING PigStorage(',');
month2 = LOAD 'hdfs:/data/big/data/weather/weather/201202hourly.txt' USING PigStorage(',');
month3 = LOAD 'hdfs:/data/big/data/weather/weather/201203hourly.txt' USING PigStorage(',');
month4 = LOAD 'hdfs:/data/big/data/weather/weather/201204hourly.txt' USING PigStorage(',');
month5 = LOAD 'hdfs:/data/big/data/weather/weather/201205hourly.txt' USING PigStorage(',');
month6 = LOAD 'hdfs:/data/big/data/weather/weather/201206hourly.txt' USING PigStorage(',');
--Combine relations
months = UNION month1, month2, month3, month4, month5, month6;
/* Splitting relations
SPLIT months INTO
splitMonth1 IF SUBSTRING(date, 4, 6) == '01',
splitMonth2 IF SUBSTRING(date, 4, 6) == '02',
splitMonth3 IF SUBSTRING(date, 4, 6) == '03',
splitRest IF (SUBSTRING(date, 4, 6) == '04' OR SUBSTRING(date, 4, 6) == '04');
*/
/* Joining relations
stations = LOAD 'hdfs:/data/big/data/QCLCD201211/stations.txt' USING PigStorage() AS (id:int, name:chararray)
JOIN months BY wban, stations by id;
*/
--filter out unwanted data
clearWeather = FILTER months BY skyCondition == 'CLR';
--Transform and shape relation
shapedWeather = FOREACH clearWeather GENERATE date, SUBSTRING(date, 0, 4) as year, SUBSTRING(date, 4, 6) as month, SUBSTRING(date, 6, 8) as day, skyCondition, dryTemp;
--Group relation specifying number of reducers
groupedByMonthDay = GROUP shapedWeather BY (month, day) PARALLEL 10;
--Aggregate relation
aggedResults = FOREACH groupedByMonthDay GENERATE group as MonthDay, AVG(shapedWeather.dryTemp), MIN(shapedWeather.dryTemp), MAX(shapedWeather.dryTemp), COUNT(shapedWeather.dryTemp) PARALLEL 10;
--Sort relation
sortedResults = ORDER aggedResults BY $1 DESC;
--Store results in HDFS
STORE sortedResults INTO 'hdfs:/data/big/data/weather/pigresults' USING PigStorage(':');
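The error follows; it is fairly long. I am still new to Pig and still learning. I believe the error is related to the types of the fields, that is, the field is not recognized, but I do not know how to fix it. I hope someone can help.
Error: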
ERROR 1128: Cannot find field dryTemp in :bytearray,year:chararray,month:chararray,day:chararray,:bytearray,:bytearray
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Cannot find field dryTemp in :bytearray,year:chararray,month:chararray,day:chararray,:bytearray,:bytearray
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1691)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:607)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: Failed to parse: Pig script failed to parse:
<file Documentos/pig/weather.pig, line 38, column 15> pig script failed to validate: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1128: Cannot find field dryTemp in :bytearray,year:chararray,month:chararray,day:chararray,:bytearray,:bytearray
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
... 15 more
Caused by:
<file Documentos/pig/weather.pig, line 38, column 15> pig script failed to validate: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1128: Cannot find field dryTemp in :bytearray,year:chararray,month:chararray,day:chararray,:bytearray,:bytearray
at org.apache.pig.parser.LogicalPlanBuilder.buildForeachOp(LogicalPlanBuilder.java:1017)
at org.apache.pig.parser.LogicalPlanGenerator.foreach_clause(LogicalPlanGenerator.java:15870)
at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1933)
at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
... 16 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1128: Cannot find field dryTemp in :bytearray,year:chararray,month:chararray,day:chararray,:bytearray,:bytearray
at org.apache.pig.newplan.logical.expression.DereferenceExpression.translateAliasToPos(DereferenceExpression.java:215)
at org.apache.pig.newplan.logical.expression.DereferenceExpression.getFieldSchema(DereferenceExpression.java:149)
at org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:264)
at org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:148)
at org.apache.pig.newplan.logical.expression.DereferenceExpression.accept(DereferenceExpression.java:84)
at org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visitAll(SchemaResetter.java:67)
at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:122)
at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:245)
at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:114)
at org.apache.pig.parser.LogicalPlanBuilder.buildForeachOp(LogicalPlanBuilder.java:1015)
... 22 more
Here are a few lines from the file 201211hourly.txt:
WBAN,Date,Time,StationType,SkyCondition,SkyConditionFlag,Visibility,VisibilityFlag,WeatherType,WeatherTypeFlag,DryBulbFarenheit,DryBulbFarenheitFlag,DryBulbCelsius,DryBulbCelsiusFlag,WetBulbFarenheit,WetBulbFarenheitFlag,WetBulbCelsius,WetBulbCelsiusFlag,DewPointFarenheit,DewPointFarenheitFlag,DewPointCelsius,DewPointCelsiusFlag,RelativeHumidity,RelativeHumidityFlag,WindSpeed,WindSpeedFlag,WindDirection,WindDirectionFlag,ValueForWindCharacter,ValueForWindCharacterFlag,StationPressure,StationPressureFlag,PressureTendency,PressureTendencyFlag,PressureChange,PressureChangeFlag,SeaLevelPressure,SeaLevelPressureFlag,RecordType,RecordTypeFlag,HourlyPrecip,HourlyPrecipFlag,Altimeter,AltimeterFlag
03011201201010015,0,CLR,10.00,-23,-5.0,15,-9.5,-9,-23.0,24,5120,-21.70,,M,AA,-30.43,
03011201201010035,0,CLR,10.00,-21,-6.0,14,-10.2,-9,-23.0,26,6130,-21.70,-M,AA,-30.43,
03011201201010055,0,CLR,10.00,-21,-6.0,13,-10.5,-13,-25.0,21,0000,-21.71,,M,AA,-30.44,
03011201201010115,0,CLR,10.00,-21,-6.0,14,-10.1,-8,-22.0,27,0000,21.71,,M,AA,30.44,
03011201201010135,0,CLR,10.00,21,-6.0,13,-10.4,-11,-24.0,23,0000,21.72,,M,AA,30.45,
03011201201010155,0,CLR,10.00,21,-6.0,13,-10.5,-13,-25.0,21,6130,,21.72,,M,AA,30.45,
03011201201010215,0,CLR,10.00,21,-6.0,14,-10.2,-9,-23.0,26,5090,,21.73,,M,AA,30.46,,
03011201201010235,0,CLR,10.00,21,-6.0,14,-10.2,-9,-23.0,26,6120,21.74,,,M,AA,,30.47,
03011201201010255,0,CLR,10.00,21,-6.0,13,-10.4,-11,-24.0,23,7130,21.74,,,M,AA,,30.48,
03011201201010315,0,CLR,10.00,23,-5.0,15,-9.4,-8,-22.0,25,9120,21.74,,,M,AA,,30.47,
03011201010335,0,CLR,10.00,23,-5.0,15,-9.4,-8,-22.0,25,8120,21.74,M,AA,30.47,
03011201201010355,0,CLR,10.00,21,-6.0,14,-10.2,-9,-23.0,26,7120,21.73,M,AA,30.46,
03011201201010415,0,CLR,10.00,23,-5.0,14,-9.7,-13,-25.0,19,7130,21.73,M,AA,30.46,,
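2 Answers
#1 wwodge7n
It looks like you are loading 'month1', 'month2', and so on without specifying a schema, and the schema is where 'dryTemp' has to be declared. Without one, Pig only knows the fields by position ($0, $1, ...) and types them as bytearray, which is why it cannot find a field named dryTemp. You can try something like the following: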
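A minimal sketch of such a LOAD, assuming the column order shown in the sample file's header. Only wban, date, skyCondition and dryTemp are names the question's script actually uses; the other field names and all of the types are assumptions chosen for illustration. As far as I know, PigStorage drops trailing columns that the schema does not list, so only the first eleven fields need to be declared to reach the dry-bulb temperature.

```pig
-- Assumed schema: names other than wban, date, skyCondition and dryTemp are illustrative.
-- wban is kept as chararray so that leading zeros (e.g. 03011) survive.
month1 = LOAD 'hdfs:/data/big/data/weather/weather/201201hourly.txt' USING PigStorage(',')
    AS (wban:chararray, date:chararray, time:chararray, stationType:chararray,
        skyCondition:chararray, skyConditionFlag:chararray,
        visibility:chararray, visibilityFlag:chararray,
        weatherType:chararray, weatherTypeFlag:chararray,
        dryTemp:double);

-- Print the schema to confirm that dryTemp is now a known field.
DESCRIBE month1;
```

Values that cannot be converted to double, such as the header row or a missing-value marker, should come through as null, and the aggregate functions ignore nulls.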
The same applies to the other months.
Thanks
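#2 fnx2tebb
I made a few changes to your script:
1. Load the data with a proper schema (you can change the data type of each field as needed).
2. Combined the 6 LOAD statements into a single LOAD.
3. Removed the commented-out code.
I tested the Pig script below with your input and it works fine. The output is pasted as well.
Pig script: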
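The three changes listed above can be sketched roughly as follows. This is an illustration under assumptions, not the answerer's exact script: the schema field names (apart from wban, date, skyCondition and dryTemp), the types, and the aliases avgTemp, minTemp, maxTemp and readings are all made up for the example. The single LOAD relies on Hadoop-style glob patterns in the path, which Pig passes through to the file system.

```pig
-- One LOAD for all six months; [1-6] is a Hadoop glob matching 201201 .. 201206.
-- Schema names and types are assumptions; only the first 11 columns are declared.
months = LOAD 'hdfs:/data/big/data/weather/weather/20120[1-6]hourly.txt' USING PigStorage(',')
    AS (wban:chararray, date:chararray, time:chararray, stationType:chararray,
        skyCondition:chararray, skyConditionFlag:chararray,
        visibility:chararray, visibilityFlag:chararray,
        weatherType:chararray, weatherTypeFlag:chararray,
        dryTemp:double);

-- Keep clear-sky readings only (this also discards the header row).
clearWeather = FILTER months BY skyCondition == 'CLR';

-- Split the yyyymmdd date into its parts.
shapedWeather = FOREACH clearWeather GENERATE
    date,
    SUBSTRING(date, 0, 4) AS year,
    SUBSTRING(date, 4, 6) AS month,
    SUBSTRING(date, 6, 8) AS day,
    skyCondition,
    dryTemp;

-- Group by calendar day, using 10 reducers.
groupedByMonthDay = GROUP shapedWeather BY (month, day) PARALLEL 10;

-- Aggregate per day; the output aliases are hypothetical names.
aggedResults = FOREACH groupedByMonthDay GENERATE
    group AS MonthDay,
    AVG(shapedWeather.dryTemp) AS avgTemp,
    MIN(shapedWeather.dryTemp) AS minTemp,
    MAX(shapedWeather.dryTemp) AS maxTemp,
    COUNT(shapedWeather.dryTemp) AS readings;

-- Sort by average temperature, highest first, and store.
sortedResults = ORDER aggedResults BY avgTemp DESC;
STORE sortedResults INTO 'hdfs:/data/big/data/weather/pigresults' USING PigStorage(':');
```

Naming the aggregated columns lets ORDER refer to avgTemp instead of the positional $1 used in the question, which keeps the sort key readable if the GENERATE list changes.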
Output: (based on the input sample above)