Hive compression

rkkpypqq  posted on 2021-05-29 in Hadoop

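I am trying to compress RC and ORC files using LZ4. I have installed hadoop-2.7.1 and hive-1.2.1. With LZ4, I can compress RC files without any problem. However, when I try to load data into an ORC file using LZ4, it does not work. I created the ORC table as follows: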

CREATE TABLE FINANCE_orc(
    PERMNO STRING,
    DATE STRING,
    CUSIP STRING,
    NCUSIP STRING,
    COMNAM STRING,
    TICKET STRING,
    PERMCO STRING,
    SHRCD STRING,
    EXCHCD STRING,
    HEXCD STRING,
    SICCD STRING,
    HSLCCD STRING,
    PRC STRING,
    VOL STRING,
    RET STRING,
    SHROUT STRING,
    DLRET STRING,
    VWRETD STRING,
    EWRETD STRING,
    SPRTRN STRING)
STORED AS ORC tblproperties ("orc.compress"="Lz4");

set mapred.output.compress=true; 
set hive.exec.compress.output=true; 
set mapred.output.compression.type = BLOCK;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec; 
set io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec; 

INSERT OVERWRITE table finance_orc select * from finance;

But when loading the data, I get the following error:

Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"permno":"PERMNO","ndate":"DATE","cusip":"CUSIP","ncusip":"NCUSIP","comnam":"COMNAM","ticket":"TICKER","permco":"PERMCO","shrcd":"SHRCD","exchcd":"EXCHCD","hexcd":"HEXCD","siccd":"SICCD","hslccd":"HSICCD","prc":"PRC","vol":"VOL","ret":"RET","shrout":"SHROUT","dlret":"DLRET","vwretd":"VWRETD","ewretd":"EWRETD","sprtrn":"SPRTRN"}
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"permno":"PERMNO","ndate":"DATE","cusip":"CUSIP","ncusip":"NCUSIP","comnam":"COMNAM","ticket":"TICKER","permco":"PERMCO","shrcd":"SHRCD","exchcd":"EXCHCD","hexcd":"HEXCD","siccd":"SICCD","hslccd":"HSICCD","prc":"PRC","vol":"VOL","ret":"RET","shrout":"SHROUT","dlret":"DLRET","vwretd":"VWRETD","ewretd":"EWRETD","sprtrn":"SPRTRN"}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
    ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hive.ql.io.orc.CompressionKind.Lz4
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:577)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:675)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)
    ... 9 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hive.ql.io.orc.CompressionKind.Lz4
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:249)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:622)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:566)
    ... 16 more
Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hive.ql.io.orc.CompressionKind.Lz4
    at java.lang.Enum.valueOf(Enum.java:236)
    at org.apache.hadoop.hive.ql.io.orc.CompressionKind.valueOf(CompressionKind.java:25)
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat.getOptions(OrcOutputFormat.java:143)
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat.getHiveRecordWriter(OrcOutputFormat.java:203)
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat.getHiveRecordWriter(OrcOutputFormat.java:52)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:261)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:246)
    ... 18 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 4   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

I have used Snappy and Zlib with the same commands, and both work fine. The problem occurs only with LZ4, and I don't know why.

fsi0uk1n  #1

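For ORC compression, the following codecs can be used: NONE, ZLIB, SNAPPY.
The default compression codec is ZLIB.
Codecs other than those listed above are not allowed.
In general, to understand an error, read the error log in full; it often points to the problem. Here, the stack trace shows that the "orc.compress" value is passed verbatim to CompressionKind.valueOf, which finds no enum constant named Lz4. The error log says: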

"org.apache.hadoop.hive.ql.metadata.HiveException:   java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.hive.ql.io.orc.CompressionKind.Lz4"

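As a minimal sketch of the fix implied above, the table from the question could be switched to one of the codecs listed in the answer and the load re-run. This assumes the finance_orc and finance tables from the question already exist; SNAPPY is picked only as an example, and the value is written in upper case because the stack trace suggests it is matched against the enum constant name as-is.

```sql
-- Assumption: finance_orc was created as shown in the question.
-- Replace the unsupported "Lz4" value with a codec named in the answer
-- (NONE, ZLIB, or SNAPPY).
ALTER TABLE finance_orc SET TBLPROPERTIES ("orc.compress"="SNAPPY");

-- Re-run the load; ORC files written from here on should use the new codec.
INSERT OVERWRITE TABLE finance_orc SELECT * FROM finance;
```

The mapred.output.compression.* settings in the question are left out here on purpose: the failing lookup in the stack trace is on the ORC table property, not on those session settings.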