Insert into Hive table stored as Parquet fails

Asked by p5cysglq on 2021-05-29 in Hadoop
Answers (1) | Views (412)

I receive an error when trying to insert data into a Hive table stored as Parquet. The source table has 103,209 rows. When I add a LIMIT clause to the SELECT statement, the insert works; without it, it fails.

CREATE EXTERNAL TABLE test_parquet (a bigint,b int)
STORED AS PARQUET LOCATION 's3n://abc/test_parquet';

insert into test_parquet select a,b from stage_mri_travel limit 100; -- works
insert into test_parquet select a,b from stage_mri_travel;           -- fails

Error:
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:251)
    ... 11 more
Caused by: java.lang.NoSuchMethodError: parquet.schema.Types$MessageTypeBuilder.addFields([Lparquet/schema/Type;)Lparquet/schema/Types$BaseGroupBuilder;
    at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.getSchemaByName(DataWritableReadSupport.java:158)
    at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:221)
    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:256)
    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:95)
    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:81)
    at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:72)
    at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:66)

Answer 1 (fzsnzjdm)

There is probably an old (i.e., 1.7.0) Parquet jar on the classpath. The `NoSuchMethodError` on `parquet.schema.Types$MessageTypeBuilder.addFields` means the class was loaded from a version that predates that method. You need to make sure the 1.8.0 Parquet jars are the ones present on the classpath.
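To confirm which jar a class is actually loaded from, you can ask the JVM for the class's code source. The sketch below is a generic diagnostic, not part of Hive; in a real session you would pass `"parquet.schema.Types"` (the class named in the stack trace), but here a JDK class is used as a stand-in so the example runs anywhere.

```java
// Sketch: report where a class was loaded from, to spot a stale jar
// shadowing the expected one on the classpath.
public class WhichJar {
    public static String locationOf(String className) throws ClassNotFoundException {
        Class<?> c = Class.forName(className);
        java.security.CodeSource src = c.getProtectionDomain().getCodeSource();
        // JDK bootstrap classes have no code source; application classes
        // report the jar or directory they were loaded from.
        return src == null ? "(bootstrap classpath)" : src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical usage against the failing class:
        //   locationOf("parquet.schema.Types")
        // would print the path of whichever parquet jar won the classpath race.
        System.out.println(locationOf("java.util.ArrayList"));
    }
}
```

If the printed path points at a parquet-1.7.0 (or older) jar, remove or reorder it so the 1.8.0 jar is found first.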
