Hive: are there specific rules when doing INSERT OVERWRITE into a partition from a different table?

qnyhuwrf · published 2022-11-05 · in Hive

I am trying to INSERT OVERWRITE data into my partitioned table as follows:

INSERT OVERWRITE TABLE schema1.partition_table PARTITION (flag='N')
SELECT a, b, c, d, CAST(NULL AS string) AS id, CAST(NULL AS string) AS f, last_load_date
FROM schema1.data_table;

INSERT OVERWRITE TABLE schema1.partition_table PARTITION (flag)
SELECT a, b, c, d, CAST(NULL AS string) AS srctaid, CAST(NULL AS string) AS ratingreversed, last_load_date, 'N' AS flag
FROM schema1.data_table;
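
The two statements above correspond to Hive's static and dynamic partition insert forms. For reference, a sketch of the general rules, reusing the table and column names from the question and assuming default settings:

-- Static partition: the value is fixed in the PARTITION clause,
-- so the SELECT list must NOT include the partition column.
INSERT OVERWRITE TABLE schema1.partition_table PARTITION (flag='N')
SELECT a, b, c, d, CAST(NULL AS string) AS id, CAST(NULL AS string) AS f, last_load_date
FROM schema1.data_table;

-- Dynamic partition: the partition column is named without a value and must
-- come LAST in the SELECT list; dynamic partitioning has to be enabled first.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE schema1.partition_table PARTITION (flag)
SELECT a, b, c, d, CAST(NULL AS string) AS id, CAST(NULL AS string) AS f, last_load_date, 'N' AS flag
FROM schema1.data_table;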

I get this error regardless of whether schema1.data_table has a column named flag (the partition column of schema1.partition_table).
So my question is: how do we perform an INSERT OVERWRITE into the partition flag, where flag is 'N'?
Below is the error I get:

select a,b,c,d, cast(null as string) as id, cast(null as string) as f,
 'N' as srctaid_flag FROM schema1.data_table
INFO  : No Stats for schema1@data_table, Columns: a,b,c,d
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:_col0, type:string, comment:null), FieldSchema(name:_col1, type:string, comment:null), FieldSchema(name:_col2, type:string, comment:null), FieldSchema(name:_col3, type:string, comment:null), FieldSchema(name:_col4, type:string, comment:null), FieldSchema(name:_col5, type:string, comment:null), FieldSchema(name:_col6, type:string, comment:null), FieldSchema(name:_col7, type:string, comment:null), FieldSchema(name:_col8, type:string, comment:null), FieldSchema(name:_col9, type:string, comment:null), FieldSchema(name:_col10, type:string, comment:null), FieldSchema(name:_col11, type:string, comment:null), FieldSchema(name:_col12, type:string, comment:null), FieldSchema(name:_col13, type:string, comment:null), FieldSchema(name:_col14, type:string, comment:null), FieldSchema(name:_col15, type:double, comment:null), FieldSchema(name:_col16, type:string, comment:null), FieldSchema(name:_col17, type:string, comment:null), FieldSchema(name:_col18, type:bigint, comment:null), FieldSchema(name:_col19, type:string, comment:null), FieldSchema(name:_col20, type:string, comment:null), FieldSchema(name:_col21, type:string, comment:null), FieldSchema(name:_col22, type:string, comment:null), FieldSchema(name:_col23, type:string, comment:null), FieldSchema(name:_col24, type:string, comment:null), FieldSchema(name:_col25, type:string, comment:null), FieldSchema(name:_col26, type:string, comment:null), FieldSchema(name:_col27, type:string, comment:null), FieldSchema(name:_col28, type:string, comment:null), FieldSchema(name:_col29, type:string, comment:null), FieldSchema(name:_col30, type:string, comment:null), FieldSchema(name:_col31, type:string, comment:null), FieldSchema(name:_col32, type:bigint, comment:null), FieldSchema(name:_col33, type:string, comment:null), FieldSchema(name:_col34, type:string, comment:null), FieldSchema(name:_col35, type:string, comment:null), FieldSchema(name:_col36, type:string, 
comment:null), FieldSchema(name:_col37, type:string, comment:null), FieldSchema(name:_col38, type:string, comment:null), FieldSchema(name:_col39, type:string, comment:null), FieldSchema(name:_col40, type:string, comment:null), FieldSchema(name:_col41, type:double, comment:null), FieldSchema(name:_col42, type:string, comment:null), FieldSchema(name:_col43, type:string, comment:null), FieldSchema(name:_col44, type:string, comment:null), FieldSchema(name:_col45, type:string, comment:null), FieldSchema(name:_col46, type:string, comment:null), FieldSchema(name:_col47, type:string, comment:null), FieldSchema(name:_col48, type:string, comment:null), FieldSchema(name:_col49, type:string, comment:null), FieldSchema(name:_col50, type:string, comment:null), FieldSchema(name:_col51, type:string, comment:null), FieldSchema(name:_col52, type:string, comment:null), FieldSchema(name:_col53, type:string, comment:null), FieldSchema(name:_col54, type:string, comment:null), FieldSchema(name:_col55, type:string, comment:null), FieldSchema(name:_col56, type:string, comment:null), FieldSchema(name:_col57, type:string, comment:null), FieldSchema(name:_col58, type:string, comment:null), FieldSchema(name:_col59, type:string, comment:null), FieldSchema(name:_col60, type:string, comment:null), FieldSchema(name:_col61, type:string, comment:null), FieldSchema(name:_col62, type:string, comment:null), FieldSchema(name:_col63, type:string, comment:null), FieldSchema(name:_col64, type:string, comment:null), FieldSchema(name:_col65, type:string, comment:null), FieldSchema(name:_col66, type:string, comment:null), FieldSchema(name:_col67, type:string, comment:null), FieldSchema(name:_col68, type:string, comment:null), FieldSchema(name:_col69, type:string, comment:null), FieldSchema(name:_col70, type:string, comment:null), FieldSchema(name:_col71, type:string, comment:null), FieldSchema(name:_col72, type:string, comment:null), FieldSchema(name:_col73, type:string, comment:null), 
FieldSchema(name:_col74, type:string, comment:null), FieldSchema(name:_col75, type:string, comment:null), FieldSchema(name:_col76, type:timestamp, comment:null), FieldSchema(name:_col77, type:timestamp, comment:null), FieldSchema(name:_col78, type:timestamp, comment:null), FieldSchema(name:_col79, type:string, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=hive_20220725044741_6eb010e5-5658-49c9-8e3e-e6662428b6b1); Time taken: 0.11 seconds
INFO  : Executing command(queryId=hive_20220725044741_6eb010e5-5658-49c9-8e3e-e6662428b6b1): insert overwrite table  schema1.data_table
select a,b,c,d ,cast(null as string) as id, cast(null as string) as f,
'N' as flag FROM schema1.data_table
INFO  : Query ID = hive_20220725044741_6eb010e5-5658-49c9-8e3e-e6662428b6b1
INFO  : Total jobs = 1
INFO  : Launching Job 1 out of 1
INFO  : Starting task [Stage-1:MAPRED] in serial mode
INFO  : Subscribed to counters: [] for queryId: hive_20220725044741_6eb010e5-5658-49c9-8e3e-e6662428b6b1
INFO  : Session is already open
INFO  : Dag name: insert overwrite tabl...ana.data_table (Stage-1)
INFO  : Status: Running (Executing on YARN cluster with App id application_1656421186200_57358)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 .......... container     SUCCEEDED      1          1        0        0       0       0
Reducer 2 ...... container     SUCCEEDED      2          2        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 32.14 s
----------------------------------------------------------------------------------------------
INFO  : Status: DAG finished successfully in 9.04 seconds
INFO  :
INFO  : Query Execution Summary
INFO  : ----------------------------------------------------------------------------------------------
INFO  : OPERATION                            DURATION
INFO  : ----------------------------------------------------------------------------------------------
INFO  : Compile Query                           0.11s
INFO  : Prepare Plan                            0.08s
INFO  : Get Query Coordinator (AM)              0.00s
INFO  : Submit Plan                             0.03s
INFO  : Start DAG                               0.03s
INFO  : Run DAG                                 9.04s
INFO  : ----------------------------------------------------------------------------------------------
INFO  :
INFO  : Task Execution Summary
INFO  : ----------------------------------------------------------------------------------------------
INFO  :   VERTICES      DURATION(ms)   CPU_TIME(ms)    GC_TIME(ms)   INPUT_RECORDS   OUTPUT_RECORDS
INFO  : ----------------------------------------------------------------------------------------------
INFO  :      Map 1           3534.00          6,660             86               1                1
INFO  :  Reducer 2           3496.00          4,060             21               1                0
INFO  : ----------------------------------------------------------------------------------------------
INFO  :
INFO  : org.apache.tez.common.counters.DAGCounter:
INFO  :    NUM_SUCCEEDED_TASKS: 3
INFO  :    TOTAL_LAUNCHED_TASKS: 3
INFO  :    RACK_LOCAL_TASKS: 1
INFO  :    AM_CPU_MILLISECONDS: 490
INFO  :    AM_GC_TIME_MILLIS: 0
INFO  : File System Counters:
INFO  :    FILE_BYTES_READ: 784
INFO  :    FILE_BYTES_WRITTEN: 672
INFO  :    HDFS_BYTES_READ: 9865
INFO  :    HDFS_BYTES_WRITTEN: 9117
INFO  :    HDFS_READ_OPS: 9
INFO  :    HDFS_WRITE_OPS: 7
INFO  :    HDFS_OP_CREATE: 3
INFO  :    HDFS_OP_GET_FILE_STATUS: 7
INFO  :    HDFS_OP_MKDIRS: 2
INFO  :    HDFS_OP_OPEN: 2
INFO  :    HDFS_OP_RENAME: 2
INFO  : org.apache.tez.common.counters.TaskCounter:
INFO  :    REDUCE_INPUT_GROUPS: 1
INFO  :    REDUCE_INPUT_RECORDS: 1
INFO  :    COMBINE_INPUT_RECORDS: 0
INFO  :    SPILLED_RECORDS: 2
INFO  :    NUM_SHUFFLED_INPUTS: 1
INFO  :    NUM_SKIPPED_INPUTS: 1
INFO  :    NUM_FAILED_SHUFFLE_INPUTS: 0
INFO  :    MERGED_MAP_OUTPUTS: 1
INFO  :    GC_TIME_MILLIS: 107
INFO  :    TASK_DURATION_MILLIS: 6011
INFO  :    CPU_MILLISECONDS: 10720
INFO  :    PHYSICAL_MEMORY_BYTES: 10645143552
INFO  :    VIRTUAL_MEMORY_BYTES: 59829911552
INFO  :    COMMITTED_HEAP_BYTES: 10645143552
INFO  :    INPUT_RECORDS_PROCESSED: 1
INFO  :    INPUT_SPLIT_LENGTH_BYTES: 4884
INFO  :    OUTPUT_RECORDS: 1
INFO  :    OUTPUT_LARGE_RECORDS: 0
INFO  :    OUTPUT_BYTES: 2175
INFO  :    OUTPUT_BYTES_WITH_OVERHEAD: 2185
INFO  :    OUTPUT_BYTES_PHYSICAL: 616
INFO  :    ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO  :    ADDITIONAL_SPILLS_BYTES_READ: 616
INFO  :    ADDITIONAL_SPILL_COUNT: 0
INFO  :    SHUFFLE_CHUNK_COUNT: 1
INFO  :    SHUFFLE_BYTES: 616
INFO  :    SHUFFLE_BYTES_DECOMPRESSED: 2185
INFO  :    SHUFFLE_BYTES_TO_MEM: 0
INFO  :    SHUFFLE_BYTES_TO_DISK: 0
INFO  :    SHUFFLE_BYTES_DISK_DIRECT: 616
INFO  :    NUM_MEM_TO_DISK_MERGES: 0
INFO  :    NUM_DISK_TO_DISK_MERGES: 0
INFO  :    SHUFFLE_PHASE_TIME: 81
INFO  :    MERGE_PHASE_TIME: 97
INFO  :    FIRST_EVENT_RECEIVED: 67
INFO  :    LAST_EVENT_RECEIVED: 67
INFO  : HIVE:
INFO  :    CREATED_DYNAMIC_PARTITIONS: 1
INFO  :    CREATED_FILES: 2
INFO  :    DESERIALIZE_ERRORS: 0
INFO  :    RECORDS_IN_Map_1: 1
INFO  :    RECORDS_OUT_0: 1
INFO  :    RECORDS_OUT_1_schema1.data_table: 1
INFO  :    RECORDS_OUT_INTERMEDIATE_Map_1: 1
INFO  :    RECORDS_OUT_INTERMEDIATE_Reducer_2: 0
INFO  :    RECORDS_OUT_OPERATOR_FS_10: 1
INFO  :    RECORDS_OUT_OPERATOR_FS_3: 1
INFO  :    RECORDS_OUT_OPERATOR_GBY_6: 1
INFO  :    RECORDS_OUT_OPERATOR_GBY_8: 1
INFO  :    RECORDS_OUT_OPERATOR_MAP_0: 0
INFO  :    RECORDS_OUT_OPERATOR_RS_7: 1
INFO  :    RECORDS_OUT_OPERATOR_SEL_1: 1
INFO  :    RECORDS_OUT_OPERATOR_SEL_5: 1
INFO  :    RECORDS_OUT_OPERATOR_SEL_9: 1
INFO  :    RECORDS_OUT_OPERATOR_TS_0: 1
INFO  :    TOTAL_TABLE_ROWS_WRITTEN: 1
INFO  : Shuffle Errors:
INFO  :    BAD_ID: 0
INFO  :    CONNECTION: 0
INFO  :    IO_ERROR: 0
INFO  :    WRONG_LENGTH: 0
INFO  :    WRONG_MAP: 0
INFO  :    WRONG_REDUCE: 0
INFO  : Shuffle Errors_Reducer_2_INPUT_Map_1:
INFO  :    BAD_ID: 0
INFO  :    CONNECTION: 0
INFO  :    IO_ERROR: 0
INFO  :    WRONG_LENGTH: 0
INFO  :    WRONG_MAP: 0
INFO  :    WRONG_REDUCE: 0
INFO  : TaskCounter_Map_1_INPUT_data_table:
INFO  :    INPUT_RECORDS_PROCESSED: 1
INFO  :    INPUT_SPLIT_LENGTH_BYTES: 4884
INFO  : TaskCounter_Map_1_OUTPUT_Reducer_2:
INFO  :    ADDITIONAL_SPILLS_BYTES_READ: 0
INFO  :    ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO  :    ADDITIONAL_SPILL_COUNT: 0
INFO  :    OUTPUT_BYTES: 2175
INFO  :    OUTPUT_BYTES_PHYSICAL: 616
INFO  :    OUTPUT_BYTES_WITH_OVERHEAD: 2185
INFO  :    OUTPUT_LARGE_RECORDS: 0
INFO  :    OUTPUT_RECORDS: 1
INFO  :    SHUFFLE_CHUNK_COUNT: 1
INFO  :    SPILLED_RECORDS: 1
INFO  : TaskCounter_Reducer_2_INPUT_Map_1:
INFO  :    ADDITIONAL_SPILLS_BYTES_READ: 616
INFO  :    ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO  :    COMBINE_INPUT_RECORDS: 0
INFO  :    FIRST_EVENT_RECEIVED: 67
INFO  :    LAST_EVENT_RECEIVED: 67
INFO  :    MERGED_MAP_OUTPUTS: 1
INFO  :    MERGE_PHASE_TIME: 97
INFO  :    NUM_DISK_TO_DISK_MERGES: 0
INFO  :    NUM_FAILED_SHUFFLE_INPUTS: 0
INFO  :    NUM_MEM_TO_DISK_MERGES: 0
INFO  :    NUM_SHUFFLED_INPUTS: 1
INFO  :    NUM_SKIPPED_INPUTS: 1
INFO  :    REDUCE_INPUT_GROUPS: 1
INFO  :    REDUCE_INPUT_RECORDS: 1
INFO  :    SHUFFLE_BYTES: 616
INFO  :    SHUFFLE_BYTES_DECOMPRESSED: 2185
INFO  :    SHUFFLE_BYTES_DISK_DIRECT: 616
INFO  :    SHUFFLE_BYTES_TO_DISK: 0
INFO  :    SHUFFLE_BYTES_TO_MEM: 0
INFO  :    SHUFFLE_PHASE_TIME: 81
INFO  :    SPILLED_RECORDS: 1
INFO  : TaskCounter_Reducer_2_OUTPUT_out_Reducer_2:
INFO  :    OUTPUT_RECORDS: 0
INFO  : org.apache.hadoop.hive.ql.exec.tez.HiveInputCounters:
INFO  :    GROUPED_INPUT_SPLITS_Map_1: 1
INFO  :    INPUT_DIRECTORIES_Map_1: 1
INFO  :    INPUT_FILES_Map_1: 1
INFO  :    RAW_INPUT_SPLITS_Map_1: 1
INFO  : Starting task [Stage-2:DEPENDENCY_COLLECTION] in parallel
INFO  : Starting task [Stage-0:MOVE] in serial mode
INFO  : Loading data to table schema1.data_table partition (srctaid_flag=null) from hdfs://a.b.com:8020/warehouse/tablespace/managed/hive/schema1.db/data_table/.hive-staging_hive_2022-07-25_04-47-41_938_6589433878269268017-5887/-ext-10000
INFO  :

INFO  :          Time taken to load dynamic partitions: 0.18 seconds
INFO  :          Time taken for adding to write entity : 0.004 seconds
INFO  : Starting task [Stage-3:STATS] in parallel
INFO  : Executing stats task
INFO  : Partition {srctaid_flag=N} stats: [numFiles=1, numRows=1, totalSize=6210, rawDataSize=0, numFilesErasureCoded=0]
INFO  : Partition {srctaid_flag=N} stats: [numFiles=1, numRows=1, totalSize=6210, rawDataSize=0, numFilesErasureCoded=0]
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.StatsTask. MetaException(message:Insert of object "org.apache.hadoop.hive.metastore.model.MPartitionColumnStatistics@1a083863" using statement "INSERT INTO `PART_COL_STATS` (`CS_ID`,`AVG_COL_LEN`,`BIT_VECTOR`,`CAT_NAME`,`COLUMN_NAME`,`COLUMN_TYPE`,`DB_NAME`,`BIG_DECIMAL_HIGH_VALUE`,`BIG_DECIMAL_LOW_VALUE`,`DOUBLE_HIGH_VALUE`,`DOUBLE_LOW_VALUE`,`ENGINE`,`LAST_ANALYZED`,`LONG_HIGH_VALUE`,`LONG_LOW_VALUE`,`MAX_COL_LEN`,`NUM_DISTINCTS`,`NUM_FALSES`,`NUM_NULLS`,`NUM_TRUES`,`PART_ID`,`PARTITION_NAME`,`TABLE_NAME`) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)" failed : Unknown column 'BIT_VECTOR' in 'field list')
INFO  : Completed executing command(queryId=hive_20220725044741_6eb010e5-5658-49c9-8e3e-e6662428b6b1); Time taken: 32.237 seconds
INFO  : OK
Error: Error while compiling statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.StatsTask. MetaException(message:Insert of object "org.apache.hadoop.hive.metastore.model.MPartitionColumnStatistics@1a083863" using statement "INSERT INTO `PART_COL_STATS` (`CS_ID`,`AVG_COL_LEN`,`BIT_VECTOR`,`CAT_NAME`,`COLUMN_NAME`,`COLUMN_TYPE`,`DB_NAME`,`BIG_DECIMAL_HIGH_VALUE`,`BIG_DECIMAL_LOW_VALUE`,`DOUBLE_HIGH_VALUE`,`DOUBLE_LOW_VALUE`,`ENGINE`,`LAST_ANALYZED`,`LONG_HIGH_VALUE`,`LONG_LOW_VALUE`,`MAX_COL_LEN`,`NUM_DISTINCTS`,`NUM_FALSES`,`NUM_NULLS`,`NUM_TRUES`,`PART_ID`,`PARTITION_NAME`,`TABLE_NAME`) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)" failed : Unknown column 'BIT_VECTOR' in 'field list') (state=08S01,code=1)
toe95027

The actual error is Unknown column 'BIT_VECTOR' in 'field list'. Please make sure this column exists in the metastore table PART_COL_STATS.
Also, judging by the log, the example in the question is not the same as the actual INSERT SQL that was executed.
The error log shows that Hive tried to insert partition column statistics into PART_COL_STATS for the selected columns, but one of those columns is missing from that table.

INSERT INTO `PART_COL_STATS` (`CS_ID`,`AVG_COL_LEN`,`BIT_VECTOR`,`CAT_NAME`,`COLUMN_NAME`,`COLUMN_TYPE`,`DB_NAME`,`BIG_DECIMAL_HIGH_VALUE`,`BIG_DECIMAL_LOW_VALUE`,`DOUBLE_HIGH_VALUE`,`DOUBLE_LOW_VALUE`,`ENGINE`,`LAST_ANALYZED`,`LONG_HIGH_VALUE`,`LONG_LOW_VALUE`,`MAX_COL_LEN`,`NUM_DISTINCTS`,`NUM_FALSES`,`NUM_NULLS`,`NUM_TRUES`,`PART_ID`,`PARTITION_NAME`,`TABLE_NAME`)
VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)

failed: Unknown column 'BIT_VECTOR' in 'field list'
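
This failure typically means the metastore's backing database schema is older than the Hive binaries in use, so the StatsTask writes a column the schema does not yet have. A sketch of the usual remedies (the MySQL dbType is an assumption about this cluster's setup):

-- Preferred fix: upgrade the metastore schema with Hive's schematool,
-- run on the metastore host (dbType depends on your backing database):
--   schematool -dbType mysql -info            # compare schema version vs. Hive version
--   schematool -dbType mysql -upgradeSchema   # bring the schema up to date
-- Session-level workaround: turn off automatic stats gathering so the
-- INSERT OVERWRITE no longer writes to PART_COL_STATS at all:
SET hive.stats.autogather=false;
SET hive.stats.column.autogather=false;

With stats gathering disabled, the insert itself should succeed; statistics can be collected later with ANALYZE TABLE once the metastore schema is fixed.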
