insert overwrite partition()检查分区是否存在

bpsygsoo  于 2021-06-24  发布在  Hive
关注(0)|答案(2)|浏览(451)

我想在“插入或覆盖”某个分区之前检查它是否已经存在。只需要在分区不存在时插入。
如何修改此查询?

INSERT OVERWRITE TABLE myname.mytable PARTITION (ds='2019-07-19')
ldioqlga

ldioqlga1#

如果使用动态分区的配置单元表插入/覆盖配置单元表,则只有在select语句中提取了该分区时,它才会覆盖。

SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

INSERT OVERWRITE TABLE db.PartitionTable PARTITION (native_country)
select
 age
,workclass
,education
,marital
,occupation
,race
,sex
,native_country
from db.sometable
where native_country='Vietnam';

另一种方法是:


# !/bin/bash

table="db.PartitionTable"
partition_col="native_country=Vietnam"
partition_lookup="native_country='Vietnam'"
partition_exists=$(hive -e "show partitions $table" | grep "$partition_col");

echo $partition_exists

# check if partition_exists

 if [ "$partition_exists" = "" ];
 then echo "partition does not exists";
 else echo "partition exists"
 hive -e "select * from $table where ${partition_lookup}" > output.tmp;
 fi
8ljdwjyq

8ljdwjyq2#

您还可以使用动态分区插入和 where partition column NOT in (select from myname.mytable) . 像这样:

INSERT OVERWRITE TABLE myname.mytable PARTITION (ds)
select col1, col2 ...
       '2019-07-19' ds --partition column
  from ... 
where ds not in (select distinct ds from myname.mytable where ds='2019-07-19')

在分区存在的情况下,select不会为现有分区返回任何行,因此不会覆盖分区。您也可以使用not exist for the same。

相关问题