hive-更新分区列

tpxzln5u  于 2021-06-28  发布在  Hive
关注(0)|答案(1)|浏览(358)

我有一个按日期和产品类型划分的配置单元表

product_id, sale_id, date, product_type
42342423, 43423, 2017-01-01, S 
67867868, 23233, 2017-01-01, C 
53453466, 63423, 2017-02-01, S

我需要将产品类型的所有值从's'更新为't'(衬衫到上衣)。直接更新是不可能的,因为我们的配置单元版本不支持它。
像这样发布的其他解决方案包括创建一个新表并使用 insert overwritecase 陈述-类似于

INSERT OVERWRITE TABLE data.textile_sales PARTITION(date='2017-01-01')
select product_id, sale_id, case when product_type = 'S' then 'T' end as product_type, date

但是如果要更新的列是一个分区,这将不起作用。
还有别的办法吗?

tyu7yeag

tyu7yeag1#

分区列“data”实际上是与目录相关的元数据。
如果已经有“t”文件夹,则将文件从当前的“日期+产品”文件夹移动到相应的“日期+产品”文件夹。
如果没有“t”文件夹,只需重命名“s”文件夹并更新分区列表即可。
演示

hive> select * from product;
OK
67867868    23233   2017-01-01  C
42342423    43423   2017-01-01  S
53453466    63423   2017-01-02  S
[training@localhost ~]$ hdfs dfs -ls -R /user/hive/warehouse/product
drwxrwxrwx   - training hive          0 2017-01-10 13:35 /user/hive/warehouse/product/date=2017-01-01
drwxrwxrwx   - training hive          0 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-01/product_type=C
-rwxrwxrwx   1 training hive         15 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-01/product_type=C/000000_0
drwxrwxrwx   - training hive          0 2017-01-10 13:35 /user/hive/warehouse/product/date=2017-01-01/product_type=S
-rwxrwxrwx   1 training hive         15 2017-01-10 13:35 /user/hive/warehouse/product/date=2017-01-01/product_type=S/000000_0
drwxrwxrwx   - training hive          0 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-02
drwxrwxrwx   - training hive          0 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-02/product_type=S
-rwxrwxrwx   1 training hive         15 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-02/product_type=S/000000_0
[training@localhost ~]$ hdfs dfs -mkdir /user/hive/warehouse/product/date=2017-01-01/product_type=T
[training@localhost ~]$ hdfs dfs -mkdir /user/hive/warehouse/product/date=2017-01-02/product_type=T
[training@localhost ~]$ hdfs dfs -mv /user/hive/warehouse/product/date=2017-01-01/product_type=S/000000_0 /user/hive/warehouse/product/date=2017-01-01/product_type=T/000000_0
[training@localhost ~]$ hdfs dfs -mv /user/hive/warehouse/product/date=2017-01-02/product_type=S/000000_0 /user/hive/warehouse/product/date=2017-01-02/product_type=T/000000_0
[training@localhost ~]$ hdfs dfs -ls -R /user/hive/warehouse/product
drwxrwxrwx   - training hive          0 2017-01-10 13:41 /user/hive/warehouse/product/date=2017-01-01
drwxrwxrwx   - training hive          0 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-01/product_type=C
-rwxrwxrwx   1 training hive         15 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-01/product_type=C/000000_0
drwxrwxrwx   - training hive          0 2017-01-10 13:42 /user/hive/warehouse/product/date=2017-01-01/product_type=S
drwxrwxrwx   - training hive          0 2017-01-10 13:42 /user/hive/warehouse/product/date=2017-01-01/product_type=T
-rwxrwxrwx   1 training hive         15 2017-01-10 13:35 /user/hive/warehouse/product/date=2017-01-01/product_type=T/000000_0
drwxrwxrwx   - training hive          0 2017-01-10 13:41 /user/hive/warehouse/product/date=2017-01-02
drwxrwxrwx   - training hive          0 2017-01-10 13:42 /user/hive/warehouse/product/date=2017-01-02/product_type=S
drwxrwxrwx   - training hive          0 2017-01-10 13:42 /user/hive/warehouse/product/date=2017-01-02/product_type=T
-rwxrwxrwx   1 training hive         15 2017-01-10 13:36 /user/hive/warehouse/product/date=2017-01-02/product_type=T/000000_0
hive> msck repair table product;
OK
Partitions not in metastore:    product:date=2017-01-01/product_type=T  product:date=2017-01-02/product_type=T
Repair: Added partition to metastore product:date=2017-01-01/product_type=T
Repair: Added partition to metastore product:date=2017-01-02/product_type=T
Time taken: 0.409 seconds, Fetched: 3 row(s)
hive> select * from product;
OK
67867868    23233   2017-01-01  C
42342423    43423   2017-01-01  T
53453466    63423   2017-01-02  T

相关问题