How to ignore the start of each input field in a Hive INSERT query

dwbf0jvd  posted on 2021-05-29  in Hadoop

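My data is tab-separated, in the format state:ca city:california population:1m
I want to create a database, and when I insert the data I need to ignore the "state:", "city:" and "population:" prefixes. I want to insert the state into a state table and the city into a city table.
There will be two tables: one with state and population, the other with city and population.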

CREATE EXTERNAL TABLE IF NOT EXISTS CSP.original 
(
    st STRING COMMENT 'State', 
    ct STRING COMMENT 'City', 
    po STRING COMMENT 'Population'
) 
COMMENT 'Original Table' 
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'

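This doesn't work. It adds the comments, but it does not ignore the prefixes. I also want to create two tables, one for state and one for city. Can anyone help me?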

iswrvxsc1#

You must first create the external table.
Step 1:

-- `date` is a reserved word in recent Hive versions, so it is quoted with backticks.
CREATE EXTERNAL TABLE all_info (state STRING, population INT) PARTITIONED BY (`date` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
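
An external partitioned table returns no rows until its partitions are registered with the metastore. The following is a minimal sketch of that step, assuming the tab-separated files for the 20170706 partition already sit in an HDFS directory. The LOCATION path is hypothetical and not taken from the original answer.

```sql
-- Register one partition of the external table.
-- The LOCATION path is a made-up example; substitute your own directory.
ALTER TABLE all_info ADD IF NOT EXISTS
PARTITION (`date` = '20170706')
LOCATION '/data/all_info/20170706';

-- Sanity check: list the partitions Hive now knows about.
SHOW PARTITIONS all_info;
```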

Step 2:

-- `date` is quoted with backticks because it is a reserved word in recent Hive versions.
CREATE TABLE IF NOT EXISTS state (state STRING, population INT) PARTITIONED BY (`date` STRING);
CREATE TABLE IF NOT EXISTS city (city STRING, population INT) PARTITIONED BY (`date` STRING);

Step 3:

-- List the columns explicitly: SELECT * would also return the partition column.
INSERT OVERWRITE TABLE state
PARTITION (`date` = '20170706')
SELECT state, population
FROM all_info
WHERE `date` = '20170706' AND
      instr(state, 'state:') = 1;

INSERT OVERWRITE TABLE city
PARTITION (`date` = '20170706')
SELECT state, population
FROM all_info
WHERE `date` = '20170706' AND
      instr(state, 'city:') = 1;
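
The values stored by the inserts above still carry their `state:` or `city:` prefix, which the question wanted dropped. One possible way to strip it is Hive's built-in `regexp_replace`. The sketch below assumes the same `all_info` layout as in step 1, with the population column already numeric. The second statement applies the same idea to the asker's three-column `CSP.original` table, where every column is a STRING. It is an illustration, not part of the original answer.

```sql
-- Strip the leading "state:" label while loading the state table.
INSERT OVERWRITE TABLE state
PARTITION (`date` = '20170706')
SELECT regexp_replace(state, '^state:', '') AS state,
       population
FROM all_info
WHERE `date` = '20170706'
  AND instr(state, 'state:') = 1;

-- Same idea against the table from the question: one label per column.
SELECT regexp_replace(st, '^state:', '')      AS state,
       regexp_replace(ct, '^city:', '')       AS city,
       regexp_replace(po, '^population:', '') AS population
FROM CSP.original;
```

The `^` anchor limits the replacement to a label at the very start of the field, so a value that happens to contain "state:" further in is left untouched.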
