sql中的列名重复(impala)

35g0bw71  于 2021-06-26  发布在  Impala
关注(0)|答案(1)|浏览(584)

我的sql返回错误 AnalysisException: Duplicate column name: all_periods_int :

DROP TABLE IF EXISTS temp_nielsen_other_upc_all_markets_test;

CREATE TABLE IF NOT EXISTS temp_nielsen_other_upc_all_markets_test
PARTITIONED BY (all_periods_int)
STORED AS PARQUET
AS
SELECT          
                    COALESCE(gap.all_markets,fct.all_markets) AS all_markets,
                    COALESCE(gap.all_periods, fct.all_periods) AS all_periods,
                    NVL(gap.dollar,0) - NVL(fct.dollar,0) AS dollar,
                    (CAST(SUBSTR(COALESCE(gap.all_periods,fct.all_periods),7,2) AS int)+2000)*10000+CAST(SUBSTR(COALESCE(gap.all_periods,fct.all_periods),1,2) AS int)*100+CAST(SUBSTR(COALESCE(gap.all_periods,fct.all_periods),4,2) AS int) AS all_periods_int
FROM                temp_nielsen_gap_total_all_markets_raw gap
FULL OUTER JOIN(

                    SELECT
                                        all_markets,
                                        all_periods,
                                        SUM(dollar) AS dollar
                    FROM                temp_nielsen_sku_facts_all_markets_raw
                    GROUP BY            all_markets,
                                        all_periods
                ) fct
ON(                 gap.all_markets = fct.all_markets AND
                    gap.all_periods = fct.all_periods
    )
WHERE           ABS(NVL(gap.dollar,0) - NVL(fct.dollar,0)) > 1 AND
                fct.all_markets in (SELECT DISTINCT all_markets FROM temp_nielsen_gap_total_all_markets_raw);

all_periods_int 是由这个sql创建的,它不存在于两个基础表中。
另外,另一件奇怪的事情是,以下两种情况都很好地工作:
在不带第二个条件的情况下运行完整语句 WHERE 声明 fct.all_markets in (SELECT DISTINCT all_markets FROM temp_nielsen_gap_total_all_markets_raw) ;
只需运行 SELECT 不带 CREATE TABLE 声明。
我哪儿也没看见 all_periods_int 会被复制。

c86crjj0

c86crjj01#

所以问题是,分区变量只能在partition by子句中引用,而不能在select语句中引用。但是,在select中生成变量时,需要嵌套此步骤:

DROP TABLE IF EXISTS temp_nielsen_other_upc_all_markets_test;

       CREATE TABLE IF NOT EXISTS temp_nielsen_other_upc_all_markets_test
       PARTITIONED BY (all_periods_int)
       STORED AS PARQUET
       AS
        SELECT
       all_markets
       , all_periods
, dollars
FROM
(
SELECT          
                    COALESCE(gap.all_markets,fct.all_markets) AS all_markets,
                    COALESCE(gap.all_periods, fct.all_periods) AS all_periods,
                    NVL(gap.dollar,0) - NVL(fct.dollar,0) AS dollar,
                    (CAST(SUBSTR(COALESCE(gap.all_periods,fct.all_periods),7,2) AS int)+2000)*10000+CAST(SUBSTR(COALESCE(gap.all_periods,fct.all_periods),1,2) AS int)*100+CAST(SUBSTR(COALESCE(gap.all_periods,fct.all_periods),4,2) AS int) AS all_periods_int
FROM                temp_nielsen_gap_total_all_markets_raw gap
FULL OUTER JOIN(

                    SELECT
                                        all_markets,
                                        all_periods,
                                        SUM(dollar) AS dollar
                    FROM                temp_nielsen_sku_facts_all_markets_raw
                    GROUP BY            all_markets,
                                        all_periods
                ) fct
ON(                 gap.all_markets = fct.all_markets AND
                    gap.all_periods = fct.all_periods
    )
WHERE           ABS(NVL(gap.dollar,0) - NVL(fct.dollar,0)) > 1 AND
                fct.all_markets in (SELECT DISTINCT all_markets FROM temp_nielsen_gap_total_all_markets_raw)) t_final;

相关问题