背景:我正在为一所大学做一个项目,将他们当前的设置(oracle)与hadoop/hive进行比较,并试图为我的报告重新创建一些表。
我正在尝试插入到使用配置单元创建的表中。我正在尝试插入在另一个表上使用select语句捕获的行。如果我只运行select语句,我会得到所有的行。但是,当我将此select放入insert语句时,出现了一个错误:
semanticexception[错误10002]:行102:11列引用“emplid”无效
这是我的create table/insert/select:
CREATE TABLE STG_HCM_PRE_COMP (
`JOB_DAYSEQ_KEY` DECIMAL(38,0)
, `DW_COMP_LN_CNT` Int
, `EMPLID` String
, `EMPL_RCD` Int
, `EFFDT` DATE
, `EFFSEQ` Int
, `COMP_EFFSEQ` Int
, `COMP_RATECD` String
, `COMP_RATECD_DESCR` String
, `COMP_RATECD2` String
, `COMP_RATECD_DESCR2` String
, `COMP_RATE_POINTS` Int
, `COMPRATE` FLOAT
, `UNITS` FLOAT
, `COMP_PCT` FLOAT
, `COMP_FREQUENCY` String
, `COMP_FREQUENCY_DESCR` String
, `CURRENCY_CD` String
, `MANUAL_SW` String
, `CONVERT_COMPRT` FLOAT
, `RATE_CODE_GROUP` String
, `RATE_CODE_GROUP_DESCR` String
, `CHANGE_AMT` FLOAT
, `CHANGE_PCT` FLOAT
, `CHANGE_PTS` Int
, `FTE_INDICATOR` String
, `CMP_SRC_IND` String
, `CMP_SRC_IND_DESCR` String
, `DW_JOB_ID` String
, `DW_JOBSEQ` BIGINT
, `DW_START_DATE` DATE
, `DW_END_DATE` DATE
, `DW_CURRENT_IND` String
, `DW_SOURCE_DB` String
, `DW_CF_YR` Int
, `DW_CREATED_EW_DTTM` DATE
, `DW_LASTUPD_EW_DTTM` DATE
);
INSERT INTO STG_HCM_PRE_COMP
(
job_dayseq_key
, dw_comp_ln_cnt
, emplid
, empl_rcd
, effdt
, effseq
, comp_effseq
, comp_ratecd
, comp_ratecd_descr
, comp_ratecd2
, comp_ratecd_descr2
, comp_rate_points
, comprate
, units
, comp_pct
, comp_frequency
, comp_frequency_descr
, currency_cd
, manual_sw
, convert_comprt
, rate_code_group
, rate_code_group_descr
, change_amt
, change_pct
, change_pts
, fte_indicator
, cmp_src_ind
, cmp_src_ind_descr
, dw_job_id
, dw_jobseq
, dw_start_date
, dw_end_date
, dw_current_ind
, dw_source_db
, dw_cf_yr
, dw_created_ew_dttm
, dw_lastupd_ew_dttm
)
SELECT
cast(CONCAT(date_format(c.effdt,'YYYYMMdd'), lpad(cast(cast(c.effseq as int) as string), 4, '0'), c.EMPLID, lpad(cast(cast(c.EMPL_RCD as int) as string), 4, '0')) as decimal(38,0)) as job_dayseq_key
, count(*) over (partition by c.emplid,c.empl_rcd, c.effdt, c.effseq) as dw_comp_ln_cnt
, c.EMPLID
, c.EMPL_RCD
, c.EFFDT
, c.EFFSEQ
, c.COMP_EFFSEQ
, c.COMP_RATECD
, '-'
, '-'
, '-'
, c.COMP_RATE_POINTS
, c.COMPRATE
, cast(null as int)
, c.COMP_PCT
, c.COMP_FREQUENCY
, '-'
, c.CURRENCY_CD
, c.MANUAL_SW
, c.CONVERT_COMPRT
, c.RATE_CODE_GROUP
, '-'
, c.CHANGE_AMT
, c.CHANGE_PCT
, c.CHANGE_PTS
, c.FTE_INDICATOR
, c.CMP_SRC_IND
, NVL(X1.XLATLONGNAME,'-')
, CONCAT(c.EMPLID, '-', lpad(cast(cast(c.EMPL_RCD as int) as string), 4, '0')) as dw_job_id
, dense_rank() over (partition by c.emplid,c.empl_rcd order by c.effdt desc,c.effseq desc) as dw_jobseq
, c.EFFDT as dw_start_date
, LEAD(c.EFFDT,1,TO_DATE('2099-6-30')) OVER (PARTITION BY c.EMPLID, c.EMPL_RCD ORDER BY c.effdt, c.effseq) as dw_end_date
, '-'
, 'DW.DB_CMS_HR'
, CASE D_DAT.FISCAL_YEAR
when 2005 then D_DAT.FISCAL_YEAR
when 2006 then D_DAT.FISCAL_YEAR
when 2007 then D_DAT.FISCAL_YEAR
when 2008 then D_DAT.FISCAL_YEAR
when 2009 then D_DAT.FISCAL_YEAR
when 2010 then D_DAT.FISCAL_YEAR
when 2011 then D_DAT.FISCAL_YEAR
when 2012 then D_DAT.FISCAL_YEAR
when 2013 then D_DAT.FISCAL_YEAR
when 2014 then D_DAT.FISCAL_YEAR
else 2005 end
, from_unixtime(unix_timestamp())
, from_unixtime(unix_timestamp())
FROM PS_COMPENSATION C
,DW.DIM_DATE D_DAT
,STG_HCM_MAXEFFDT_XLAT X1
WHERE
c.effdt = d_dat.date_id
AND c.comp_ratecd != 'UNITS'
AND X1.FIELDNAME = 'CMP_SRC_IND'
AND X1.FIELDVALUE = C.CMP_SRC_IND
AND c.emplid not like 'DUP%'
order by c.EMPLID, c.EMPL_RCD, c.EFFDT, c.EFFSEQ;
任何关于为什么只有当select语句与insert语句结合使用时查询才会失败的见解都将不胜感激!
我看到其他几个线程引用了这个问题,其中作者没有在orderbybuy中包含足够的列。我通过sqldeveloper确认,下面的查询没有返回行,这意味着groupby对于select是有效的。
select emplid,empl_rcd,effdt, EFFSEQ, count(*)
from stg_hcm_pre_comp
group by emplid,empl_rcd,effdt,EFFSEQ
having count(*) > 1;
1条答案
按热度按时间toe950271#
insert语句中对列列表规范的支持是从hive1.2.0开始的
在早期版本中,应该按列在目标表中的顺序插入所有列
https://issues.apache.org/jira/browse/hive-9481