我正在从运行了20多分钟的几个大表的查询开发数据集市。为了使代码更具可读性并避免使用幻数和单词,我使用hivevarMap,而不是常量:
select b.id1 as id1,
-- near 30 fields with nested if, match case etc, like
if( d7.recordid = '01'
or d7.recordid = '02'
or -- ... 10 expressions
'expression1',
'expression2'
) as trans_type,
b.id2 as id2
此查询(没有配置单元Map)在13分钟内执行。
请告诉我如何改进这个查询,并避免里面的幻数,使它更快。
set hivevar:transaction_class_code=map('INTERNAL', '01',
'INTERNATIONAL', '02',
-- 10 elements
);
select b.id1 as id1,
-- near 30 fields with nested if, match case etc, like
if( d7.recordid = ${transaction_class_code}['INTERNAL']
or d7.recordid = ${transaction_class_code}['INTERNATIONAL']
or -- ... 10 expressions
'expression1',
'expression2'
) as trans_type,
b.id2 as id2
from banks b -- ~10_000_000_000 rows, ~100 columns
join trans t -- 10_000_000 rows
on b.id1 = t.id1
and b.id2 = t.id2
left join customer c -- 3_000_000_000 rows
on b.customer_id = c.transaction_customer_id
and b.id1 = c.bank_id
left join transactions_pa tp -- 500_000_000 rows
on b.id1 = tp.bank_id
and b.id2 = tp.transaction_id
left join transaction_dict td -- 100_000_000
on b.id1 = td.bank_id
and b.id2 = td.transaction_id
left join dict1 d1 --5_152
on b.instance = d1.instance
and b.transaction_dict1_id = d1.recid
-- 4 dictionaries with 5 - 10 columns and 500 - 3000 rows
left join dictionary6 d7 -- 2116
on b.instance = d6.instance
and b.transaction_dict6_id = d6.recid;
暂无答案!
目前还没有任何答案,快来回答吧!