我的问题是关于性能和cte在运行时得到评估的方式。
我计划通过定义一个基本投影来重用代码,然后用不同的过滤器在这个基本投影之上定义多个cte。
这是否会导致任何性能问题。更具体地说,是否每次都会对基础投影进行评估。
例如:
WITH CTE_PERSON as (
SELECT * FROM PersonTable
),
CTE_PERSON_WITH_AGE as (
SELECT * FROM CTE_PERSON WHERE age > 24
),
CTE_PERSON_WITH_AGE_AND_GENDER as (
SELECT * FROM CTE_PERSON_WITH_AGE WHERE gender = 'm'
),
CTE_PERSON_WITH_NAME as (
SELECT * FROM CTE_PERSON WHERE name = 'abc'
)
是否每次persontable中的所有条目都会加载到内存中,然后应用过滤器(或)
只有筛选后的结果集才会加载到内存中。
1条答案
按热度按时间pinkon5k1#
一次扫描。
注:
-单级
-单身汉
TableScan
-predicate: (((i = 1) and (j = 2)) and (k = 3)) (type: boolean)
```create table t (i int,j int,k int);
explain
with t1 as (select i,j,k from t where i=1)
,t2 as (select i,j,k from t1 where j=2)
,t3 as (select i,j,k from t2 where k=3)
select * from t3
;
Explain
STAGE DEPENDENCIES:
Stage-0 is a root stage
STAGE PLANS:
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
TableScan
alias: t
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Filter Operator
predicate: (((i = 1) and (j = 2)) and (k = 3)) (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
expressions: 1 (type: int), 2 (type: int), 3 (type: int)
outputColumnNames: _col0, _col1, _col2
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
ListSink