db2 递归SQL树优化

aurhwmvo  于 2022-12-26  发布在  DB2
关注(0)|答案(1)|浏览(197)

我目前正在处理我的Db2数据库的性能问题。我写了一个SQL,它从一个源表构建一个完整的树。在结果中,你可以看到从父到子,孙子等所有可能的方式。

CREATE TABLE SOURCE_TABLE (
            PARENT_ID VARCHAR(12),
            CHILD_ID VARCHAR(12)
);
 
INSERT INTO SOURCE_TABLE (PARENT_ID, CHILD_ID) VALUES (NULL, 'A');
INSERT INTO SOURCE_TABLE (PARENT_ID, CHILD_ID) VALUES (NULL, 'B');
INSERT INTO SOURCE_TABLE (PARENT_ID, CHILD_ID) VALUES (NULL, 'C');
INSERT INTO SOURCE_TABLE (PARENT_ID, CHILD_ID) VALUES ('A', 'A_1');
INSERT INTO SOURCE_TABLE (PARENT_ID, CHILD_ID) VALUES ('A_1', 'A_1_2');
INSERT INTO SOURCE_TABLE (PARENT_ID, CHILD_ID) VALUES ('B', 'B_1');

| 母体ID|子项ID|
| - ------| - ------|
| (无)|A类|
| (无)|乙|
| (无)|C级|
| A类|甲_1|
| 甲_1|甲_1_2|
| 乙|地下一层|
我的密码是:

WITH PARENTS(ID) AS
(
    SELECT DISTINCT PARENT_ID
    FROM SOURCE_TABLE
    WHERE PARENT_ID IS NOT NULL
),
LVL_TREE(LVL, DIM_PARENT_ID, DIM_CHILD_ID) AS
(
     SELECT 1, PARENT_ID, CHILD_ID
     FROM SOURCE_TABLE
     WHERE PARENT_ID IS NULL
     UNION ALL
     SELECT L.LVL + 1, Q.PARENT_ID, Q.CHILD_ID
     FROM LVL_TREE L, SOURCE_TABLE Q
     WHERE Q.PARENT_ID = L.DIM_CHILD_ID
),         
TREE(DIM_PARENT_ID, DIM_CHILD_ID, DIM_PARENT_LVL, DIM_CHILD_LVL, DIM_LVL, DIM_TYPE) AS
(
    SELECT DIM_PARENT_ID, DIM_CHILD_ID, LVL-1, LVL, 1,
    CASE WHEN DIM_CHILD_ID IN (SELECT ID FROM PARENTS) THEN 'S' ELSE 'B' END AS DIM_TYPE
    FROM LVL_TREE
    UNION ALL
    SELECT T.DIM_PARENT_ID, Q.CHILD_ID, T.DIM_PARENT_LVL, T.DIM_PARENT_LVL + T.DIM_LVL + 1, T.DIM_LVL + 1,
    CASE WHEN Q.CHILD_ID IN (SELECT ID FROM PARENTS) THEN 'S' ELSE 'B' END AS DIM_TYPE
    FROM TREE T, SOURCE_TABLE Q
    WHERE T.DIM_PARENT_ID IS NOT NULL AND Q.PARENT_ID = T.DIM_CHILD_ID
    UNION ALL
    SELECT DIM_CHILD_ID, DIM_CHILD_ID, LVL, LVL, 0, 'B'
    FROM LVL_TREE WHERE DIM_CHILD_ID NOT IN (SELECT ID FROM PARENTS)
)
SELECT
    DIM_PARENT_ID,
    DIM_CHILD_ID,
    DIM_PARENT_LVL,
    DIM_CHILD_LVL,
    DIM_LVL,
    DIM_TYPE
FROM TREE
WHERE DIM_PARENT_ID IS NOT NULL

我的代码工作完美,并显示正确的结果:
| 尺寸父项ID|尺寸_子项_ID|尺寸_父级_级别|尺寸子级|尺寸_标高|尺寸类型|
| - ------| - ------| - ------| - ------| - ------| - ------|
| A类|甲_1| 1个|第二章|1个|S型|
| A类|甲_1_2| 1个|三个|第二章|乙|
| 甲_1|甲_1_2|第二章|三个|1个|乙|
| 甲_1_2|甲_1_2|三个|三个|无|乙|
| 乙|地下一层|1个|第二章|1个|乙|
| 地下一层|地下一层|第二章|第二章|无|乙|
| C级|C级|1个|1个|无|乙|
问题是,我的SOURCE_TABLE有超过150万行,甚至在20个小时后我还是没有得到任何结果。有什么优化建议吗?谢谢!

9bfwbjaz

9bfwbjaz1#

源表真的只有这些吗?

CREATE TABLE SOURCE_TABLE (
    PARENT_ID VARCHAR(12),
    CHILD_ID VARCHAR(12)
);

我的意思是没有索引,没有键,所有的东西都可以是空的,等等?
生成一个小树,如:

insert into source_table (child_id, parent_id) 
with gen(n) as ( values 1 
                 union all 
                 select n+1 from gen where n<10000 
) select n::varchar(12)
       , case when n<>sqrt(n) then sqrt(n)::int::varchar(12) end 
  from gen

然后运行您的查询需要:

real    0m38,704s
user    0m0,232s
sys     0m0,339s

添加两个索引,如:

create index x1 on SOURCE_TABLE(PARENT_ID, CHILD_ID) 
allow reverse scans 
collect sampled detailed statistics

create index x2 on SOURCE_TABLE(CHILD_ID, PARENT_ID) 
allow reverse scans 
collect sampled detailed statistics

改善情况:

real    0m0,983s
user    0m0,226s
sys     0m0,321s

但你应该重新考虑你的设计

CREATE TABLE SOURCE_TABLE (
    CHILD_ID VARCHAR(12) NOT NULL PRIMARY KEY
    PARENT_ID VARCHAR(12)
        REFERENCES SOURCE_TABLE (CHILD_ID)
);

是一个开始。如果你想更进一步,可以考虑把你的table分成两部分,比如:

CREATE TABLE TREE_NODES
( NODE_ID VARCHAR(12) NOT NULL 
, CONSTRAINT PK_NODES PRIMARY KEY (NODE_ID)
);

CREATE TABLE TREE_STRUCTURES
( NODE_ID VARCHAR(12) NOT NULL
, PARENT_NODE_ID VARCHAR(12) NOT NULL
);

CREATE UNIQUE INDEX PK_TREE_STRUCTURE ON TREE_STRUCTURES
    (NODE_ID)
INCLUDE (PARENT_NODE_ID)
ALLOW REVERSE SCANS
COLLECT SAMPLED DETAILED STATISTICS;

CREATE INDEX X01_TREE_STRUCTURE ON TREE_STRUCTURES
    (PARENT_NODE_ID, NODE_ID)
ALLOW REVERSE SCANS
COLLECT SAMPLED DETAILED STATISTICS;

ALTER TABLE TREE_STRUCTURES ADD CONSTRAINT PK_TREE_STRUCTURE 
    PRIMARY KEY (NODE_ID);

ALTER TABLE TREE_STRUCTURES ADD CONSTRAINT FK1_NODES 
      FOREIGN KEY (NODE_ID) REFERENCES TREE_NODES (NODE_ID);

ALTER TABLE TREE_STRUCTURES ADD CONSTRAINT FK2_NODES
      FOREIGN KEY (PARENT_NODE_ID) REFERENCES TREE_STRUCTURES (NODE_ID);

您可能也可以改进查询,但是在不知道结果表示什么的情况下,很难判断结果

相关问题