oracle 将MAX(KEEP(DENSE_RANK LAST ORDER BY,CASE)语句转换为Spark SQL

km0tfn4u  于 2023-10-16  发布在  Oracle
关注(0)|答案(1)|浏览(93)

我很难将这些Oracle代码转换为Spark SQL。

SELECT STUDY_ID, POSITION_ID, STATUS
FROM
(SELECT t2.STUDY_ID, t2.POSITION_ID
, (SELECT MAX(
    t1.CODE || '~' || DATE || '~' || COMMENT)
    KEEP (
    DENSE_RANK LAST
    ORDER BY
    t1.DATE
    ,
    CASE 
    WHEN t1.CODE IN ('READY TO BEGIN') THEN 1
    WHEN t1.CODE IN ('IN PROGRESS') THEN 2
    WHEN t1.CODE IN ('FINISHED') THEN 3
    ELSE 10
    END
    )
FROM TABLE1 t1
WHERE t2.POSITION_ID = t1.POSITION_ID
AND t1.CODE IS NOT NULL
) STATUS
FROM TABLE2 t2,
TABLE3 t3
WHERE t2.STUDY_ID = t3.STUDY_ID
)
WHERE STATUS IS NOT NULL

到目前为止,我已经尝试将CASE WHEN之前的部分替换为PARTITION BY,ORDERED BY t1.DATE DESC,因为DENSE_RANK指定LAST。我还尝试了不同的窗口函数,它们产生的结果与基本查询不同。我对SQL世界相当陌生,我发现这种转换真的很令人困惑。我试图向AI寻求帮助,但它不太明白这个查询试图实现什么。结果应该是:多个POSITION_ID用于多个STUDY_ID,每个STUDY_ID具有其最新的STATUS,例如:
| 研究ID|位置_ID|地位|
| --|--|--|
| 123 | 1111 |准备开始~01.01.2000~开始|
| 123 | 1112 |完成日期:2000年1月2日|

quhf5bfb

quhf5bfb1#

您可以将Oracle查询重写为:

SELECT t2.study_id,
       t2.position_id,
       t1.status
FROM   TABLE2 t2
       INNER JOIN TABLE3 t3
       ON t2.STUDY_ID = t3.STUDY_ID
       INNER JOIN (
         SELECT position_id,
                code || '~' || date || '~' || comment AS status,
                ROW_NUMBER() OVER (
                  PARTITION BY position_id
                  ORDER BY
                    date DESC,
                    CASE code
                    WHEN 'READY TO BEGIN' THEN 1
                    WHEN 'IN PROGRESS'    THEN 2
                    WHEN 'FINISHED'       THEN 3
                                          ELSE 10
                    END DESC,
                    comment DESC
                ) AS rn
         FROM   TABLE1
         WHERE  code IS NOT NULL
       ) t1
       ON (t1.position_id = t2.position_id AND t1.rn = 1);

我不使用Apache Spark,但Spark SQL似乎支持CASE表达式和ROW_NUMBER分析函数,因此您可以使用该查询(或者您可能还需要将code || '~' || date || '~' || comment转换为CONCAT(code, '~', date, '~', comment))。

  • 注意:在这个版本的查询中,status永远不会是NULL,因为它包含'~'字面值,而这永远不会是NULL

相关问题