查询建议

vql8enpb  于 2021-07-29  发布在  Java
关注(0)|答案(2)|浏览(328)

如果有人能为重写下面的查询提供一些建议,那就太好了。我有一个有200万条记录的表,我需要找出两个独立键(id和dnm)上的重复项,如果它有重复项,那么需要删除它。我为它编写了两个单独的查询,如下所示。不管怎样,有没有办法使它成为一个查询,并以更好的方式重写它,因为这个查询确实需要时间。

DELETE FROM TABLEA
WHERE ID IN (SELECT ID FROM TABLEA WHERE TYPE <> 'A'
GROUP BY ID having count(*)>1) AND TYPE <> 'A'; 

DELETE FROM TABLEA WHERE DNM IN (SELECT DNM FROM TABLEA WHERE TYPE <> 'A'
GROUP BY DNM HAVING COUNT(*) >1) AND TYPE <> 'A';
jv4diomz

jv4diomz1#

您将希望对这两列使用行号,如下所示:

WITH deduplicate AS (
SELECT
   ID AS unique_id,
   DNM AS unique_dnm,
   ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS id_row,
   ROW_NUMBER() OVER (PARTITION BY DNM ORDER BY DNM) AS dnm_row
FROM
   TABLEA
) 
SELECT
   * INTO
     TABLEB
FROM
   TABLEA
      INNER JOIN
         deduplicate
      ON ID = unique_id
      AND DNM = unique_dnm
WHERE
   id_row = 1
AND
   dnm_row = 1
;

请注意,我建议您使用内部联接创建一个新表。这将消除重复的行,而无需执行两个单独的delete语句。

kxe2p93d

kxe2p93d2#

通常,要删除重复项,您需要使用这样的查询

DELETE FROM (
        SELECT
             PK_COLS
        ,    ROW_NUMBER() OVER(PARTITION BY PK_COLS) AS RN
        FROM
           YOUR_TABLE
    )
    WHERE RN > 1

它通常比其他方法快得多

相关问题