PostgreSQL: duplicate records remain in the result set even with the DISTINCT keyword

Posted by kxeu7u2r on 2023-03-12 in PostgreSQL

I have a table like the following.

entity_id|effective_date|value|
A        |2023-09-09    |234  |
A        |2023-09-06    |345  |
B        |2023-09-02    |341  |
C        |2023-09-01    |347  |

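I want to find every unique entity ID together with its latest effective date and the value that belongs to that date. I am trying the following query.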

select distinct entity_id, value, max(effective_date) start_date
from refdata.investment_raw ir
where attribute_id  = 232
  and entity_id in (select invest.val as investment_id  
                    from refdata.ved soi
                    inner join refdata.ved invest
                      on soi.entity_id = invest.entity_id 
                     and current_date  between invest.start_date and invest.end_date
                     and invest.attribute_code = 'IssuerId'
                     and soi.attribute_code = 'SO'
                     and  soi.val in ('1','2')
                     and current_date between soi.start_date and soi.end_date)
group by entity_id, value

With this query, I get the following result set.

entity_id|effective_date|value|
A        |2023-09-09    |234  |
A        |2023-09-06    |345  |
B        |2023-09-02    |341  |
C        |2023-09-01    |347  |

The expected result set is:

entity_id|effective_date|value|
A        |2023-09-09    |234  |
B        |2023-09-02    |341  |
C        |2023-09-01    |347  |

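The expected result set has one row per entity ID, carrying the latest effective date and the value for that date. In the actual result set, entity ID A appears twice. When I remove distinct and value from the query and from the group by clause, I get the expected rows, but without the value column. I want one row per distinct entity ID, with its value and without duplicates. What is wrong with my query?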


nhaq1z21 #1

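It looks like you want the PostgreSQL-specific DISTINCT ON, which is different from plain DISTINCT. Plain DISTINCT only removes rows that are identical in every selected column, and because the query groups by entity_id, value, entity A produces two different rows (one per value). DISTINCT ON (entity_id) keeps only the first row for each entity_id according to the ORDER BY: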

select distinct on (entity_id) entity_id, value, effective_date as start_date
from ...
ORDER BY entity_id, effective_date desc
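
To try this without touching the real tables, here is a minimal self-contained sketch that runs DISTINCT ON over the sample rows from the question. The CTE name sample_data is hypothetical and only stands in for refdata.investment_raw with its filters applied. Note that the DISTINCT ON expression has to match the leftmost ORDER BY expression.

```sql
-- sample_data is a hypothetical stand-in for the filtered source table
WITH sample_data (entity_id, effective_date, value) AS (
    VALUES ('A', DATE '2023-09-09', 234),
           ('A', DATE '2023-09-06', 345),
           ('B', DATE '2023-09-02', 341),
           ('C', DATE '2023-09-01', 347)
)
-- For each entity_id, keep the first row under the ORDER BY,
-- which is the row with the latest effective_date.
SELECT DISTINCT ON (entity_id)
       entity_id,
       effective_date AS start_date,
       value
FROM sample_data
ORDER BY entity_id, effective_date DESC;
```

With these four rows, the query returns one row each for A (2023-09-09, 234), B (2023-09-02, 341) and C (2023-09-01, 347).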

zwghvu4y #2

WITH YOUR_TABLE_DATA(ENTITY_ID,EFFECTIVE_DATE,VALUE) AS
(
   SELECT 'A','2023-09-09'::DATE,234 UNION ALL 
   SELECT 'A','2023-09-06'::DATE,345 UNION ALL
   SELECT 'B','2023-09-02'::DATE,341 UNION ALL 
   SELECT 'C','2023-09-01'::DATE,347
)
SELECT SQ.ENTITY_ID,SQ.EFFECTIVE_DATE,SQ.VALUE
FROM
 (
    SELECT C.ENTITY_ID,C.EFFECTIVE_DATE,C.VALUE,
     ROW_NUMBER()OVER(PARTITION BY C.ENTITY_ID ORDER BY C.EFFECTIVE_DATE DESC)AS XCOL
    FROM YOUR_TABLE_DATA AS C
)SQ WHERE SQ.XCOL=1

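You can use the ROW_NUMBER filtering approach: number each entity's rows from the latest effective_date downward, then keep only the row numbered 1.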
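
One point worth checking with this approach is ties. ROW_NUMBER always yields exactly one row per entity, but if two rows share the latest date, which one is numbered 1 is unspecified unless the ORDER BY includes a tie-breaker. The sketch below adds a hypothetical duplicate-date row to the sample and assumes the higher value should win on a tie; that rule is an assumption for illustration, not something stated in the question.

```sql
WITH sample_data (entity_id, effective_date, value) AS (
    VALUES ('A', DATE '2023-09-09', 234),
           ('A', DATE '2023-09-09', 999),  -- hypothetical row tied on the latest date
           ('A', DATE '2023-09-06', 345),
           ('B', DATE '2023-09-02', 341),
           ('C', DATE '2023-09-01', 347)
)
SELECT entity_id, effective_date, value
FROM (
    SELECT s.entity_id, s.effective_date, s.value,
           ROW_NUMBER() OVER (
               PARTITION BY s.entity_id
               -- value DESC is the assumed tie-breaker; pick whatever rule fits the data
               ORDER BY s.effective_date DESC, s.value DESC
           ) AS rn
    FROM sample_data AS s
) AS ranked
WHERE rn = 1
ORDER BY entity_id;
```

If every row tied for the latest date should be returned instead, RANK() ordered by effective_date DESC alone can replace ROW_NUMBER().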


ndh0cuux #3

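First we use group by to build a list of entities with their maximum effective date, then we join that list back to the table to pick up the remaining columns: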

select t.*
from mytable t
inner join (
  select entity_id, max(effective_date) as max_effective_date
  from mytable
  group by entity_id
) as s on s.entity_id = t.entity_id and s.max_effective_date = t.effective_date
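
The same join can be tried against the question's sample rows without creating a table. This is only a sketch: sample_data and latest are hypothetical names, with sample_data standing in for mytable. Unlike DISTINCT ON or ROW_NUMBER, this form returns every matching row when an entity has several rows on its latest date.

```sql
WITH sample_data (entity_id, effective_date, value) AS (
    VALUES ('A', DATE '2023-09-09', 234),
           ('A', DATE '2023-09-06', 345),
           ('B', DATE '2023-09-02', 341),
           ('C', DATE '2023-09-01', 347)
),
-- One row per entity with its latest effective date
latest AS (
    SELECT entity_id, MAX(effective_date) AS max_effective_date
    FROM sample_data
    GROUP BY entity_id
)
-- Join back to fetch the value that belongs to that latest date
SELECT t.entity_id, t.effective_date, t.value
FROM sample_data AS t
INNER JOIN latest AS l
        ON l.entity_id = t.entity_id
       AND l.max_effective_date = t.effective_date
ORDER BY t.entity_id;
```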

