PostgreSQL: how to get unique results by best position

bihw5rsg  posted on 2022-11-04 in PostgreSQL

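I have a large PostgreSQL table. I need to pull rows from it grouped by the Car_id and position columns. The problem is that there are many duplicates, and I need exactly one row per car: the one with the best position. I wrote an SQL example that gives me the correct result, but it needs to be reworked. Is there a cleaner way to do this?
I need to select each unique car_id with its minimal position and, among rows with that position, the latest date, across all of the license plates passed in. I am not interested in any specific license plate.
SQL example: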

select
    "eventDate",
    "Car_id",
    min("position") as "carPosition",
    groupArray(concat(toString("scrapedAt"), '_', toString("position"))) as "scrapedAtByPosition",
    groupArray(concat("licensePlate", '_', toString("position"))) as "licensePlateByPosition",
    groupArray(concat(toString("amazonChoice"), '_', toString("position"))) as "amazonChoicesByPosition",
    'organic' as "matchType"
from "Car1_ScrapeHistoryLicensePlate"
         inner join (
             select "Car_id", max("scrapedAt") as "scrapedAt"
             from "Car1_ScrapeHistoryLicensePlate"
             where "licensePlate" IN ('ALPR912', 'JGPD831') and "eventDate" between '2022-08-12' and '2022-09-12'
             group by "Car_id", "eventDate"
         ) as t1 USING ("Car_id", "scrapedAt")
where "licensePlate" IN ('ALPR912', 'JGPD831') and "eventDate" between '2022-08-12' and '2022-09-12'
group by "eventDate", "Car_id"
order by "eventDate" desc;

Database records:

eventDate  Car_id  licensePlate position scrapedAt
---------- ------  ------------ -------  --------- 
2022-09-10,   1,   APRJSC512,    1,     1660000001
2022-09-10,   1,   APRJSC512,    1,     1660000002
2022-09-10,   1,   PLBQWN035,    1,     1660000003
2022-09-10,   1,   PLBQWN035,    1,     1660000004
2022-09-10,   1,   PLBQWN035,    2,     1660000002
2022-09-11,   2,   APRJSC512,    1,     1660000011
2022-09-11,   2,   APRJSC512,    2,     1660000022
2022-09-11,   2,   PLBQWN035,    1,     1660000033
2022-09-11,   2,   PLBQWN035,    2,     1660000044
2022-09-11,   2,   PLBQWN035,    5,     1660000022
2022-09-12,   3,   APRJSC512,    3,     1660000111
2022-09-12,   3,   PLBQWN035,    3,     1660000222
2022-09-13,   4,   PLBQWN035,    4,     1660001111
2022-09-14,   5,   PLBQWN035,    5,     1660011111

Expected result:

eventDate  Car_id  licensePlate position scrapedAt
---------- ------  ------------ -------  ---------
2022-09-10,   1,   PLBQWN035,    1,     1660000004
2022-09-11,   2,   PLBQWN035,    1,     1660000033
2022-09-12,   3,   PLBQWN035,    3,     1660000222

sshcrbum  #1

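In PostgreSQL you can use the brilliant distinct on clause.
The order by list determines which record is picked for each car_id. Rows are sorted within each group of equal car_id, and the first row of each group is returned, so ordering by position ascending and then scrapedat descending picks the lowest position and, among ties, the most recent scrape.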

select distinct on (car_id) * -- or the relevant expression list here
from the_table
order by car_id, position, scrapedat desc;
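
Applied to the table from the question, the same idea might look like the sketch below. The table and column names are copied from the question and quoted because they are mixed-case. The plate list is an assumption: it uses the plates that appear in the sample records rather than the ones in the question's query, and the date range is the one from the question.

```sql
-- Sketch only: DISTINCT ON against the question's table.
-- Assumption: the plate list below comes from the sample records.
select distinct on ("Car_id")
       "eventDate",
       "Car_id",
       "licensePlate",
       "position",
       "scrapedAt"
from "Car1_ScrapeHistoryLicensePlate"
where "licensePlate" in ('APRJSC512', 'PLBQWN035')
  and "eventDate" between '2022-08-12' and '2022-09-12'
-- The leading ORDER BY expression must match the DISTINCT ON expression.
-- The remaining keys decide which row wins within each car.
order by "Car_id", "position", "scrapedAt" desc;
```

The filter runs before the rows are deduplicated, so only plates and dates inside the requested range compete for the best position.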


hvvq6cgz  #2

select  eventDate
       ,Car_id 
       ,licensePlate 
       ,position 
       ,scrapedAt
from
(
select  *
       ,row_number() over(partition by car_id order by position, scrapedat desc) as rn
from    t
) t
where   rn = 1

| eventDate | Car_id | licensePlate | position | scrapedAt |
| --- | --- | --- | --- | --- |
| 2022-09-10 | 1 | PLBQWN035 | 1 | 1660000004 |
| 2022-09-11 | 2 | PLBQWN035 | 1 | 1660000033 |
| 2022-09-12 | 3 | PLBQWN035 | 3 | 1660000222 |
| 2022-09-13 | 4 | PLBQWN035 | 4 | 1660001111 |
| 2022-09-14 | 5 | PLBQWN035 | 5 | 1660011111 |
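
The result above contains five cars because the query has no date filter, while the expected result in the question has only three. A minimal self-contained sketch of the window-function approach with the question's date range added is shown below. The scratch table `scrape_demo` and its snake_case column names are hypothetical, introduced only so that the script can be run on its own; the rows are the sample records from the question.

```sql
-- Hypothetical scratch table holding the sample rows from the question.
create temporary table scrape_demo (
    event_date    date,
    car_id        int,
    license_plate text,
    "position"    int,
    scraped_at    bigint
);

insert into scrape_demo values
    ('2022-09-10', 1, 'APRJSC512', 1, 1660000001),
    ('2022-09-10', 1, 'APRJSC512', 1, 1660000002),
    ('2022-09-10', 1, 'PLBQWN035', 1, 1660000003),
    ('2022-09-10', 1, 'PLBQWN035', 1, 1660000004),
    ('2022-09-10', 1, 'PLBQWN035', 2, 1660000002),
    ('2022-09-11', 2, 'APRJSC512', 1, 1660000011),
    ('2022-09-11', 2, 'APRJSC512', 2, 1660000022),
    ('2022-09-11', 2, 'PLBQWN035', 1, 1660000033),
    ('2022-09-11', 2, 'PLBQWN035', 2, 1660000044),
    ('2022-09-11', 2, 'PLBQWN035', 5, 1660000022),
    ('2022-09-12', 3, 'APRJSC512', 3, 1660000111),
    ('2022-09-12', 3, 'PLBQWN035', 3, 1660000222),
    ('2022-09-13', 4, 'PLBQWN035', 4, 1660001111),
    ('2022-09-14', 5, 'PLBQWN035', 5, 1660011111);

-- Rank the rows of each car: lowest position first, newest scrape first.
select event_date, car_id, license_plate, "position", scraped_at
from (
    select *,
           row_number() over (
               partition by car_id
               order by "position", scraped_at desc
           ) as rn
    from scrape_demo
    where event_date between '2022-08-12' and '2022-09-12'
) ranked
where rn = 1
order by car_id;
```

With the date filter in place, cars 4 and 5 drop out, and the three remaining rows match the expected result in the question.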