SQL Server How to group entities that have the same subset of rows in another table

sr4lhrrt  于 2023-10-15  发布在  其他
关注(0)|答案(2)|浏览(114)

I have tables Orders (Id), Lot(Id, Name) and OrdersLot (IdOrder, IdLot). An order can accept multiple lots, and the many to many relationship is done with the OrdersLot table. How can I group the Orders that share the same exact lots?

Data Example:
| IdOrder |
| ------------ |
| 1 |
| 2 |
| 3 |

IdLot
A
B
C
IdOrderIdLot
------------------------
1A
1B
2B
2C
2A
3B

Based on this data, I would need to generate a unique value that is the same for Order 1 and 3 (since they both have lot A and B), and different for Order 2.

The reason is that the logic I need to extend have a while loop that groups the orders by same properties, and then cycle in a while loop each OrderGroup doing queries only on the orders having those properties.

Any suggestions?

An option I considered is removing the OrdersLot table and saving a string with all the lots name in the Order table, but this definitely doesn't seem like the most performant way.

Maybe the solution could be generating a unique value for each subset of lot (using their ids for example)?

7jmck4yq

7jmck4yq1#

If you just need a unique value for sets that look the same, I'd be tempted to just create that from the unique set of values. If they're integer keys, then they're unique, and you could do something as simple as string_agg ing them together. Despite SQL being all about sets, doing set equivalence stuff like this is kind of a challenge.

Here's what I'd do. I created some junk data below. The first two queries are guaranteed to make some duplicates there. The last set is guaranteed to not.

The statement with the string_agg is the one your after.

drop table if exists #Orders
create table #Orders
(
    OrderID int
)

insert into #Orders (OrderID)
select top 10 row_number() over (order by (select null))
from sys.all_objects

drop table if exists #Lot
create table #Lot
(
    LotId int,
    Name nvarchar(128)
)

insert into #Lot (LotID, Name)
select top 20 row_number() over (order by (select null)), newid()
from sys.all_objects

drop table if exists #LotOrders
create table #LotOrders
(
    OrderLotID int identity(1,1),
    OrderId int,
    LotID int
)

insert into #LotOrders
(
    OrderId,
    LotID
)
-- Some guaranteed identical sets
select
    o.OrderId,
    l.LotID
from #Orders o
inner join #Lot l
    on l.LotID < 4
where o.OrderID < 3
union all
select
    o.OrderId,
    l.LotID
from #Orders o
inner join #Lot l
    on l.LotID < 6
where o.OrderID between 3 and 5
union all
-- Some unique sets
select
    o.OrderId,
    l.LotId
from #Orders o
inner join #Lot l
    on l.LotId % 9 = o.OrderID
where o.OrderID > 5
order by o.OrderID

select OrderID, GroupId = string_agg(LotId, '|') within group (order by LotId)
from #LotOrders
group by OrderId
order by GroupId
aij0ehis

aij0ehis2#

You can use STRING_AGG() to build up an ordered list of Lot IDs for each Order ID. The DENSE_RANK() function can then be used to assign distinct group IDs for each set of matching orders.

Something like:

SELECT
    DENSE_RANK() OVER(ORDER BY LotIdList) AS OrderGroupId,
    A.*
FROM (
    SELECT
        OL.IdOrder,
        STRING_AGG(OL.IdLot, ',') WITHIN GROUP(ORDER BY OL.IdLot) AS LotIdList
    FROM OrdersLot OL
    GROUP BY OL.IdOrder
) A
ORDER BY OrderGroupId, A.IdOrder

Sample results:

OrderGroupIdIdOrderLotIdList
11A,B
13A,B
22B,C

Alternately, you can just group by the constructed list as part of a larger query:

SELECT
    A.LotIdList,
    COUNT(*) AS OrderCount,
    STRING_AGG(A.IdOrder, ',') AS OrderIdList
FROM (
    SELECT
        OL.IdOrder,
        STRING_AGG(OL.IdLot, ',') WITHIN GROUP(ORDER BY OL.IdLot) AS LotIdList
    FROM OrdersLot OL
    GROUP BY OL.IdOrder
) A
GROUP BY A.LotIdList
ORDER BY A.LotIdList

Results:

LotIdListOrderCountOrderIdList
A,B21,3
B,C12

See This DB<>fiddle for a demo. (Updated for recently posted sample data.)

相关问题