SQL Server Check ID's in one subquery and if not returned check in another

lymgl2op  于 11个月前  发布在  其他
关注(0)|答案(4)|浏览(114)

We have two SQL Server tables, dbo.parent and dbo.child . It is a one-to-many relationship with one parent having many child elements. Each child element has a unique name and effectiveTimestamp value.

The current working query below will take an array of known parentid's and for each find the current effective child from dbo.child . The effective child is calculated by getting the child with the max effectiveTimestamp that is still less than the current date/time. This query works fine.

It's possible for a parent to have only child elements with a future dated effectiveTimestamp (greater than current date/time). In this scenario, the above query won't return anything for those parentid's. Currently we manage this by running a different, 2nd query for any parentid's that aren't returned in the first query, however I feel this is unnecessary and want to include everything under one SQL query.

Therefore I'm looking to amend the query below, so if the parentid does not have an effective child then it looks for the max effectiveTimestamp regardless whether or not is is less than the current date/time.

SELECT 
    parentid,
    child.id,
    name,
    child.effectivetimestamp
FROM
    dbo.child AS child
INNER JOIN 
    (SELECT 
         Max([id]) AS id,
         Max([effectivetimestamp]) AS effectiveTimestamp
     FROM   
         dbo.child
     WHERE  
         parentid IN (1, 2, 3, 4, 5)
         AND effectivetimestamp <= GETDATE()
     GROUP BY 
         parentid) AS effectiveVersion ON child.id = effectiveVersion.id

ANSWER

Thanks everyone for the response, I received multiple solutions that I believe would have worked. In the end I used row_number() as suggested by siggemannen. Full solution below:

SELECT parentid,
       child.id,
       name,
       child.effectivetimestamp
FROM   dbo.child AS child
       INNER JOIN (SELECT *
                   FROM   (SELECT *,
                                  Row_number()
                                    OVER(
                                      partition BY parentid
                                      ORDER BY CASE WHEN effectivetimestamp <=
                                    Getdate(
                                    ) THEN 0
                                    ELSE 1 END
                                    ,
                                    effectivetimestamp DESC) AS 'rnk'
                           FROM   dbo.child) AS c
                   WHERE  parentid IN ( 1, 2, 3, 4, 5 )
                          AND rnk = 1) AS effectiveVersion
               ON child.id = effectiveVersion.id
zyfwsgd6

zyfwsgd61#

This can be solved by window functions:

select *
from (
   select *
   , row_number() over(partition by parentid order by case when effectivetimestamp <= Getdate() then 0 else 1 end, effectivetimestamp DESC) AS rnk
   FROM child
) c
where rnk = 1

row_number creates a column with values 1,2,3 depending on the group (partition) and order (order by).

So for every parent, we want to prioritize effective dates less than getdate at first hand, and finally the latest date.

where rnk = 1 gets us the latest child per parent

euoag5mw

euoag5mw2#

I would suggest a rewrite of the query using the CROSS APPLY (SELECT TOP 1 ... WHERE ... ORDER BY ...) pattern to select the active child with the most recent effective date, or latest future date, if there is no active child.

The trick is setting up the ORDER BY . The following will first sort on the "active" status, placing active rows first. It will the sort by effectivetimestamp descending to get either the latest active or the latest overall..

SELECT 
    p.id AS parentid,
    c.id,
    c.name,
    c.effectivetimestamp
FROM dbo.parent AS p
CROSS APPLY (
    SELECT TOP 1
        c.id,
        c.name,
        c.effectivetimestamp
    FROM dbo.child AS c
    WHERE c.parentid = p.id
    ORDER BY
        CASE WHEN c.effectivetimestamp <= GETDATE() THEN 1 ELSE 2 END, -- Active first
        c.effectivetimestamp DESC -- Latest active (or latest future)
) c
WHERE p.id IN (1, 2, 3, 4, 5, 99)

The above query always select the parent row and returns the parent id. The OUTER APPLY then looks up the most recent active child row. An OUTER APPLY is like a LEFT OUTER JOIN in that if no eligible child is found, it will yield NULL values, while the parent data will remain a part of the results.

Note that the above includes a reference to the parent table and requires a matching parent row to be present. To eliminate this dependency, the initial FROM dbo.parent AS p reference can be replaced with FROM (VALUES (1),(2),(3),(4),(5)) AS p(id) or an existing selected ID list table if you already have one. The WHERE clause could then be dropped.

This pattern is a common method in scenarios where you need to look up the "the best matching X" for a given criteria. General examples would be things like "effective price", "current employee salary", "latest available report" for a given date.

See this db<>fiddle for a demo.

Sample results (with a few extra columns).
| parentid | name | expected | id | name | effectivetimestamp |
| ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
| 1 | Parent-A | Child-A2 | 12 | Child-A2 | 2023-11-01 12:34 |
| 2 | Parent-B | Child-B1 | 21 | Child-B1 | 2023-02-01 08:00 |
| 3 | Parent-C | Child-C2 - latest future | 31 | Child-C2 | 2088-08-08 08:08 |
| 5 | Parent-E | Child-E2 (reverse id order) | 52 | Child-E2 | 2023-09-01 12:34 |

ekqde3dh

ekqde3dh3#

One option is to count a number of child elements first in a subquery and then apply your criteria

select top(1) with ties parentid id,
       name,
       effectivetimestamp
from (
    select id,
           name,
           effectivetimestamp,
           count(*) over(partition by parentid) cnt
    from  dbo.child 
    where parentid IN ( 1, 2, 3, 4, 5 )     
) t
where effectivetimestamp <= Getdate() or cnt=1
order by row_number() over(partition by parentid order by effectivetimestamp desc)
0ejtzxu1

0ejtzxu14#

I think what you need is to use UNION operator to also get the childs that have a effectiveTimestamp > getDate

Check below query, it will output the desired result:

SELECT * FROM (SELECT parentid,
       child.id,
       name,
       child.effectivetimestamp
FROM   dbo.child AS child
       INNER JOIN (SELECT MAX([id])                AS id,
                          Max([effectivetimestamp]) AS effectiveTimestamp                   
                          FROM   dbo.child
                   WHERE  parentid IN ( 1, 2, 3, 4, 5 )
                          AND effectivetimestamp <= Getdate()
                   GROUP  BY parentid
                   UNION 
                   SELECT MAX([id])                 AS id,
                          Max([effectivetimestamp]) AS effectiveTimestamp        
                   FROM   dbo.child
                   WHERE  parentid not in (SELECT [parentid]                AS id                  
                          FROM   dbo.child
                   WHERE  parentid IN ( 1, 2, 3, 4, 5 )
                          AND effectivetimestamp <= Getdate() 
                   )
                            AND effectivetimestamp > Getdate()
                   group by parentid) AS effectiveVersion
               ON child.id = effectiveVersion.id)as T1

相关问题