SQL Server Check ID's in one subquery and if not returned check in another

lymgl2op  于 2024-01-05  发布在  其他
关注(0)|答案(4)|浏览(118)

We have two SQL Server tables, dbo.parent and dbo.child . It is a one-to-many relationship with one parent having many child elements. Each child element has a unique name and effectiveTimestamp value.

The current working query below will take an array of known parentid's and for each find the current effective child from dbo.child . The effective child is calculated by getting the child with the max effectiveTimestamp that is still less than the current date/time. This query works fine.

It's possible for a parent to have only child elements with a future dated effectiveTimestamp (greater than current date/time). In this scenario, the above query won't return anything for those parentid's. Currently we manage this by running a different, 2nd query for any parentid's that aren't returned in the first query, however I feel this is unnecessary and want to include everything under one SQL query.

Therefore I'm looking to amend the query below, so if the parentid does not have an effective child then it looks for the max effectiveTimestamp regardless whether or not is is less than the current date/time.

  1. SELECT
  2. parentid,
  3. child.id,
  4. name,
  5. child.effectivetimestamp
  6. FROM
  7. dbo.child AS child
  8. INNER JOIN
  9. (SELECT
  10. Max([id]) AS id,
  11. Max([effectivetimestamp]) AS effectiveTimestamp
  12. FROM
  13. dbo.child
  14. WHERE
  15. parentid IN (1, 2, 3, 4, 5)
  16. AND effectivetimestamp <= GETDATE()
  17. GROUP BY
  18. parentid) AS effectiveVersion ON child.id = effectiveVersion.id

ANSWER

Thanks everyone for the response, I received multiple solutions that I believe would have worked. In the end I used row_number() as suggested by siggemannen. Full solution below:

  1. SELECT parentid,
  2. child.id,
  3. name,
  4. child.effectivetimestamp
  5. FROM dbo.child AS child
  6. INNER JOIN (SELECT *
  7. FROM (SELECT *,
  8. Row_number()
  9. OVER(
  10. partition BY parentid
  11. ORDER BY CASE WHEN effectivetimestamp <=
  12. Getdate(
  13. ) THEN 0
  14. ELSE 1 END
  15. ,
  16. effectivetimestamp DESC) AS 'rnk'
  17. FROM dbo.child) AS c
  18. WHERE parentid IN ( 1, 2, 3, 4, 5 )
  19. AND rnk = 1) AS effectiveVersion
  20. ON child.id = effectiveVersion.id
zyfwsgd6

zyfwsgd61#

This can be solved by window functions:

  1. select *
  2. from (
  3. select *
  4. , row_number() over(partition by parentid order by case when effectivetimestamp <= Getdate() then 0 else 1 end, effectivetimestamp DESC) AS rnk
  5. FROM child
  6. ) c
  7. where rnk = 1

row_number creates a column with values 1,2,3 depending on the group (partition) and order (order by).

So for every parent, we want to prioritize effective dates less than getdate at first hand, and finally the latest date.

where rnk = 1 gets us the latest child per parent

euoag5mw

euoag5mw2#

I would suggest a rewrite of the query using the CROSS APPLY (SELECT TOP 1 ... WHERE ... ORDER BY ...) pattern to select the active child with the most recent effective date, or latest future date, if there is no active child.

The trick is setting up the ORDER BY . The following will first sort on the "active" status, placing active rows first. It will the sort by effectivetimestamp descending to get either the latest active or the latest overall..

  1. SELECT
  2. p.id AS parentid,
  3. c.id,
  4. c.name,
  5. c.effectivetimestamp
  6. FROM dbo.parent AS p
  7. CROSS APPLY (
  8. SELECT TOP 1
  9. c.id,
  10. c.name,
  11. c.effectivetimestamp
  12. FROM dbo.child AS c
  13. WHERE c.parentid = p.id
  14. ORDER BY
  15. CASE WHEN c.effectivetimestamp <= GETDATE() THEN 1 ELSE 2 END, -- Active first
  16. c.effectivetimestamp DESC -- Latest active (or latest future)
  17. ) c
  18. WHERE p.id IN (1, 2, 3, 4, 5, 99)

The above query always select the parent row and returns the parent id. The OUTER APPLY then looks up the most recent active child row. An OUTER APPLY is like a LEFT OUTER JOIN in that if no eligible child is found, it will yield NULL values, while the parent data will remain a part of the results.

Note that the above includes a reference to the parent table and requires a matching parent row to be present. To eliminate this dependency, the initial FROM dbo.parent AS p reference can be replaced with FROM (VALUES (1),(2),(3),(4),(5)) AS p(id) or an existing selected ID list table if you already have one. The WHERE clause could then be dropped.

This pattern is a common method in scenarios where you need to look up the "the best matching X" for a given criteria. General examples would be things like "effective price", "current employee salary", "latest available report" for a given date.

See this db<>fiddle for a demo.

Sample results (with a few extra columns).
| parentid | name | expected | id | name | effectivetimestamp |
| ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
| 1 | Parent-A | Child-A2 | 12 | Child-A2 | 2023-11-01 12:34 |
| 2 | Parent-B | Child-B1 | 21 | Child-B1 | 2023-02-01 08:00 |
| 3 | Parent-C | Child-C2 - latest future | 31 | Child-C2 | 2088-08-08 08:08 |
| 5 | Parent-E | Child-E2 (reverse id order) | 52 | Child-E2 | 2023-09-01 12:34 |

展开查看全部
ekqde3dh

ekqde3dh3#

One option is to count a number of child elements first in a subquery and then apply your criteria

  1. select top(1) with ties parentid id,
  2. name,
  3. effectivetimestamp
  4. from (
  5. select id,
  6. name,
  7. effectivetimestamp,
  8. count(*) over(partition by parentid) cnt
  9. from dbo.child
  10. where parentid IN ( 1, 2, 3, 4, 5 )
  11. ) t
  12. where effectivetimestamp <= Getdate() or cnt=1
  13. order by row_number() over(partition by parentid order by effectivetimestamp desc)
0ejtzxu1

0ejtzxu14#

I think what you need is to use UNION operator to also get the childs that have a effectiveTimestamp > getDate

Check below query, it will output the desired result:

  1. SELECT * FROM (SELECT parentid,
  2. child.id,
  3. name,
  4. child.effectivetimestamp
  5. FROM dbo.child AS child
  6. INNER JOIN (SELECT MAX([id]) AS id,
  7. Max([effectivetimestamp]) AS effectiveTimestamp
  8. FROM dbo.child
  9. WHERE parentid IN ( 1, 2, 3, 4, 5 )
  10. AND effectivetimestamp <= Getdate()
  11. GROUP BY parentid
  12. UNION
  13. SELECT MAX([id]) AS id,
  14. Max([effectivetimestamp]) AS effectiveTimestamp
  15. FROM dbo.child
  16. WHERE parentid not in (SELECT [parentid] AS id
  17. FROM dbo.child
  18. WHERE parentid IN ( 1, 2, 3, 4, 5 )
  19. AND effectivetimestamp <= Getdate()
  20. )
  21. AND effectivetimestamp > Getdate()
  22. group by parentid) AS effectiveVersion
  23. ON child.id = effectiveVersion.id)as T1
展开查看全部

相关问题