We have two SQL Server tables, dbo.parent
and dbo.child
. It is a one-to-many relationship with one parent having many child elements. Each child element has a unique name and effectiveTimestamp
value.
The current working query below will take an array of known parentid's and for each find the current effective child from dbo.child
. The effective child is calculated by getting the child with the max effectiveTimestamp
that is still less than the current date/time. This query works fine.
It's possible for a parent to have only child elements with a future dated effectiveTimestamp
(greater than current date/time). In this scenario, the above query won't return anything for those parentid's. Currently we manage this by running a different, 2nd query for any parentid's that aren't returned in the first query, however I feel this is unnecessary and want to include everything under one SQL query.
Therefore I'm looking to amend the query below, so if the parentid does not have an effective child then it looks for the max effectiveTimestamp
regardless whether or not is is less than the current date/time.
SELECT
parentid,
child.id,
name,
child.effectivetimestamp
FROM
dbo.child AS child
INNER JOIN
(SELECT
Max([id]) AS id,
Max([effectivetimestamp]) AS effectiveTimestamp
FROM
dbo.child
WHERE
parentid IN (1, 2, 3, 4, 5)
AND effectivetimestamp <= GETDATE()
GROUP BY
parentid) AS effectiveVersion ON child.id = effectiveVersion.id
ANSWER
Thanks everyone for the response, I received multiple solutions that I believe would have worked. In the end I used row_number() as suggested by siggemannen. Full solution below:
SELECT parentid,
child.id,
name,
child.effectivetimestamp
FROM dbo.child AS child
INNER JOIN (SELECT *
FROM (SELECT *,
Row_number()
OVER(
partition BY parentid
ORDER BY CASE WHEN effectivetimestamp <=
Getdate(
) THEN 0
ELSE 1 END
,
effectivetimestamp DESC) AS 'rnk'
FROM dbo.child) AS c
WHERE parentid IN ( 1, 2, 3, 4, 5 )
AND rnk = 1) AS effectiveVersion
ON child.id = effectiveVersion.id
4条答案
按热度按时间zyfwsgd61#
This can be solved by window functions:
row_number
creates a column with values 1,2,3 depending on the group (partition) and order (order by).So for every parent, we want to prioritize effective dates less than getdate at first hand, and finally the latest date.
where rnk = 1
gets us the latest child per parenteuoag5mw2#
I would suggest a rewrite of the query using the
CROSS APPLY (SELECT TOP 1 ... WHERE ... ORDER BY ...)
pattern to select the active child with the most recent effective date, or latest future date, if there is no active child.The trick is setting up the
ORDER BY
. The following will first sort on the "active" status, placing active rows first. It will the sort by effectivetimestamp descending to get either the latest active or the latest overall..The above query always select the parent row and returns the parent id. The
OUTER APPLY
then looks up the most recent active child row. AnOUTER APPLY
is like aLEFT OUTER JOIN
in that if no eligible child is found, it will yield NULL values, while the parent data will remain a part of the results.Note that the above includes a reference to the parent table and requires a matching parent row to be present. To eliminate this dependency, the initial
FROM dbo.parent AS p
reference can be replaced withFROM (VALUES (1),(2),(3),(4),(5)) AS p(id)
or an existing selected ID list table if you already have one. TheWHERE
clause could then be dropped.This pattern is a common method in scenarios where you need to look up the "the best matching X" for a given criteria. General examples would be things like "effective price", "current employee salary", "latest available report" for a given date.
See this db<>fiddle for a demo.
Sample results (with a few extra columns).
| parentid | name | expected | id | name | effectivetimestamp |
| ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
| 1 | Parent-A | Child-A2 | 12 | Child-A2 | 2023-11-01 12:34 |
| 2 | Parent-B | Child-B1 | 21 | Child-B1 | 2023-02-01 08:00 |
| 3 | Parent-C | Child-C2 - latest future | 31 | Child-C2 | 2088-08-08 08:08 |
| 5 | Parent-E | Child-E2 (reverse id order) | 52 | Child-E2 | 2023-09-01 12:34 |
ekqde3dh3#
One option is to count a number of child elements first in a subquery and then apply your criteria
0ejtzxu14#
I think what you need is to use UNION operator to also get the childs that have a effectiveTimestamp > getDate
Check below query, it will output the desired result: