I have a table of employee assignment START_DATEs and STOP_DATEs. Employees only have one assignment at a time, so after one assignment stops another begins eventually. Assignments that are still in progress have a STOP_DATE represented as the minimum datetime, '1753-01-01'. Something like this:
| EMPLOYEE | START_DATE | STOP_DATE |
| ------------ | ------------ | ------------ |
| 51 | 2023-07-05 | 2023-07-06 |
| 51 | 2023-07-07 | 2023-07-10 |
| 51 | 2023-07-11 | 1753-01-01 |
| 32 | 2023-07-04 | 2023-07-06 |
| 32 | 2023-07-06 | 2023-07-07 |
| 32 | 2023-07-11 | 2023-07-12 |
I'm trying to filter the table to show me all the latest assignments. In other words, I'm trying to get all employee assignments that are currently in progress (STOP_DATE '1753-01-01'), or, if they don't have any assignments that are currently in progress, to get the assignment with the latest STOP_DATE:
EMPLOYEE | STOP_DATE |
---|---|
51 | 1753-01-01 |
32 | 2023-07-12 |
I can use a ROW_NUMBER/PARTITION BY to isolate the latest stop date, but figuring out how to ignore an employee's latest stop date if they have a stop date of 1753-01-01 somewhere is giving me trouble.
4条答案
按热度按时间4sup72z81#
I think just a simple
ROW_NUMBER()
should do it.rryofs0p2#
Windowed functions to the rescue!
Here we're using MIN and MAX windowed functions to find the first and last stop dates for each employee. Since we can't reference them in the where clause, we're using them as a subquery, and filtering on the outer one.
1hdlvixo3#
It's a bit meaningless to show a date in 1753 as indicating something is in progress - just a NULL here would be preferable since it doesn't have a stop date yet.
Based on your description it would make more sense to just aggregate as follows and just indicate it's actually in progress:
w7t8yxp54#
This should be safe and sufficiently fast, assuming you have an index on
(Employee, StopDate)
: