SQL Query that returns all missing months for date ranges within a year (Microsoft SQL Server)

3htmauhk  于 2023-02-28  发布在  SQL Server
关注(0)|答案(1)|浏览(144)

An employee is hired for different departments. For example, they could be working in Marketing and be paid by Department 0001 and also Department 0002 depending on the hours they do. For this question, hours doesn't matter as I truly only need to find the date ranges when the employee is NOT hired. The below two employees are examples of those.
| Employee_ID | Department_ID | Start_Date | End_Date |
| ------------ | ------------ | ------------ | ------------ |
| 395961 | 0001 | 01/01/2022 | 03/31/2022 |
| 395961 | 0002 | 01/01/2022 | 01/31/2022 |
| 395961 | 0001 | 06/01/2022 | 10/31/2022 |
| 395961 | 0001 | 11/01/2022 | 12/31/2022 |
| 395962 | 0001 | 01/01/2022 | 03/31/2022 |
| 395962 | 0002 | 01/01/2022 | 01/31/2022 |
| 395962 | 0002 | 02/01/2022 | 04/30/2022 |
| 395962 | 0001 | 06/01/2022 | 10/31/2022 |
| 395962 | 0001 | 11/01/2022 | 12/31/2022 |
| 395963 | 0001 | 01/01/2022 | 05/31/2022 |
| 395963 | 0002 | 06/01/2022 | 12/31/2022 |

Employee_ID 395961 and 395962 should return for the query as both have date range gaps. 395961 does not have a date range from 04/01/2022 until 05/31/2022. 395962 does not have a date range from 05/01/2022 until 05/31/2022. 395963 does NOT have a date range gap.

I had in mind to return the months the employee had date ranges and group them by month as we can get duplicate returns. For example, Employee_ID 395961 would return Jan, Feb, March, Jan, Jun, Jul, Aug, Sep, Oct, Nov, Dec and when we group the months, it would remove the duplicate Jan. Then we check which months are missing and if anything returns, we should consider to return this Employee_ID as result.

I truly have NOT figured out how to get a correct result, but I've done some research to find the below that could help. I wouldn't want to use variables as I want this query applied to 1000's of records.

SELECT  DATENAME(MONTH, DATEADD(MONTH, x.number, @Start_Date)) AS MonthName
FROM    master.dbo.spt_values x
WHERE   x.type = 'P'        
AND     x.number <= DATEDIFF(MONTH, @Start_Date, @End_Date)

Edit: The database is Microsoft SQL Server 2017.

wljmcqd8

wljmcqd81#

I think this should give you what you are looking for:

WITH cal (date) AS (
    SELECT @Start_Date
    UNION ALL
    SELECT DATEADD(day, 1, date)
    FROM cal
    WHERE date < @End_Date
)
SELECT e.Employee_ID, COUNT(cal.date) AS Days_Not_Hired
FROM cal
CROSS JOIN (SELECT DISTINCT Employee_ID FROM employees) e
LEFT JOIN employees e2
    ON cal.date BETWEEN e2.Start_Date AND e2.End_Date
    AND e.Employee_ID = e2.Employee_ID
WHERE e2.Employee_ID IS NULL
GROUP BY e.Employee_ID
OPTION(MAXRECURSION 365);

db<>fiddle

This is targeted at SQL Server. It may need tweaking for other dialects.

To run for all employees for all time you could do:

WITH cal (date) AS (
    SELECT MIN(Start_Date) FROM employees
    UNION ALL
    SELECT DATEADD(day, 1, date)
    FROM cal
    WHERE date < CAST( GETDATE() AS DATE )
)
SELECT e.*, COUNT(cal.date) AS Days_Not_Hired
FROM cal
JOIN (
    SELECT
        Employee_ID,
        MIN(Start_Date) AS Min_Start_Date,
        MAX(End_Date) AS Max_End_Date
    FROM employees
    GROUP BY Employee_ID
) e
    ON cal.date BETWEEN e.Min_Start_Date AND e.Max_End_Date
LEFT JOIN employees e2
    ON cal.date BETWEEN e2.Start_Date AND e2.End_Date
    AND e.Employee_ID = e2.Employee_ID
WHERE e2.Employee_ID IS NULL
GROUP BY e.Employee_ID, e.Min_Start_Date, e.Max_End_Date
OPTION(MAXRECURSION 0);

db<>fiddle

I very rarely use SQL Server so I have no idea how its optimiser is going to deal with this query.

相关问题