SQL Server How to fill in records where dates have overlapped

nwsw7zdq  于 2023-10-15  发布在  其他
关注(0)|答案(1)|浏览(78)

Working with healthcare data in SQL and I'm trying to fill in records when there are overlapping dates for thousands of records. This mostly occurs when a patient is in a nursing home and then transferred to a hospital. They have not been discharged yet from the nursing home because their bed is being kept available for them, so that's why the discharge date is missing. Here's one example of what the records look like:
| Patient | Admission_Date | Discharge_Date | Location |
| ------------ | ------------ | ------------ | ------------ |
| ABC | 1/2/2021 | MISSING | NURSING HOME |
| ABC | 2/3/2021 | 2/4/2021 | ICU |
| ABC | 4/10/2021 | 4/13/2021 | HOSPITAL |

Here's what I need it to look like:

PatientAdmission_DateDischarge_DateLocation
ABC1/2/20212/3/2021NURSING HOME
ABC2/3/20212/4/2021ICU
ABC2/4/20214/10/2021NURSING HOME
ABC4/10/20214/13/2021HOSPITAL
ABC4/13/2021MISSINGNURSING HOME

Any ideas on how to get this done?

I've tried using a combination of multiple CASE WHEN statements and lag/leads but it's obviously very inefficient and difficult to account for all of the possible records between nursing home stays.

u7up0aaq

u7up0aaq1#

For the exact example from above the solution is here. I haven't tested it for multiple patients or multiple missing dates. But I can, if it's not enough.

CREATE TABLE Patient_Locations (
    Patient nvarchar(100) NOT NULL
  , Admission_Date date NOT NULL
  , Discharge_Date date NULL
  , Location nvarchar(100)
)

INSERT INTO Patient_Locations (Patient, Admission_Date, Discharge_Date, Location)
VALUES ('ABC',  '20210102', NULL, 'NURSING HOME')
     , ('ABC',  '20210203', '20210204', 'ICU')
     , ('ABC',  '20210410', '20210413', 'HOSPITAL')

;WITH Next_Known_Dates (Patient, Admission_Date, Discharge_Date, Location, NEXT_Admission_Date)
AS (
    SELECT Patient
         , Admission_Date
         , Discharge_Date
         , Location
         , NEXT_Admission_Date = LEAD(Admission_Date, 1) OVER (Partition by Patient ORDER BY Admission_Date)
    FROM Patient_Locations
)
-- First row where Discharge_Date is Missing. Replace by next known period
SELECT Patient
     , Admission_Date
     , Discharge_Date = NEXT_Admission_Date
     , Location
FROM Next_Known_Dates
WHERE Discharge_Date IS NULL
UNION ALL
-- Just normal rows
SELECT Patient
     , Admission_Date
     , Discharge_Date
     , Location
FROM Patient_Locations
WHERE Discharge_Date IS NOT NULL
UNION ALL
--Little magic. Replace missing dates with 'normal' Discharge_date and NEXT_Admission_Date
SELECT Patient_Locations.Patient
     , Next_Known_Dates.Discharge_Date
     , Next_Known_Dates.NEXT_Admission_Date
     , Patient_Locations.Location
FROM Patient_Locations
JOIN Next_Known_Dates 
  ON Next_Known_Dates.Patient = Patient_Locations.Patient
 AND Next_Known_Dates.Admission_Date > Patient_Locations.Admission_Date
WHERE Patient_Locations.Discharge_Date IS NULL
  AND Next_Known_Dates.Discharge_Date  IS NOT NULL
ORDER BY 1, 2, 3

You can try it here: https://sqlize.online/sql/mssql2022/14e363a81956b2e9019fc44daf47a8c7/

Look at this select for the Idea of a new column NEXT_Admission_Date:

SELECT Patient
     , Admission_Date
     , Discharge_Date
     , Location
     , NEXT_Admission_Date = LEAD(Admission_Date, 1) OVER (Partition by Patient ORDER BY Admission_Date)
FROM Patient_Locations

相关问题