SQL Server IF/THEN Statement for NULL values

dsekswqp  于 2023-10-15  发布在  其他
关注(0)|答案(3)|浏览(150)

I am not sure how I can implement an IF/THEN statement in my code.

In the first query I'm creating a table that 1) has dengue cases, 2) concatenates all lab results, 3) creates a travel variable where if the case travelled it will pull the travel location, if they did not then it will be null. From that table, the second query is keeping the last case per day, so removing all earlier cases per day.

There are some cases that are identical in time but have two different travel values. I would like to keep all data that 1) has the most recent InsertedDateTime (this can have NULL travel values), 2) if they have identical time values, then it will keep the case that has a travel value not equal to "NULL".

Ideally, I would like to add a statement that says, "if rn=1 and travel = “NULL” then delete”. Seeing that I'm new to SQL and code predominately in SAS, I'm not sure where this should go or how it would be correctly written. Thanks in advance for not being rude and understanding I'm new (ya'll are pretty harsh). Also, I have no control over how the servers or databases are named.

Example Data:

PatientFirstName PatientLastName PatientDOB InsertedDateTime FacilityName FacilityAddressCounty EncounterDateFrom Reason eCR reported Lab results Travel

Jane Doe 20000101 2/12/2023 16:19 BHospital OldCounty 2/2/2023 8:20 Dengue Virus Infection (as a diagnosis or active problem),Fever (Temperature = 38°C [100.4°F]) NULL NULL

Jane Doe 20000101 2/12/2023 16:19 BHospital OldCounty 2/2/2023 8:20 Dengue Virus Infection (as a diagnosis or active problem),Fever (Temperature = 38°C [100.4°F]) NULL Mexico

DROP Table IF EXISTS #labs 
    SELECT ecr_FK, ObservationResultTypeDisplayName, 
     ObservationResultValueType, ObservationResultValue,        
    ObservationResultValueUnit
    INTO #labs  
    FROM ecr.ecr.dbo.Results 
    where InsertedDateTime >= '2021-01-03' and 
    ObservationResultTypeDisplayName like '%dengue%'
    CREATE NONCLUSTERED INDEX IX_Results_FK ON #labs (eCR_FK) 

    DROP Table IF EXISTS #ReportabilityRules 
    SELECT *  
    INTO #ReportabilityRules  
    FROM ecr.ecr.dbo.ReportabilityRules 
    CREATE NONCLUSTERED INDEX IX_RR_FK ON #ReportabilityRules (RR_FK) 

    DROP TABLE IF EXISTS #ALL
    select  eCR_PK, a.PatientFirstName, a.PatientLastName, a.PatientDOB, a.InsertedDateTime, 
    a.FacilityName, a.FacilityAddressCounty,a.EncounterDateFrom, s.codeDisplayName
    ,(SELECT CASE WHEN ROW_NUMBER() OVER (ORDER BY rr2.RR_fk) = 1 THEN rr2.reportabilityRules ELSE ', ' +    rr2.reportabilityRules END 
         FROM #ReportabilityRules rr2 WHERE d.RR_pk=rr2.RR_fk 
         FOR XML PATH('')) [Reason eCR reported] 
    ,(SELECT CASE WHEN ROW_NUMBER() OVER (ORDER BY l2.eCR_fk) = 1 THEN   concat(l2.ObservationResultTypeDisplayName, ' ', l2.ObservationResultValueType, ' ', l2.ObservationResultValue, ' ', l2.ObservationResultValueUnit)  ELSE ', ' + concat(l2.ObservationResultTypeDisplayName, ' ', l2.ObservationResultValueType, ' ', l2.ObservationResultValue, ' ', l2.ObservationResultValueUnit)  END 
         FROM #labs l2 WHERE l1.eCR_FK=l2.eCR_FK 
         FOR XML PATH('')) [Lab results] 
,case when s.codeDisplayName like '%travel%' then s.obsValue else NULL END [Travel]

    INTO #ALL
    from ecr.dbo.ecr a with (NOLOCK)
    left join ecr.dbo.RR d on a.eCR_PK=d.eCR_FK 
    left join #labs l1 on l1.eCR_FK=a.eCR_PK
    left join eCR.ecr.dbo.social s on s.eCR_FK=a.eCR_PK
    where ConditionDisplayName like '%dengue%' and a.InsertedDateTime >= '2021-01-03'

   select distinct PatientFirstName, PatientLastName, PatientDOB, InsertedDateTime, FacilityName,  FacilityAddressCounty, EncounterDateFrom, [Reason eCR reported], [Lab results], [Travel]
    from 
    (select *,
   ROW_NUMBER () over (partition by patientdob, cast(inserteddatetime as date)
order by inserteddatetime desc) as rn
   from #ALL
   ) as ecr_date 
   where rn=1
kmpatx3s

kmpatx3s1#

BLUF: The problem is not in the ROW_NUMBER() filter logic in the final select, but in the JOIN logic of the preceding query that feeds the #ALL table.

After a deep dive into your posted code, my understanding is that you are gathering information on who has contracted a Dengue Virus Infection and what countries have these patients recently traveled to.

I see what appears to be three independent one-to-many joins in the query that feeds the #ALL temp table - ecr.dbo.RR , #labs , and eCR.ecr.dbo.social . This is likely the primary cause of the duplicate/conflicting rows that you are attempting to resolve.

The #labs join can be eliminated outright. Its only use is already contained within the [Lab results] subquery. The ecr.dbo.RR join can also be eliminated from the outer query if you include it as a JOIN in the [Reason eCR reported] subquery.

This leaves the eCR.ecr.dbo.social table. Two columns are referenced from this table - codeDisplayName and obsValue . The codeDisplayName column is referenced in the #ALL query, but the resulting value is not used by the final select. If it were used, I might suggest a STRING_AGG() concatenation like for [Lab results] and [Reason eCR reported] . But since it is unused, I believe it can just be eliminated outright.

As for obsValue , this depends on whether or not there can be more than one record (per patient encounter) with s.codeDisplayName like '%travel%' and how you need to handle potential multiple eligible obsValue values. One option is to move the Travel calculation into a subquery like [Reason eCR reported] and [Lab results] that calculates an aggregate value (typically MAX() , SUM() , or STRING_AGG() ). Another option is to use a OUTER APPLY(SELECT TOP 1 ... ORDER BY...) to pick at most a single row and use its value.

For now, I am guessing that the travel obsValue value is a string value, possibly a country name. In that case, if there is more than one matching travel record representing multiple travel destinations, I expect we would want to use STRING_AGG() to get them all.

The rewritten #ALL query might be like the following:

select
    eCR_PK, a.PatientFirstName, a.PatientLastName, a.PatientDOB, a.InsertedDateTime, 
    a.FacilityName, a.FacilityAddressCounty,a.EncounterDateFrom, s.codeDisplayName
    ,(SELECT STRING_AGG(rr2.reportabilityRules, ', ')
             WITHIN GROUP(ORDER BY rr2.RR_fk) -- Is there a better order by value?
         FROM ecr.dbo.RR d 
         JOIN #ReportabilityRules rr2 ON d.RR_pk = rr2.RR_fk 
         WHERE on a.eCR_PK = d.eCR_FK
         ) [Reason eCR reported] 
    ,(SELECT STRING_AGG(
             concat(
                l2.ObservationResultTypeDisplayName,
                ' ', l2.ObservationResultValueType,
                ' ', l2.ObservationResultValue,
                ' ', l2.ObservationResultValueUnit)
             , ', ')
             WITHIN GROUP(ORDER BY l2.eCR_fk) -- Is there a better order by value?
         FROM #labs l2
         WHERE l1.eCR_FK = l2.eCR_FK
         ) [Lab results] 
    ,(SELECT STRING_AGG(s.obsValue, ', ')
             WITHIN GROUP(ORDER BY s.???) -- What is the best order by value?
         FROM eCR.ecr.dbo.social s
         WHERE s.eCR_FK = a.eCR_PK
         AND s.codeDisplayName like '%travel%'
         ) [Travel]
INTO #ALL
from ecr.dbo.ecr a with (NOLOCK)
where ConditionDisplayName like '%dengue%'
and a.InsertedDateTime >= '2021-01-03'

I've eliminated all of the one-to-many joins from the outer query, converted the FOR XML subqueries to use STRING_AGG() , and added a few newlines for easier reading.

The above is not tested (we have no posted sample data) and might need some tweaks. In particular, the STRING_AGG() ... ORDER BY clauses need to be reviewed and appropriate sort column referenced selected.

As for the final select, I believe you shoudl be able to eliminate the DISTINCT . I do have a concern about the ROW_NUMBER() ...partition by ... . Since patientdob is not necessarily unique to the patient, I would suggest also including PatientFirstName and PatientLastName as a minimum. Even that is not guaranteed. If you can access a patient ID in your source data, using that would be more reliable.

Since eCR.ecr.dbo.social is a remote table reference, there may be a need to preselect that data (within the relevant date range) and pull it to a local temp table to avoid excessive cross-server accesses.

If you have further clarifications on your requirements, add them to your original post and/or add a comment below and I will look at updating this answer.

tzxcd3kk

tzxcd3kk2#

CASE WHEN expressions can be used to do inline column level "if this then that" type of logic.

Example

"if rn=1 and travel = “NULL” then delete”.

Would be

CASE
      WHEN rn = 1 and travel IS NULL THEN 'delete'
 END

Now I don't understand what you're actually asking is but I think this will help you get there. If you're trying to remove these rows from the table #ALL, then I think it might be best to do your ROW_NUMBER() in the previous query and filter out by these two conditions you have.

lsmepo6l

lsmepo6l3#

It's really hard to be more specific without example DDL/DML here's an example of that:

ECLARE @Table TABLE (TravelDateTime DATETIME, TravelCity NVARCHAR(50), TravelCost DECIMAL(10,2));
INSERT INTO @Table (TravelDateTime, TravelCity, TravelCost) VALUES
('2023-01-01 00:05', NULL, NULL), ('2023-01-01 00:10', 'London',   NULL), ('2023-01-01 00:15', 'London',   123.45), ('2023-01-01 00:20', 'London',   678.90),
('2023-01-02 00:05', NULL, NULL), ('2023-01-02 00:10', 'New York', NULL), ('2023-01-02 00:15', 'New York', 123.45), ('2023-01-02 00:20', 'New York', 678.90);

I realize this simplifies your question down a lot, but as an example of what I think you're shooting for:

SELECT TravelDateTime, TravelCity, TravelCost   
  FROM (
        SELECT TravelDateTime, TravelCity, TravelCost, ROW_NUMBER() OVER (PARTITION BY CAST(TravelDateTime AS DATE) ORDER BY CASE WHEN TravelCity IS NULL OR TravelCost IS NULL THEN CAST('9999-12-31 23:59:59.997' AS DATETIME) ELSE TravelDateTime END) AS RowNumber
          FROM @Table
       ) a
 WHERE RowNumber = 1;

Here we're assigning a Row Number for each row, partition by the date and then ordered by the date and time, unless one, or both, of the columns we care about are null, in which case we treat it as the end of time. Then the outer query only returns the earliest datetime for that date, that has non-nulls in both columns.

TravelDateTimeTravelCityTravelCost
2023-01-01 00:15:00.000London123.45
2023-01-02 00:15:00.000New York123.45

Example using OPs example data (assumed data types) provided in the comments (assuming partition on first, last and DOB):

DECLARE @Table TABLE (PatientFirstName NVARCHAR(50), PatientLastName NVARCHAR(50), PatientDOB DATE, InsertedDateTime DATETIME, FacilityName NVARCHAR(50), FacilityAddressCounty NVARCHAR(50), 
                      EncounterDateFrom DATETIME, [Reason eCR reported Lab] NVARCHAR(500), results NVARCHAR(50), Travel  NVARCHAR(50)
)
INSERT INTO @Table (PatientFirstName, PatientLastName, PatientDOB, InsertedDateTime, FacilityName, FacilityAddressCounty, EncounterDateFrom, [Reason eCR reported Lab], results, Travel) VALUES
('Jane', 'Doe', '20000101', '2/12/2023 16:19', 'BHospital', 'OldCounty', '2/2/2023 8:20', 'Dengue Virus Infection (as a diagnosis or active problem),Fever (Temperature = 38°C [100.4°F])', NULL, NULL), 
('Jane', 'Doe', '20000101', '2/12/2023 16:19', 'BHospital', 'OldCounty', '2/2/2023 8:20', 'Dengue Virus Infection (as a diagnosis or active problem),Fever (Temperature = 38°C [100.4°F])', NULL, 'Mexico');

SELECT PatientFirstName, PatientLastName, PatientDOB, InsertedDateTime, FacilityName, FacilityAddressCounty, EncounterDateFrom, [Reason eCR reported Lab], results, Travel
  FROM (
        SELECT PatientFirstName, PatientLastName, PatientDOB, InsertedDateTime, FacilityName, FacilityAddressCounty, EncounterDateFrom, [Reason eCR reported Lab], results, Travel, 
               ROW_NUMBER() OVER (PARTITION BY PatientFirstName, PatientLastName, PatientDOB ORDER BY CASE WHEN Travel IS NULL THEN '9999-12-31 23:59:59.997' ELSE InsertedDateTime END) AS RowNumber
          FROM @Table
       ) a
 WHERE RowNumber = 1;
PatientFirstNamePatientLastNamePatientDOBInsertedDateTimeFacilityNameFacilityAddressCountyEncounterDateFromReason eCR reported LabresultsTravel
JaneDoe2000-01-012023-02-12 16:19:00.000BHospitalOldCounty2023-02-02 08:20:00.000Dengue Virus Infection (as a diagnosis or active problem),Fever (Temperature = 38°C [100.4°F])Mexico

相关问题