SQL Server How to get entries where their date range conflicts

gywdnpxw  于 2023-04-19  发布在  其他
关注(0)|答案(2)|浏览(149)

This is my data:

hmy     dtStart         dtEnd
14565   2016-07-01  2021-06-30 
17237   2016-10-01  2021-06-30
552400  2021-07-01  2021-09-30 
822316  2021-10-01  2024-12-30

Here, 14565 and 17237 has a conflict as it contains the same dates in their range.

So I need to remove 14565 from my dataset, because 17237 is newer as its start date is after 14565's start date. We can also understand it as its hmy is larger.

Desired output:

hmy     dtStart         dtEnd
17237   2016-10-01  2021-06-30
552400  2021-07-01  2021-09-30 
822316  2021-10-01  2024-12-30

How can I do it?

I use SQL Server 2018

I tried:

;WITH cte AS (
  SELECT hmy, dtStart, dtEnd,
         ROW_NUMBER() OVER (PARTITION BY dtStart ORDER BY dtEnd DESC) AS rn
  FROM #mytable
)

DELETE FROM #mytable
WHERE hmy IN (SELECT hmy FROM cte WHERE rn > 1)

didn't work as expected.

tvokkenx

tvokkenx1#

We're going to check if the next dtStart > dtEnd and if not we will not show it.

select hmy  
      ,dtStart  
      ,dtEnd
from
(
select *
      ,lead(dtStart) over(order by dtStart) as ld
from   t
) t
where  dtEnd < ld or ld is null
hmydtStartdtEnd
172372016-10-01 00:00:00.0002021-06-30 00:00:00.000
5524002021-07-01 00:00:00.0002021-09-30 00:00:00.000
8223162021-10-01 00:00:00.0002024-12-30 00:00:00.000

Fiddle

v8wbuo2f

v8wbuo2f2#

First we sort data by dtStart using ROW_NUMBER() then we self join our sorted data to check if there are any overlap of date ranges using the condition new_start < existing_end AND new_end > existing_start; :

WITH cte AS (
  SELECT *,
         ROW_NUMBER() OVER (ORDER BY dtStart ASC) AS rn
  FROM #mytable
)
select c1.hmy, c1.dtStart, c1.dtEnd
from cte c1
left join cte c2 on c2.rn = c1.rn + 1 
          and c2.dtStart < c1.dtEnd 
          and c2.dtEnd >= c1.dtStart
where c2.hmy is null

To remove them then use this query :

WITH cte AS (
  SELECT *,
         ROW_NUMBER() OVER (ORDER BY dtStart ASC) AS rn
  FROM #mytable
)
delete from #mytable
where hmy in (
  select c1.hmy
  from cte c1
  left join cte c2 on c2.rn = c1.rn + 1 
          and c2.dtStart < c1.dtEnd 
          and c2.dtEnd >= c1.dtStart
  where c2.hmy is not null
)

Demo here

相关问题