SQL Server 根据共享列值和连续日期范围计算等级(具有连续范围的记录具有相同等级)

4ioopgfo 于 2022-12-10 发布在其他

关注(0)|答案(2)|浏览(127)

I am trying to get the rank of a table that has specific id's and a start and end date for each record, as such:
| id1 | id2 | flag | startdate | enddate |
| ------------ | ------------ | ------------ | ------------ | ------------ |
| 1 | 1 | y | 2007-01-10 | 2007-02-12 |
| 1 | 1 | y | 2007-02-13 | 2007-08-04 |
| 1 | 1 | y | 2007-08-05 | 2008-10-04 |
| 1 | 1 | n | 2008-10-05 | 2008-11-14 |
| 1 | 1 | n | 2008-11-15 | 2008-12-02 |
| 1 | 1 | n | 2008-12-08 | 2008-12-20 |
| 2 | 2 | y | 2012-01-10 | 2012-02-12 |
| 2 | 2 | y | 2012-02-13 | 2012-08-04 |
| 2 | 3 | y | 2012-01-10 | 2012-02-14 |
| 2 | 4 | y | 2012-08-14 | 2013-01-10 |
| 2 | 4 | y | 2013-01-15 | 2013-01-26 |
| 2 | 4 | y | 2013-01-27 | 2013-02-04 |
| 2 | 4 | n | 2016-03-14 | 2016-04-12 |
Where I essentially want to give the same count value to all records which share the same id1, id2, and flag, and are consecutive in their dates. Consecutive, meaning the start date of one record is equal to the end date of the previous record + 1 day. The desired output should look like:
| id1 | id2 | flag | startdate | enddate | rank_t |
| ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
| 1 | 1 | y | 2007-01-10 | 2007-02-12 | 1 |
| 1 | 1 | y | 2007-02-13 | 2007-08-04 | 1 |
| 1 | 1 | y | 2007-08-05 | 2008-10-04 | 1 |
| 1 | 1 | n | 2008-10-05 | 2008-11-14 | 2 |
| 1 | 1 | n | 2008-11-15 | 2008-12-02 | 2 |
| 1 | 1 | n | 2008-12-08 | 2008-12-20 | 3 |
| 2 | 2 | y | 2012-01-10 | 2012-02-12 | 4 |
| 2 | 2 | y | 2012-02-13 | 2012-08-04 | 4 |
| 2 | 3 | y | 2012-01-10 | 2012-02-14 | 5 |
| 2 | 4 | y | 2012-08-14 | 2013-01-10 | 6 |
| 2 | 4 | y | 2013-01-15 | 2013-01-26 | 7 |
| 2 | 4 | y | 2013-01-27 | 2013-02-04 | 7 |
| 2 | 4 | n | 2016-03-14 | 2016-04-12 | 8 |
The output or rank does not have to be in that exact order, but the idea is still the same. Records which share the same id1, id2, and flag, and are consecutive in their dates should all have the same rank. And that rank value should not be used again for any other 'group' of records.
Here is the code to generate a temp table with this structure:

if object_id('tempdb..#temp1') is not null drop table #temp1
CREATE TABLE #temp1 (id1 INT, id2 int, flag varchar(10), startdate DATETIME, enddate DATETIME)
INSERT INTO #temp1 values
(1, 1, 'y', '2007-01-10', '2007-02-12'),
(1, 1, 'y', '2007-02-13', '2007-08-04'),
(1, 1,'y', '2007-08-05', '2008-10-04'),
(1, 1,'n', '2008-10-05', '2008-11-14'),
(1, 1,'n', '2008-11-15', '2008-12-02'),
(1, 1,'n', '2008-12-08', '2008-12-20'),
(2, 2,'y', '2012-01-10', '2012-02-12'),
(2, 2,'y', '2012-02-13', '2012-08-04'),
(2, 3,'y', '2012-01-10', '2012-02-14'),
(2, 4,'y', '2012-08-14', '2013-01-10'),
(2, 4,'y', '2013-01-15', '2013-01-26'),
(2, 4,'y', '2013-01-27', '2013-02-04'),
(2, 4,'n', '2016-03-14', '2016-04-12')

Thanks in advance for any help.

sql-server

来源：https://stackoverflow.com/questions/74737709/calculate-rank-based-on-shared-column-values-and-consecutive-date-ranges-same-r

2条答案

按热度按时间

hfwmuf9z1#

与现有答案的逻辑相同......只是作为2个CTE完成（我发现这比CTE+子查询的组合更清楚）。

with cte1 as (
    select *
        --  Identify if there is a gap between the current startdate and the previous enddate
        , case when lag(enddate,1,dateadd(day,-1,startdate)) over (partition by id1, id2, flag order by startdate asc) = dateadd(day,-1,startdate) then 0 else 1 end DateGap
    from #temp1
), cte2 as (
    select *
        -- Sum every time a gap is detected to generate a new partition
        , sum(DateGap) over (order by startdate asc) DateGapSum
    from cte1
)
select id1, id2, flag, startdate, enddate
    -- Use dense_rank to generate the ranking where ties are allocated the same value
    , dense_rank() over (order by id1 asc, id2 asc, flag desc, DateGapSum asc) rank_t
from cte2
order by id1, id2, startdate;

赞(0）回复(0）举报 2022-12-10

5kgi1eie2#

Try the following:

WITH T AS 
(
  SELECT *,
    IIF(
        DATEDIFF(DAY, 
                 LAG(enddate, 1, DATEADD(DAY, -1, startdate)) OVER (PARTITION BY id1, id2 ORDER BY enddate),
                 startdate)=1, 
        0, 1) AS chk
  FROM #temp1
)
SELECT id1, id2, flag, startdate, enddate, 
  DENSE_RANK() OVER (ORDER BY id1, id2, flag DESC, grp) AS rank_t
FROM
(
  SELECT *, 
  SUM(CHK) OVER (PARTITION BY id1, id2 ORDER BY enddate) AS grp 
  FROM T
) Groups
ORDER BY id1, id2, startdate

See a demo .
In the CTE T, we check if the date difference between the start date and the previous end date is equal to one, and set a value of zero if that condition is met and a value of one if that condition is not met.
Now, using a running sum of the chk value populated in the CTE, we can define unique groups for the consecutive rows.
At the end, we used DENSE_RANK function using the order (id1, id2, flag DESC, grp) to generate the required ranks.

赞(0）回复(0）举报 2022-12-10

我来回答

SQL Server 根据共享列值和连续日期范围计算等级(具有连续范围的记录具有相同等级)

2条答案

相关问题

热门标签

最新问答