SQL Server Conditional preceding values

k10s72fa  于 2023-05-05  发布在  其他
关注(0)|答案(3)|浏览(118)

My sample data is like this

drop 
  table if exists #temp
select 
  * into #temp
from 
  (
    values 
      ('id100', 'status1', 1), 
      ('id100', 'status2', 2), 
      ('id100', 'status1', 3), 
      ('id100', 'status0', 4), 
      ('id100', 'status2', 5), 
      ('id100', 'status2', 6), 
      ('id100', 'status1', 7), 
      ('id100', 'status1', 8), 
      ('id100', 'status2', 9), 
      ('id101', 'status1', 10), 
      ('id101', 'status2', 11)
  ) t(id, status, rowNum)

I need TSQL to return immediately preceding rowNum for each id with 'status2' where the status='status1'. I hope the code to return this

idstatusrowNumvalue
id100status11
id100status221
id100status13
id100status04
id100status253
id100status263
id100status17
id100status18
id100status298
id101status110
id101status21110

I tried this which did not work

SELECT 
  t1.id, 
  t1.status, 
  t1.rowNum, 
  (
    select 
      MIN(t2.rowNum) 
    from 
      #temp t2 
    where 
      t2.id = t1.id 
      and t2.rowNum < t1.rowNum 
      and t1.status = 'status2'
  ) as test 
from 
  #temp t1
wribegjk

wribegjk1#

You can use subquery within a case statement to do so:

Query:

select *, CASE WHEN status='status2' then (select max(rowNum) from #temp tmp 
    where tmp.rowNum<t.rowNum and tmp.status='status1') end value
  from #temp t
order by rowNum

Output:

idstatusrowNumvalue
id100status11null
id100status221
id100status13null
id100status04null
id100status253
id100status263
id100status17null
id100status18null
id100status298
id101status110null
id101status21110

fiddle

You can also use last_value() window function or lag() window function instead of subquery:

Query (with last_value()over()):

select *, CASE WHEN status='status2' then 
    (last_value(case when status='status1' then rowNum else null end) ignore nulls over(order by rowNum)) end value
  from #temp t
order by rowNum

Output:

idstatusrowNumvalue
id100status11null
id100status221
id100status13null
id100status04null
id100status253
id100status263
id100status17null
id100status18null
id100status298
id101status110null
id101status21110

Query (with lag()over()):

select *, CASE WHEN status='status2' then 
    (lag(case when status='status1' then rowNum else null end) ignore nulls over(order by rowNum)) end value
  from #temp t
order by rowNum

Output:

idstatusrowNumvalue
id100status11null
id100status221
id100status13null
id100status04null
id100status253
id100status263
id100status17null
id100status18null
id100status298
id101status110null
id101status21110

fiddle

u5i3ibmn

u5i3ibmn2#

You can use a simple correlated query:

select *,
  case when status = 'status2' then (
    select top(1) Cast(rowNum as varchar(2))
    from #temp t2 
    where t2.id = t.id and t2.status= 'status1' and t2.rowNum < t.rowNum
    order by rowNum desc
  )
  else '' end [Value]
from #temp t
order by id, rowNum;

Demo Fiddle

This simply follow the specification to select the first qualifying row based on the predicates for Status.

In order to meet your desired results it's cast to type varchar, if you are happy with NULL instead you can omit the cast and the else part.

This would be optimally helped with a clustered index on Id and rowNum.

l7mqbcuq

l7mqbcuq3#

If you are lucky enough to be running SQL Server 2022, we can just use last_value() with the ignore nulls option - as demonstrated in the answer of Kazi Mohammad Ali Nur .

In earlier versions, where the option is not available, one approach uses some gaps-and-island technique here.

select t.*,
    case when status = 'status2' then
        max(case when status = 'status1' then rownum end) 
            over(partition by id, grp order by rownum) 
    end value
from (
    select t.*,
        sum(case when status = 'status1' then 1 else 0 end) 
            over(partition by id order by rownum) grp
    from #temp t
) t
order by id, rownum

Islands start with a status 2; we can identify them with a conditional window sum; then, every time a status 2 is met within the island, we retrieve the row num of the corresponding status 1.

An alternative is apply - although it might be less efficient, since it requires re-opening the table in the subquery:

select t.*, x.*
from #temp t
outer apply (
    select max(x.rownum) value
    from #temp x
    where x.id = t.id
      and x.rownum < t.rownum
      and x.status = 'status1'
      and t.status = 'status2'
) x
order by t.id, t.rownum

相关问题