I've got a table where the value in each row can be one of two values, eg 0 or 1. Each row also has an ID
identity column.
What's the most efficient way to query the table to return the min and max ID
values for each contiguous group of values?
Example:
declare @tbl table (ID INT IDENTITY(1,1), Value INT);
insert into @tbl (Value)
values (1), (1), (1), (0), (0), (1), (0), (0), (1), (1), (1), (1);
Table contents:
ID Value
1 1
2 1
3 1
4 0
5 0
6 1
7 0
8 0
9 1
10 1
11 1
12 1
What I'd like to output from the query:
GroupID Value MinID MaxID
1 1 1 3
2 0 4 5
3 1 6 6
4 0 7 8
5 1 9 12
I've worked out a query to return these results. However, it's inefficient as it has to run over the table 4x, twice to get the MinID
values and twice again for the MaxID
values. As the real table I'll be running this query over has over 10 million rows, I'm looking for a more efficient query.
Here's the query I've worked out:
declare @maxID INT;
select @maxID = MAX(ID)
from @tbl;
select min.RowID AS GroupID, min.Value, min.MinID, max.MaxID
from
(
select
ROW_NUMBER() OVER (ORDER BY x.ID) AS RowID,
x.Value, x.ID as MinID
from @tbl x
left join @tbl ld on x.ID = ld.ID + 1
and x.Value <> ld.Value
where x.ID = 1
or ld.ID is not null
) min
join
(
select
ROW_NUMBER() OVER (ORDER BY x.ID) AS RowID,
x.Value, x.ID as MaxID
from @tbl x
left join @tbl trl on x.ID = trl.ID - 1
and x.Value <> trl.Value
where trl.ID is not null
or x.ID = @maxID
) max
on min.RowID = max.RowID
order by GroupID;
Is there a better way to achieve the same results?
1条答案
按热度按时间vsikbqxv1#
Try the following which uses lag to first identify where each group of values changes, then a running sum to create a sequential value for each group; with that in place you can then simply aggregate:
fiddle Demo