I have a table like this:
CREATE TABLE Rates
(
RateGroup int NOT NULL,
Rate decimal(5, 2) NOT NULL,
DueDate date NOT NULL
);
This table contains rates which are valid from a certain due date to the day before the next due date. If no next due date is present, the rate's validity has no end. There can be multiple consecutive due days with the same rate and a certain can appear on different non consecutive due days as well.
The rates are divided into groups. A single due date can appear in multiple groups but only once per group.
Here's some example data:
INSERT INTO Rates(RateGroup, Rate, DueDate)
VALUES
(1, 1.2, '20210101'), (1, 1.2, '20210215'), (1, 1.5, '20210216'),
(1, 1.2, '20210501'), (2, 3.7, '20210101'), (2, 3.7, '20210215'),
(2, 3.7, '20210216'), (2, 3.7, '20210501'), (3, 2.9, '20210101'),
(3, 2.5, '20210215'), (3, 2.5, '20210216'), (3, 2.1, '20210501');
RateGroup | Rate | DueDate |
---|---|---|
1 | 1.20 | 2021-01-01 |
1 | 1.20 | 2021-02-15 |
1 | 1.50 | 2021-02-16 |
1 | 1.20 | 2021-05-01 |
2 | 3.70 | 2021-01-01 |
2 | 3.70 | 2021-02-15 |
2 | 3.70 | 2021-02-16 |
2 | 3.70 | 2021-05-01 |
3 | 2.90 | 2021-01-01 |
3 | 2.50 | 2021-02-15 |
3 | 2.50 | 2021-02-16 |
3 | 2.10 | 2021-05-01 |
Now I want a query which folds multiple consecutive rows of a rate group with the same rate to a single row containing the date range (start and end date) where the rate is valid.
This is the desired result:
| RateGroup | Rate | StartDate | EndDate |
| ------------ | ------------ | ------------ | ------------ |
| 1 | 1.20 | 2021-01-01 | 2021-02-15 |
| 1 | 1.50 | 2021-02-16 | 2021-04-30 |
| 1 | 1.20 | 2021-05-01 | NULL |
| 2 | 3.70 | 2021-01-01 | NULL |
| 3 | 2.90 | 2021-01-01 | 2021-02-14 |
| 3 | 2.50 | 2021-02-15 | 2021-04-30 |
| 3 | 2.10 | 2021-05-01 | NULL |
How can I achieve this?
1条答案
按热度按时间qxsslcnc1#
This can be done with Common Table Expressions uilizing the
OVER
Clause as in the following query:How does it work?
Here we are using
LAG()
to compare the current rate to it's predecessor.PARTITION BY RateGroup
makes sure that we don't mix rate groups andORDER BY DueDate
determines the order at which we look at the rows.If the current rate is equal to it's predecessor we mark the row with a 0 otherwise with a 1. The result of this CTE would look like so for the first rate group:
| RateGroup | Rate | DueDate | IsBegin |
| ------------ | ------------ | ------------ | ------------ |
| 1 | 1.20 | 2021-01-01 | 1 |
| 1 | 1.20 | 2021-02-15 | 0 |
| 1 | 1.50 | 2021-02-16 | 1 |
| 1 | 1.20 | 2021-05-01 | 1 |
0 and 1 are no arbitrary values; they are needed for the next step.
In this CTE we build a running total over the
IsBegin
column withSUM()
. Due to the value being 1 at the start of a new range and 0 within a range our running total increments always at the begin of a new range. This leads to an uniqe number for each range.With
LEAD()
we add the day before the next due date in our range to the output. The result of this step for the first rate group is then:| RateGroup | Rate | StartDate | EndDate | RangeID |
| ------------ | ------------ | ------------ | ------------ | ------------ |
| 1 | 1.20 | 2021-01-01 | 2021-02-14 | 1 |
| 1 | 1.20 | 2021-02-15 | 2021-02-15 | 1 |
| 1 | 1.50 | 2021-02-16 | 2021-04-30 | 2 |
| 1 | 1.20 | 2021-05-01 | NULL | 3 |
Now that we have an unique identifier (
RangeID
) for the date ranges we can do a simple aggregation withGROUP BY
to get our desired result. Since the ranges can be open ended (no next due date) we useto make sure that
NULL
is always treated as the latest possible date.