带条件的t-sql分组数据

jdg4fx2g  于 2021-07-29  发布在  Java
关注(0)|答案(3)|浏览(373)

我有一些数据:
期间| id |账户|浮动|价值
24217 | 303003 | 0
24218 | 303003 | 0
24219 | 303003 | 1
24220 | 303003 | 1
24221 | 303003 | 0
24222 | 303003 | 0
我需要这样对这些数据进行分组:

begin_period | end_period| id_account | float_value     
24217       | 24218     | 303003    |   0   
24219       | 24220     | 303003    |   1   
24221       | 24222     | 303003    |   0

我尝试了按分区和while循环的行数,但没有成功。

ycggw6v2

ycggw6v21#

我把这理解为一个gaps和isald问题,您需要将“相邻”行组合在一起,即具有相同属性的行 float_value ,跨具有相同 id_account 以及 parameter .
在这里,我认为最简单的方法是使用行号之间的差异来计算每条记录属于哪个组:

select
    min(period) begin_period,
    max(period) end_period,
    id_account,
    parameter,
    float_value
from (
    select 
        t.*,
        row_number() over(partition by id_account, parameter order by period) rn1,
        row_number() over(partition by id_account, parameter, float_value order by period) rn2
    from mytable t
) t
group by id_account, parameter, float_value, rn1 -rn2
order by id_account, parameter, begin_period

db小提琴演示:

begin_period | end_period | id_account | parameter      | float_value
-----------: | ---------: | ---------: | :------------- | ----------:
       24217 |      24218 |     303003 | ACCOUNT_STATUS |           0
       24219 |      24220 |     303003 | ACCOUNT_STATUS |           1
       24221 |      24222 |     303003 | ACCOUNT_STATUS |           0
i34xakig

i34xakig2#

这是一个缺口和孤岛问题,一种方法是使用行号差异法:

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY ID_Account ORDER BY Period) rn1,
        ROW_NUMBER() OVER (PARTITION BY ID_Account, Float_Value ORDER BY Period) rn2
    FROM yourTable
)

SELECT
    MIN(Period) AS Begin_Period,
    MAX(Period) AS End_Period,
    ID_Account,
    MAX(Parameter) AS Parameter,   -- assuming Parameter just always has the same value
    Float_Value
FROM cte
GROUP BY
    ID_Account,
    Float_Value,
    rn1 - rn2
ORDER BY
    MIN(Period);

演示

5cnsuln7

5cnsuln73#

你可以用 row_number 如下所示。这是演示。

with cte as
(
  select
     *,
     rn - row_number() over(order by Float_Value, rn) as nrnk
  from
  (
    select
      *,
      row_number() over (order by Period) as rn
    from Table1
  ) subq
)

select
  min(Period) as Begin_Period,
  max(Period) as End_Period,
  ID_Account,
  Parameter,
  Float_Value
from cte
group by
  ID_Account,
  Parameter,
  Float_Value,
  nrnk

输出:

Begin_Period   End_Period     ID_Account     Parameter   Float_Value
---------------------------------------------------------------------
 24217            24218          303003    ACCOUNT_STATUS   0
 24221            24222          303003    ACCOUNT_STATUS   0
 24219            24220          303003    ACCOUNT_STATUS   1

相关问题