如何合并连续日期范围Oracle

gajydyqb  于 2023-08-03  发布在  Oracle
关注(0)|答案(2)|浏览(137)

我正面临一个问题。我不知道如何基于两个维度将连续的日期范围行合并在一起。一个对我来说还可以,但是第二个就麻烦了
让我们用四种可能的场景来想象这个结构中的table

emp_id  |  level  |  date_from   |   date_to    
--------------------------------------------------
    1     |   A     |  7/31/2015   |  3/31/2016
    1     |   A     |  4/1/2016    |  1/1/3000

    2     |   A     |  7/31/2015   |  1/1/3000

    3     |   A     |  5/31/2015   |  12/31/2015
    3     |   B     |  1/1/2016    |  3/31/2016
    3     |   A     |  4/1/2016    |  6/30/2016
    3     |   B     |  7/1/2016    |  1/1/3000

    4     |   A     |  5/31/2015   |  12/31/2015
    4     |   A     |  1/1/2016    |  6/30/2016
    4     |   B     |  7/1/2016    |  1/1/3000

字符串
我只想合并那些具有连续日期范围且act_level = prev_level的行
我试过做这样的事

SELECT emp_id
, level
, date_from
, date_to
--
, CASE
    WHEN lag(level) over (partition by emp_id order by date_from) = level THEN 
         CASE
             WHEN lag(date_to) over (partition by emp_id, level order by date_from) = date_from-1 
               THEN lag(date_from) over (partition by code_employee, level_name order by date_from)
             ELSE NULL
         END
    ELSE 
         CASE
             WHEN lag(level) over (partition by emp_id order by date_from) = level
                     OR
                  lead(level) over (partition by emp_id order by date_from) = level
                THEN NULL
             ELSE date_from
         END
  END date_from_new
, date_to as date_to_new
--
FROM src_table
--
WHERE 1=1


这几乎给了我想要的结果:

emp_id  |  level  |  date_from   |   date_to   |  d_from_new | d_from_to 
--------------------------------------------------------------------------
    1     |   A     |  7/31/2015   |  3/31/2016  |           | 3/31/2016
    1     |   A     |  4/1/2016    |  1/1/3000   | 7/31/2015 | 1/1/3000

    2     |   A     |  7/31/2015   |  1/1/3000   | 7/31/2015 | 1/1/3000

    3     |   A     |  5/31/2015   |  12/31/2015 | 5/31/2015 | 12/31/2015
    3     |   B     |  1/1/2016    |  3/31/2016  |  1/1/2016 | 3/31/2016
    3     |   A     |  4/1/2016    |  6/30/2016  |  4/1/2016 | 6/30/2016  
    3     |   B     |  7/1/2016    |  1/1/3000   |  7/1/2016 | 1/1/3000 

    4     |   A     |  5/31/2015   |  12/31/2015 |           | 12/31/2015
    4     |   A     |  1/1/2016    |  6/30/2016  | 5/31/2015 | 6/30/2016
    4     |   B     |  7/1/2016    |  1/1/3000   | 7/1/2016  | 1/1/3000


我将只过滤d_from_new(date_from_new)的结果,而不是空值。但我不知道会发生什么,如果有例如3倍相同的水平与连续的日期范围,或8倍。
老实说-我不喜欢查询:)
你有什么“香水友好”和“眼睛友好”的解决方案吗?

efzxgjgh

efzxgjgh1#

请尝试以下查询:

select emp_id, lvl, min(date_from) df, max(date_to) dt
  from (
    select s2.*, rn - sum(marker) over (order by rn) as grp
      from (
        select s1.*,
               row_number() over (order by emp_id, date_from) rn,
               case when lag(lvl) over (partition by emp_id order by date_from) 
                         = lvl
                     and lag(date_to) over (partition by emp_id order by date_from) + 1 
                         = date_from
                    then 1
                    else 0
               end marker
          from src_table s1 ) s2 )
  group by emp_id, lvl, grp
  order by emp_id, min(date_from)

字符串
在第一个子查询S1中,我添加了标记,其中如果上一个级别是对应的并且日期是连续的,则分配1。在第二个子查询中,此标记用于构建GRP列,该列对所有匹配行具有相同的值。此列用于最终分组查询,以查找最小值date_from和最大值date_to。请分别运行内部查询,以查看每一步中发生的情况。测试是否有两个以上的连续行。
测试数据和输出:

create table src_table (emp_id number(6), lvl varchar2(2), date_from date, date_to date);
insert into src_table values (1, 'A', date '2015-07-31', date '2016-03-31');
insert into src_table values (1, 'A', date '2016-04-01', date '3000-01-01');
insert into src_table values (2, 'A', date '2015-07-31', date '3000-01-01');
insert into src_table values (3, 'A', date '2015-05-31', date '2015-12-31');
insert into src_table values (3, 'B', date '2016-01-01', date '2016-03-31');
insert into src_table values (3, 'A', date '2016-04-01', date '2016-06-30');
insert into src_table values (3, 'B', date '2016-07-01', date '3000-01-01');
insert into src_table values (4, 'A', date '2015-05-31', date '2015-12-31');
insert into src_table values (4, 'A', date '2016-01-01', date '2016-06-30');
insert into src_table values (4, 'B', date '2016-07-01', date '3000-01-01');

 EMP_ID LVL DF          DT
------- --- ----------- -----------
      1 A   2015-07-31  3000-01-01
      2 A   2015-07-31  3000-01-01
      3 A   2015-05-31  2015-12-31
      3 B   2016-01-01  2016-03-31
      3 A   2016-04-01  2016-06-30
      3 B   2016-07-01  3000-01-01
      4 A   2015-05-31  2016-06-30
      4 B   2016-07-01  3000-01-01

8 rows selected

eoigrqb6

eoigrqb62#

下面的SQL语句回答了似乎是预期的问题:识别连续间隔(date_from, date_to)和每个emp_id的水平lvl,并将它们合并到一行(emp_id、lvl、date_from、date_to)。
技巧是将不同的组ID分配给连续的(date_from, date_to)lvl值的组。整个故事是由2个内联视图完成的,tab0tab1
1.当我们跨越组边界时,tab0中的diff_levelsdiff_dates是非零的

  1. tab1中的ranked_levelsranked_dates生成diff字段的累积值(从而使lvl和(date_from,date_to)组不同)
    1.最后一个表仅分组(ranked_levels,ranked_groups)
with 
    -- tab0:     
    tab0 as (
        select
            emp_id,
            lvl,
            date_from,
            date_to,
            case 
                when 
                    lvl != lag(lvl, 1, lvl) 
                    over(partition by emp_id order by date_from)
                then 1
                else 0
            end diff_levels,
            date_from - lag(date_to, 1, date_from) 
            over(partition by emp_id order by date_from) - 1 diff_dates
        from src_table),
    -- tab1:
    tab1 as (
        select
            emp_id,
            lvl,
            date_from,
            date_to,
            sum(diff_levels) over(
                partition by emp_id
                order by date_from
                range between unbounded preceding and current row) ranked_levels,
            sum(diff_dates) over(
                partition by emp_id
                order by date_from 
                range between unbounded preceding and current row) ranked_dates
        from tab0)
select
    emp_id,
    lvl,
    min(date_from) date_from,
    max(date_to) date_to
from tab1
group by 
    emp_id,
    lvl,
    ranked_levels,
    ranked_dates
order by emp_id, lvl;

字符串
SQL在ORACLE 11g上进行了测试,但由于它是ANSI SQL,因此可以在任何地方运行。
我使用了前面回答中给出的小表格:

create table src_table (emp_id number(6), lvl varchar2(2), date_from date, date_to date);
insert into src_table values (1, 'A', date '2015-07-31', date '2016-03-31');
insert into src_table values (1, 'A', date '2016-04-01', date '3000-01-01');
insert into src_table values (2, 'A', date '2015-07-31', date '3000-01-01');
insert into src_table values (3, 'A', date '2015-05-31', date '2015-12-31');
insert into src_table values (3, 'B', date '2016-01-01', date '2016-03-31');
insert into src_table values (3, 'A', date '2016-04-01', date '2016-06-30');
insert into src_table values (3, 'B', date '2016-07-01', date '3000-01-01');
insert into src_table values (4, 'A', date '2015-05-31', date '2015-12-31');
insert into src_table values (4, 'A', date '2016-01-01', date '2016-06-30');
insert into src_table values (4, 'B', date '2016-07-01', date '3000-01-01');

相关问题