oracle 识别数据随时间的变化

使用Oracle数据库11.2.
问题：比较两个来源的数据，仅显示差异。
我正在寻找一种非常巧妙的解决方案来自动化数百个表的比较，每个表都有数百个列，这些表将在Crystal Reports中开发的报表中的查询上下文中工作。是的，我已经考虑到我在某个地方转错了弯（但不是Crystal Reports部分。我坚持认为这一点。），在这一点之后的描述都毫无意义。
先不考虑查询或报表性能。我打算强制筛选器限制在单个请求中可以处理的数据量。我在这里要问的是如何使其通用。换句话说，我不想在查询代码中列出任何特定的列，除了区分已知的分组或查找列--updated_by、updated_date等等。我希望查询能自动收集这些名字。
为了简单起见，假设我想根据筛选条件比较表中分组内相邻行的数据，下面是输入数据的简化示例：

with source_data as (
  select 'a' grp
  , 'b' b
  , 'c' c
  , date '2022-12-01' record_date
  , 'joe' updated_by
  from dual
  union all
  select 'a'
  , 'b'
  , 'd'
  , date '2022-12-02'
  , 'sally' updated_by
  from dual
  union all
  select 'a'
  , 'a'
  , 'd'
  , date '2022-12-04'
  , 'joe' updated_by
  from dual
  union all
  select 'z' a
  , 'b' b
  , 'c' c
  , date '2022-12-01'
  , 'joe' updated_by
  from dual
  union all
  select 'z'
  , 'e'
  , 'c'
  , date '2022-12-08'
  , 'joe' updated_by
  from dual
  union all
  select 'z'
  , 'f'
  , 'c'
  , date '2022-12-09'
  , 'sally' updated_by
  from dual
)

| 玻璃钢|乙|C级|记录日期|更新人|
| - ------|- ------|- ------|- ------|- ------|
| 项目a|b.人口基金|（c）秘书长的报告|2022年12月1日上午00时00分|乔|
| 项目a|b.人口基金|日|2022年12月2日00时00分|萨利|
| 项目a|项目a|日|2022年12月4日00时00分|乔|
| z|b.人口基金|（c）秘书长的报告|2022年12月1日上午00时00分|乔|
| z|电子|（c）秘书长的报告|2022年12月8日00时00分|乔|
| z|f级|（c）秘书长的报告|2022年12月9日00时00分|萨利|
我们需要查看某些类别的人员做了哪些更改。在本例中，假设Sally是该组的成员，Joe不是。因此，我只关心第2行和第6行的更改。但我需要将每一行与前一行进行比较，因此...

,
changed as (
  select sd.*
  from source_data sd
  where updated_by = 'sally'
),
changes as (
  select 'current' as status
  , c.*
  from changed c
  union all
  select 'previous'
  , sd.grp
  , sd.b
  , sd.c
  , c.record_date
  , c.updated_by
  from source_data sd
    inner join changed c on c.grp = sd.grp
                        and sd.record_date = (select max(record_date) from source_data where grp = c.grp and record_date < c.record_date)
)

这个小例子的输出看起来很简单，但是当我需要比较数百行和数百列时，识别变化就不那么容易了。
我有许多表要比较，它们都有相同的问题。许多表有数百列。通常，差异只存在于一列或几列中。
这将在一个报表中完成。我没有创建函数或存储过程的权限，所以我怀疑我可以以任何方式使用动态SQL。这可能有类似于开发视图的约束。
我没有使用PL/SQL。（有点厌倦了几乎每一个与我在SO上搜索有关的Oracle问题，这些问题与PL/SQL有一定的关系，但没有办法过滤掉它们。）
我在想，为了比较数据，我首先要将其解透视以获得行上的列/值对...
（根据对这一问题的回答：ORACLE unpivot columns to rows）

, unpivot as (
        Select *
        From (
          Select grp
          , status
          , record_date
          , updated_by
          , Case When C.lvl = 1 Then 'B'
              When C.lvl = 2 Then 'C'
            End col
          , Case When C.lvl = 1 Then coalesce(B, '<null>')
              When C.lvl = 2 Then coalesce(C, '<null>')
            End val
          From changes
          cross join (
            select level lvl
            from dual
            connect by level <= 2
          ) c
        )
        where val is not null
        order by 1, 3, 2 desc
)

(Yes，对于重要的数据，我需要将进入val的数据转换为更通用的数据，比如字符串。）
但是，我如何以编程方式确定列数、列顺序，并为col的值和val中CASE语句的引用生成列名？
我想我可以使用类似这样的东西作为解决方案的一部分：

SELECT COLUMN_NAME
, COLUMN_ID
 
FROM ALL_tab_columns
 
WHERE OWNER = 'MY_OWNER_NAME'
  AND TABLE_NAME = 'SOURCE_TABLE'
 
ORDER BY COLUMN_ID

但我不确定如何在不涉及动态SQL的情况下以有意义的方式将其与解决方案相吻合，我非常确定我做不到这一点。而且它可能需要基于序号位置引用列，这在SQL中似乎是不可能的。当然，如果这样做可行，我可以使用类似的查询来弄清楚如何处理val列的数据类型。
然后，我需要透视它，以显示不同列中的之前和之后的值，然后我可以过滤它，只显示更改的内容。

, 
pivot as (
  select grp
  , record_date
  , col
  , updated_by
  , max("'previous'") val_prev
  , max("'current'") val_curr
  from unpivot
  pivot (
     max(val)
     for status
     in (
          'previous',
          'current'
     )
  )
  group by grp
  , record_date
  , col
  , updated_by
)
select grp
, record_date
, col
, updated_by
, val_prev
, val_curr
from pivot
where val_curr <> val_prev
order by grp
, record_date

| 玻璃钢|记录日期|颜色|更新人|确认_前一个|确认当前|
| - ------|- ------|- ------|- ------|- ------|- ------|
| 项目a|2022年12月2日00时00分|C级|萨利|（c）秘书长的报告|日|
| z|2022年12月9日00时00分|乙|萨利|电子|f级|

您不能只使用纯SQL来实现这一点，但是您可以使用SQL宏在单个语句中实现所需的功能--前提是您使用的是最新版本的Oracle数据库。
下面是一个动态取消透视宏的示例，该宏将所有 * 未列出 * 的列转换为行：

create or replace function unpivot_macro (
  tab       dbms_tf.table_t,
  keep_cols dbms_tf.columns_t
) return clob sql_macro is
  sql_stmt     clob;
  unpivot_list clob;
  select_list  clob;
begin

  for col in tab.column.first .. tab.column.last loop
    if tab.column ( col ).description.name 
         not member of keep_cols then
      unpivot_list := unpivot_list || 
        ',' || tab.column ( col ).description.name;
    end if;
    
    select_list := select_list || 
      ', to_char (' || tab.column ( col ).description.name || ') as ' || 
      tab.column ( col ).description.name;
  end loop;
  
  sql_stmt := 
    'select * from ( 
       select ' || trim ( both ',' from select_list ) || ' from tab 
     )
     unpivot (
       val for col 
       in ( ' || trim ( both ',' from unpivot_list ) || ' )
     )';

  return sql_stmt;
  
end unpivot_macro;
/

select * from unpivot_macro ( 
  source_data, columns ( grp, updated_by, record_date )
);

GRP    RECORD_DATE          UPDATED_BY    COL    VAL    
a      01-DEC-2022 00:00    joe           B      b      
a      01-DEC-2022 00:00    joe           C      c      
a      02-DEC-2022 00:00    sally         B      z      
a      02-DEC-2022 00:00    sally         C      d      
a      04-DEC-2022 00:00    joe           B      a      
a      04-DEC-2022 00:00    joe           C      d
...

如果避免使用PL/SQL的原因是您没有创建函数的权限，则可以将宏放在with子句中。
下面是一个运行在21.3上的例子：

with function unpivot_macro (
  tab       dbms_tf.table_t,
  keep_cols dbms_tf.columns_t
) return clob sql_macro is
  sql_stmt     clob;
  unpivot_list clob;
  select_list  clob;
begin

  for col in tab.column.first .. tab.column.last loop
    if tab.column ( col ).description.name 
         not member of keep_cols then
      unpivot_list := unpivot_list || 
        ',' || tab.column ( col ).description.name;
    end if;
    
    select_list := select_list || 
      ', to_char (' || tab.column ( col ).description.name || ') as ' || 
      tab.column ( col ).description.name;
  end loop;
  
  sql_stmt := 
    'select * from ( 
       select ' || trim ( both ',' from select_list ) || ' from tab 
     )
     unpivot (
       val for col 
       in ( ' || trim ( both ',' from unpivot_list ) || ' )
     )
     where status is not null';

  return sql_stmt;
  
end unpivot_macro;
  source_data as (
  select 'a' grp, 'b' b, 'c' c, date '2022-12-01' record_date, 'joe' updated_by
  from dual union all
  select 'a', 'z', 'd', date '2022-12-02', 'sally' updated_by  
  from dual union all
  select 'a', 'a', 'd', date '2022-12-04', 'joe' updated_by  
  from dual union all
  select 'z' a, 'b' b, 'c' c, date '2022-12-01', 'joe' updated_by
  from dual union all
  select 'z', 'e', 'c', date '2022-12-08', 'joe' updated_by
  from dual union all
  select 'z', 'f', 'c', date '2022-12-09', 'sally' updated_by
  from dual
), changes as (
  select s.grp, b, c,
         'sally' updated_by, 
         case 
           when updated_by = 'sally' then record_date
           else lead ( record_date ) over ( partition by grp order by record_date ) 
         end record_date,
         case 
           when updated_by = 'sally' then 'current' 
           when lead ( updated_by ) over ( partition by grp order by record_date ) = 'sally'
           then 'previous' 
         end status
  from   source_data s
)
  select * from unpivot_macro (
    changes, columns ( grp, record_date, updated_by, status )
  )
  pivot (
    max ( val ) for status 
    in ( 'previous' prev_val, 'current' curr_val )
  )
  where  prev_val <> curr_val;

G UPDAT RECORD_DATE        C P C
- ----- ------------------ - - -
a sally 02-DEC-22          B b z
a sally 02-DEC-22          C c d
z sally 09-DEC-22          B e f

oracle 识别数据随时间的变化

1条答案

相关问题

热门标签

最新问答