Data Lineage in SQL Server

qltillow  于 2023-05-21  发布在  SQL Server
关注(0)|答案(2)|浏览(152)

Objective :

Let's think of a large scale enterprise where we have a heterogeneous data stores such as SQL servers, No-SQL stores, Big data stores like ADL, ADF..etc spreads across different business groups.

Our objective is to build a lineage service at an enterprise level, which will provide insights on the data lineage between different data store instances within an enterprise. This will help us in gaining insights on the data proliferation within the enterprise.

For this, as a phase 1 we want to take SQL server and want to build the lineage within and across SQL servers and It's databases.

Is there a way we can build / extract (if available) the data lineage (table & Column level) from a SQL server by leveraging SQL profiling or transaction logs within or across instances.

I looked at Determining Impact and Data Lineage and looks like the SQL server should have the Dependency Services enabled.

As a enterprise level service, I may have access to the SQL Servers and may not have control over the sql server instances on enabling any services.

Could anyone share any insights or experiences with respect to building or extracting Data Lineage within or across SQL Servers.

Thanks in Advance !

Mydeen

68bkxrlz

68bkxrlz1#

You may export database structure into a SQL file which includes CREATE TABLE/CREATE VIEW/ CREATE PROCEDURE and other SQL statements. Then using a SQL parser to analyze those scripts to get metadata from those SQL scripts and build data lineage of the table/columns.

Take this SQL for example:

create view deptsal (dept_no, dept_name, salary)
as
SELECT d.deptno, 
       d.dname, 
       SUM(e.sal + Nvl(e.comm, 0)) AS sal 
FROM   dept d 
       left join (SELECT * 
                  FROM   emp 
                  WHERE  hiredate > DATE '1980-01-01') e 
              ON e.deptno = d.deptno 
GROUP  BY d.deptno, 
          d.dname;

You will get the data flow like: deptsal.salary depends on emp.sal, emp.com

yv5phkfx

yv5phkfx2#

Hello Peer Mohamed Mydeen, Have you ever been considering to use a third-party tool to do that? As far as I know, there are plenty of professional data lineage tools if you just Google "sql data lineage tool". In case of you choose to use these existing data lineage tool, what you need to do is just provide some access to the lineage tool and let it finish the rest of your work. It would be easy and time-saving.

You can check this page for some paid-tool: https://www.dpriver.com/blog/2022/05/11/best-data-lineage-tools/

You can also check this blog for open-source tool: https://www.dpriver.com/blog/2022/07/29/best-open-source-data-lineage-tools/

相关问题