regex 查找源和目标的正则表达式

koaltpgm  于 2023-06-25  发布在  其他
关注(0)|答案(2)|浏览(111)

需要帮助编写泛型正则表达式,以便从以下场景中查找源表:
样品1:

INSERT INTO a.test
(
test1,
test2
)
select 
test1,
test2
from 
xyz.test

输出1:

xyz.test

样品2:

*******************************************************                                                                                
    MACRO                                                                 
        abc.test
    PURPOSE                                   
         TO DELETE THE DATA FROM abc.test
    INPUT PARAMS                              
                   NONE                                                                      
    CALLED BY                                                                            
        THIS MACRO IS BEING CALLED FROM sql
*******************************************************/ 
    DELETE FROM sample.test

没有此作为示例的输出。测试是目标表而不是源。
我已经尝试了下面的正则表达式\b(?!DELETE\s+FROM\s+)(?:FROM|JOIN)\s+(\w+./*[^\s]+)它的工作为第一种情况下,但第二种情况下,我期待没有输出,但我得到的样本。测试

jfgube3f

jfgube3f1#

我真的不认为正则表达式是安全的方法。要处理的事情真的太多了,比如评论,工会等等
我用SQLGlot库玩了一把,代码如下:

##python3 -m pip install sqlglot

from sqlglot import parse_one, exp

queries = [
"""
INSERT INTO a.test
(
test1,
test2
)
select 
test1,
test2
from 
xyz.test
""",
"""
/*******************************************************                                                                                
    MACRO                                                                 
        abc.test
    PURPOSE                                   
         TO DELETE THE DATA FROM abc.test
    INPUT PARAMS                              
                   NONE                                                                      
    CALLED BY                                                                            
        THIS MACRO IS BEING CALLED FROM sql
*******************************************************/ 
    DELETE FROM sample.test
""",
"""
SELECT
  t1.firstname,
  t1.lastname,
  t2.email,
  t3.order
FROM abc.table1 t1
LEFT JOIN abc.table2 t2 ON t2.id = t1.id
LEFT JOIN abc.table3 t3 ON t3.email = t2.email
""",
"""
SELECT id, name FROM table1
UNION
SELECT id, name FROM table2
"""
]

for query in queries:
    print(query)
    sourceTables = []
    for select in parse_one(query).find_all(exp.Select):
        for table in select.find_all(exp.Table):
            sourceTables.append(str(table))
    print("Source tables:")
    print(sourceTables)
    print("\n")

它输出以下内容:

INSERT INTO a.test
(
test1,
test2
)
select 
test1,
test2
from 
xyz.test

Source tables:
['xyz.test']


/*******************************************************                                                        

    MACRO
        abc.test
    PURPOSE
         TO DELETE THE DATA FROM abc.test
    INPUT PARAMS
                   NONE
    CALLED BY
        THIS MACRO IS BEING CALLED FROM sql
*******************************************************/
    DELETE FROM sample.test

Source tables:
[]


SELECT
  t1.firstname,
  t1.lastname,
  t2.email,
  t3.order
FROM abc.table1 t1
LEFT JOIN abc.table2 t2 ON t2.id = t1.id
LEFT JOIN abc.table3 t3 ON t3.email = t2.email

Source tables:
['abc.table1 AS t1', 'abc.table2 AS t2', 'abc.table3 AS t3']


SELECT id, name FROM table1
UNION
SELECT id, name FROM table2

Source tables:
['table1', 'table2']

似乎如你所愿...

pgpifvop

pgpifvop2#

下面将从提供的两个示例中获取值。
需要注意的是,“delete”“from”“join” 是该行上的第一个非空白字符序列。

(?is)^\s*(?:delete\s+)?(?:from|join)\s*.*?([^\s]+)
  • (?is)* 将切换到 * 忽略大小写 * 和 * 单行 * 模式。

输出量

xyz.test
sample.test

相关问题