Regex lookahead captures unwanted characters

cig3rfwq  posted on 2023-05-30  in Other

I'm trying to capture the alert name from firewall logs. Each log has the following format:

datetime alertname severity_level username endpoint_name domain

The regex I'm currently using works for all logs except the third one. Is there a way to fix this?

import re

regex = []

text = """2023-05-27 / 23:06:31 Computer account added/changed/deleted. medium ANONYMOUS LOGON PC-CR5$ SRVDC2 ACME 1
2023-05-27 / 23:28:08 Computer account added/changed/deleted. medium ANONYMOUS LOGON SRVXAP02$ SRVDC2 ACME 1
2023-05-28 / 02:24:29 User account locked out multiple login errors high SRVDC2$ john.smith.admin SRVDC2 \\\\NECBROWSER 1
2023-05-28 / 05:01:48 Computer account added/changed/deleted. medium ANONYMOUS LOGON SRVNPS01$ SRVDC1 ACME 1
2023-05-28 / 06:38:57 Computer account added/changed/deleted. medium ANONYMOUS LOGON VD-OPERATOR1$ SRVDC1 ACME 1"""

pattern = r'(?:(?<=\d{2}:\d{2}:\d{2}))(.*)(?=\.)|(?=medium )|(?=high )|(?=low )|(?=critical )'
regex.append(re.findall(pattern,text,re.MULTILINE))
print(regex)

Current output

[[' Computer account added/changed/deleted', '', ' Computer account added/changed/deleted', '', ' User account locked out multiple login errors high SRVDC2$ john.smith', ' Computer account added/changed/deleted', '', ' Computer account added/changed/deleted', '']]

Expected output

[[' Computer account added/changed/deleted', '', ' Computer account added/changed/deleted', '', ' User account locked out multiple login errors', ' Computer account added/changed/deleted', '', ' Computer account added/changed/deleted', '']]

gwo2fgha  #1

You can use

\d{2}:\d{2}:\d{2}\s+
(.*?)
\s(?:medium|high|low|critical)

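See a demo on regex101.com.
In contrast to your original attempt, this one uses a non-capturing group (lookbehinds are "expensive"!) and a lazy quantifier. The alert name is then in the first capturing group.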
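To see why the original pattern misbehaves on the third log, it may help to look at its two problems in isolation. The greedy `.*` backtracks only as far as the last `.` on the line, and the top-level `|` splits the pattern into five separate alternatives, so the severity lookaheads never constrain `(.*)` and only contribute empty matches. The following is a minimal sketch; the variable names and the toy string `"abc"` are illustrative and not from the question.

```python
import re

# Third log line from the question (domain column shortened for readability).
line = ("2023-05-28 / 02:24:29 User account locked out multiple login errors "
        "high SRVDC2$ john.smith.admin SRVDC2 ACME 1")

# Greedy: .* runs to the end of the line, then backtracks to the LAST "."
greedy = re.search(r"\d{2}:\d{2}:\d{2}\s+(.*)(?=\.)", line).group(1)

# Lazy: .*? grows one character at a time and stops at the first position
# that is followed by whitespace plus a severity keyword.
lazy = re.search(r"\d{2}:\d{2}:\d{2}\s+(.*?)\s(?:medium|high|low|critical)",
                 line).group(1)

# A bare lookahead used as its own alternative matches the empty string,
# which findall reports as '' because group 1 did not participate.
empties = re.findall(r"(a)|(?=c)", "abc")

print(greedy)
print(lazy)
print(empties)
```

The first result reproduces the unwanted `... high SRVDC2$ john.smith` capture from the question, and the last one shows where the `''` entries in the current output come from.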
In Python, this could be

import re

text = """2023-05-27 / 23:06:31 Computer account added/changed/deleted. medium ANONYMOUS LOGON PC-CR5$ SRVDC2 ACME 1
2023-05-27 / 23:28:08 Computer account added/changed/deleted. medium ANONYMOUS LOGON SRVXAP02$ SRVDC2 ACME 1
2023-05-28 / 02:24:29 User account locked out multiple login errors high SRVDC2$ john.smith.admin SRVDC2 \\\\NECBROWSER 1
2023-05-28 / 05:01:48 Computer account added/changed/deleted. medium ANONYMOUS LOGON SRVNPS01$ SRVDC1 ACME 1
2023-05-28 / 06:38:57 Computer account added/changed/deleted. medium ANONYMOUS LOGON VD-OPERATOR1$ SRVDC1 ACME 1"""

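pattern = re.compile(r'''
    \d{2}:\d{2}:\d{2}\s+              # time stamp followed by whitespace
    (.*?)                             # group 1: the alert name (lazy)
    \s(?:medium|high|low|critical)    # stop right before the severity level
''', re.VERBOSE)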

messages = [match.group(1) for match in pattern.finditer(text)]
print(messages)

which yields

['Computer account added/changed/deleted.', 'Computer account added/changed/deleted.', 'User account locked out multiple login errors', 'Computer account added/changed/deleted.', 'Computer account added/changed/deleted.']

See a demo on ideone.com.
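If the severity level is needed as well, the same lazy-quantifier idea can be extended with named groups. This is only a sketch and not part of the answer above: `LOG_LINE` and `parse_line` are hypothetical names, and it assumes that the severity is always one of the four keywords from the question and that no alert name itself contains one of them as a separate word.

```python
import re

# Hypothetical helper pattern: one named group per leading field.
LOG_LINE = re.compile(r"""
    ^(?P<date>\d{4}-\d{2}-\d{2})\s/\s              # date, then the slash separator
    (?P<time>\d{2}:\d{2}:\d{2})\s+                 # time stamp
    (?P<alert>.*?)                                 # alert name (lazy)
    \s(?P<severity>medium|high|low|critical)\s     # severity as a whole word
""", re.VERBOSE)


def parse_line(line):
    """Return the leading fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


sample = ("2023-05-28 / 02:24:29 User account locked out multiple login errors "
          "high SRVDC2$ john.smith.admin SRVDC2 ACME 1")
record = parse_line(sample)
print(record)
```

Lines that do not follow the expected layout give `None` instead of a partial match, which makes malformed log entries easy to filter out.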
