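I am trying to capture the alert names from a firewall's logs. Each log has the following format:
datetime alertname severity_level username endpoint_name domain
The regex I am currently using works for every log except the third one. Is there a way to fix this?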
import re

regex = []
text = """2023-05-27 / 23:06:31 Computer account added/changed/deleted. medium ANONYMOUS LOGON PC-CR5$ SRVDC2 ACME 1
2023-05-27 / 23:28:08 Computer account added/changed/deleted. medium ANONYMOUS LOGON SRVXAP02$ SRVDC2 ACME 1
2023-05-28 / 02:24:29 User account locked out multiple login errors high SRVDC2$ john.smith.admin SRVDC2 \\\\NECBROWSER 1
2023-05-28 / 05:01:48 Computer account added/changed/deleted. medium ANONYMOUS LOGON SRVNPS01$ SRVDC1 ACME 1
2023-05-28 / 06:38:57 Computer account added/changed/deleted. medium ANONYMOUS LOGON VD-OPERATOR1$ SRVDC1 ACME 1"""
pattern = r'(?:(?<=\d{2}:\d{2}:\d{2}))(.*)(?=\.)|(?=medium )|(?=high )|(?=low )|(?=critical )'
regex.append(re.findall(pattern,text,re.MULTILINE))
print(regex)
Current output
[[' Computer account added/changed/deleted', '', ' Computer account added/changed/deleted', '', ' User account locked out multiple login errors high SRVDC2$ john.smith', ' Computer account added/changed/deleted', '', ' Computer account added/changed/deleted', '']]
Expected output
[[' Computer account added/changed/deleted', '', ' Computer account added/changed/deleted', '', ' User account locked out multiple login errors', ' Computer account added/changed/deleted', '', ' Computer account added/changed/deleted', '']]
1 Answer
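You can use
See a demo on regex101.com.
In contrast to your original attempt, this one uses a non-capturing group (lookbehinds are "expensive"!) and a lazy quantifier construct. Just use the first capturing group.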
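The answer's own regex is not reproduced here, so the following is only a minimal sketch of the approach it describes, under stated assumptions: the date/time prefix is consumed directly instead of being checked with a lookbehind, the alert name is captured lazily in group 1, and a non-capturing group lists the severity levels. The pattern and the names `pattern` and `alert_names` are illustrative, not the answer's original.

```python
import re

# Sample logs copied from the question.
text = """2023-05-27 / 23:06:31 Computer account added/changed/deleted. medium ANONYMOUS LOGON PC-CR5$ SRVDC2 ACME 1
2023-05-27 / 23:28:08 Computer account added/changed/deleted. medium ANONYMOUS LOGON SRVXAP02$ SRVDC2 ACME 1
2023-05-28 / 02:24:29 User account locked out multiple login errors high SRVDC2$ john.smith.admin SRVDC2 \\\\NECBROWSER 1
2023-05-28 / 05:01:48 Computer account added/changed/deleted. medium ANONYMOUS LOGON SRVNPS01$ SRVDC1 ACME 1
2023-05-28 / 06:38:57 Computer account added/changed/deleted. medium ANONYMOUS LOGON VD-OPERATOR1$ SRVDC1 ACME 1"""

# Assumed pattern (a sketch, not the answer's original regex).
pattern = re.compile(
    r"^\d{4}-\d{2}-\d{2} / \d{2}:\d{2}:\d{2}\s+"  # date / time prefix, matched and consumed
    r"(.+?)"                                      # group 1: alert name, lazy quantifier
    r"\.?\s+(?:low|medium|high|critical)\b",      # optional period, then a severity level
    re.MULTILINE,                                 # let ^ match at the start of every line
)

# With exactly one capturing group, findall returns the group 1 strings.
alert_names = pattern.findall(text)
print(alert_names)
```

With this sketch the third entry comes out as `User account locked out multiple login errors`. The original pattern fails on that log because the greedy `.*` followed by `(?=\.)` backtracks to the last period on the line, which sits inside `john.smith.admin` rather than after the alert name.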
In Python, this could be
and would yield
See a demo on ideone.com.