Regex: excluding specific text from a capture group

unftdfkk  posted on 2023-10-22  in Other

I have the following lines from a Cisco configuration:

access-list office extended permit tcp host 1.1.1.1 host 2.2.2.2
access-list home extended permit object-group PROTOS4 host 4.4.4.4 host 5.5.5.5

I am trying to write a parser, and I have the following Python code:

import re

acl_general_structure = (
    r'access-list\s+(?P<policy_name>[A-Za-z0-9\-\_]+)\s+extended\s+(?P<action>permit|deny)'
    r'\s'
    r'(?P<protocol>[a-zA-Z0-9]+|(?:object-group\s[A-Za-z\d]+))'
    r'\s'
    r'host\s(?P<source>(?:[0-9]{1,3}\.){3}[0-9]{1,3})'
    r'\s'
    r'host\s(?P<destination>(?:[0-9]{1,3}\.){3}[0-9]{1,3})'
)

f_in_name="xx.config"
f_out_name=f_in_name + ".csv"

with open(f_in_name, "r", encoding="utf8") as f:
    for line in f.readlines():
        result=re.match(acl_general_structure,line)
        if result:
            print(result.groupdict())

With the current code, the output is:

{'policy_name': 'office', 'action': 'permit', 'protocol': 'tcp', 'source': '1.1.1.1', 'destination': '2.2.2.2'}
{'policy_name': 'home', 'action': 'permit', 'protocol': 'object-group PROTOS4', 'source': '4.4.4.4', 'destination': '5.5.5.5'}

What I want is:

{'policy_name': 'office', 'action': 'permit', 'protocol': 'tcp', 'source': '1.1.1.1', 'destination': '2.2.2.2'}
{'policy_name': 'home', 'action': 'permit', 'protocol': 'PROTOS4', 'source': '4.4.4.4', 'destination': '5.5.5.5'}

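That is, the "object-group" string should be dropped from the capture group. Is this actually possible, or do I need to handle it separately in Python by splitting the 'protocol' value of the dictionary? I know how to process strings in Python, but I would like to handle this at the regex level.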
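For reference, the post-processing fallback mentioned above could be sketched as follows. The helper name `strip_object_group` is hypothetical, and the sketch assumes the prefix is separated from the group name by exactly one space, which is how it appears in the sample lines.

```python
def strip_object_group(fields):
    """Return a copy of fields with a leading 'object-group ' removed from 'protocol'."""
    prefix = 'object-group '
    cleaned = dict(fields)  # copy so the caller's dict is left untouched
    protocol = cleaned.get('protocol', '')
    if protocol.startswith(prefix):
        cleaned['protocol'] = protocol[len(prefix):]
    return cleaned


# Dictionary as produced by result.groupdict() for the second sample line
parsed = {'policy_name': 'home', 'action': 'permit',
          'protocol': 'object-group PROTOS4',
          'source': '4.4.4.4', 'destination': '5.5.5.5'}
print(strip_object_group(parsed))
```

This works, but it needs an extra pass over every match, which is what handling it in the regex avoids.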

oaxa6hgo  #1

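Change

r'(?P<protocol>[a-zA-Z0-9]+|(?:object-group\s[A-Za-z\d]+))'

to

r'(?:object-group\s)?(?P<protocol>[a-zA-Z0-9]+)'

The optional "object-group " prefix is now matched by a non-capturing group placed outside the named group, so only the name that follows it ends up in protocol.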
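A minimal sketch of that change applied to the pattern from the question, run against the two sample lines instead of a file. The names `parse_acl` and `sample_lines` are only for illustration.

```python
import re

# Pattern from the question; only the protocol part differs: the optional
# "object-group " prefix is matched outside the named group.
acl_general_structure = (
    r'access-list\s+(?P<policy_name>[A-Za-z0-9\-_]+)\s+extended\s+(?P<action>permit|deny)'
    r'\s'
    r'(?:object-group\s)?(?P<protocol>[a-zA-Z0-9]+)'
    r'\s'
    r'host\s(?P<source>(?:[0-9]{1,3}\.){3}[0-9]{1,3})'
    r'\s'
    r'host\s(?P<destination>(?:[0-9]{1,3}\.){3}[0-9]{1,3})'
)

sample_lines = [
    'access-list office extended permit tcp host 1.1.1.1 host 2.2.2.2',
    'access-list home extended permit object-group PROTOS4 host 4.4.4.4 host 5.5.5.5',
]


def parse_acl(line):
    """Return the named groups as a dict, or None if the line does not match."""
    result = re.match(acl_general_structure, line)
    return result.groupdict() if result else None


for line in sample_lines:
    print(parse_acl(line))
```

For the second line, protocol should come out as PROTOS4 rather than "object-group PROTOS4".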

ssm49v7z  #2

Thank you, that works nicely. Now I have this:

regex_ip_address = ( 
    r'([0-9]{1,3}\.){3}[0-9]{1,3}'
)

acl_general_structure = (
    r'access-list\s+(?P<policy_name>[\w\-]+)\s+extended\s+(?P<action>permit|deny)'
    r'\s'
    r'(?P<protocol>(\w+)|object-group\s[\w]+)'
    r'\s'
    r'(?:host\s|(?:object-group\s))?(?P<source>({ipaddr})\s?(?:{ipaddr})?|any)'
    r'\s'
    r'(?:host\s|(?:object-group\s))?(?P<destination>({ipaddr})\s?(?:{ipaddr})?|any)'
    r'(?:\s|$)'
    r'(?:(?:object-group|eq)?\s?(?P<service>[\w]+))?'.format(ipaddr=regex_ip_address)
)

And it works well.
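One caveat: as far as I can tell, the protocol part as posted here still keeps object-group\s inside (?P<protocol>...), so for the second sample line protocol would again contain "object-group PROTOS4". Below is a sketch with the prefix moved outside the named group, as in answer #1. It is a simplified variant and not a drop-in copy of the pattern above: the IP sub-pattern is made non-capturing, the optional trailing part is tightened, and the third sample line (network plus mask, any, eq 443) is a made-up example rather than a line from the question.

```python
import re

regex_ip_address = r'(?:[0-9]{1,3}\.){3}[0-9]{1,3}'

# .format() is applied to the whole concatenated pattern; the only braces in
# the template are the {ipaddr} placeholders.
acl_general_structure = (
    r'access-list\s+(?P<policy_name>[\w\-]+)\s+extended\s+(?P<action>permit|deny)'
    r'\s'
    r'(?:object-group\s)?(?P<protocol>\w+)'
    r'\s'
    r'(?:host\s|object-group\s)?(?P<source>{ipaddr}(?:\s{ipaddr})?|any)'
    r'\s'
    r'(?:host\s|object-group\s)?(?P<destination>{ipaddr}(?:\s{ipaddr})?|any)'
    r'(?:\s+(?:object-group\s|eq\s)?(?P<service>\w+))?'
).format(ipaddr=regex_ip_address)

sample_lines = [
    'access-list office extended permit tcp host 1.1.1.1 host 2.2.2.2',
    'access-list home extended permit object-group PROTOS4 host 4.4.4.4 host 5.5.5.5',
    # Hypothetical line, not from the question
    'access-list dmz extended permit tcp 10.0.0.0 255.255.255.0 any eq 443',
]


def parse_acl(line):
    """Return the named groups as a dict, or None if the line does not match."""
    result = re.match(acl_general_structure, line)
    return result.groupdict() if result else None


for line in sample_lines:
    print(parse_acl(line))
```

When a line has no trailing service, the service key is present in the dictionary with the value None.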
