regex: How to use a regular expression to capture the words/numbers two lines after a defined pattern

ua4mk5z4  posted on 2023-10-22  in Other

I have a body of text:

text = 'DEFINING PATHWAY /n \s 126 HI. DEPT. INIT\n E. OFFICER\'S SIGNATURE\n N/A\n EQUIPMENT 1x23\n N/A\ A.  B.  C.  D.  E.  F.  G.  FAULT REPAIR12 _ 23 E. OFFICER\'S SIGNATURE\n 00039501\n EQUIPMENT_SERVER_COMPUTER 12 85\n 1362\n \n 7\n 1\n 2\n CV031\n \n 4. METER\nSECTION IV SUMMARY\n 46. SPEC PURPOSE  A.  B.  C.  D.  E.  F.  G.  H.\n38. MAINT. MAN\n 39. RATE\n 40. SUPERVISOR\n 41. PRI\n 42. T/A\nC. DIV. INIT\n D. DEPT. INIT\n E. OFFICER\'S SIGNATURE\n X4549803078\n EQUIPMENT3_4d HTA CHIP\n X&6a\n'

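I want to find the words and numbers that appear two lines after every OFFICER\'S SIGNATURE\n, capturing the text and numbers up to the next \n.
For example: OFFICER\'S SIGNATURE\n N/A\n EQUIPMENT 1x23\n
Skip N/A\n and find EQUIPMENT 1x23.
Expected matches for this text: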

matches = ['EQUIPMENT 1x23', 'EQUIPMENT_SERVER_COMPUTER 12 85', 'EQUIPMENT3_4d HTA CHIP']

Below is the list of unsuccessful patterns I tried; each returned an empty list.

equipment_pattern =  r"OFFICER\'S SIGNATURE\n(?:. *?\n)*(.+?)\n"
equipment_pattern =  r"OFFICER\'S SIGNATURE\n[^\n]*\n(.+?)\n"
equipment_pattern =  r"OFFICER\'S SIGNATURE\n(?:[^\n]*\n)*(.+?)\n"
equipment_pattern = r"SIGNATURE\n[^\\n]*\\n((?:[^\\n]*\\n)*.+?)\\n"
equipment_pattern = r"OFFICER\'S SIGNATURE\\n([^\\n]+)"
equipment_pattern = r"OFFICER\'S SIGNATURE\n([^\n]*?)\n"

dxpyg8gm  #1

>>> import re
>>> equipment_pattern =  r"OFFICER'S SIGNATURE\n[^\n]*\n(.+)"
>>> re.findall(equipment_pattern, text)
[' EQUIPMENT 1x23', ' EQUIPMENT_SERVER_COMPUTER 12 85', ' EQUIPMENT3_4d HTA CHIP']

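This matches OFFICER'S SIGNATURE, skips the next line, and then captures the line after that.
The trailing \n is not needed. With the ? removed, the final .+ is greedy and matches up to the next newline, because . does not match a newline by default. Note that each captured string keeps its leading space.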
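
The captured strings in the output above still carry a leading space. Here is a minimal sketch of one way to drop it inside the pattern, assuming a shortened sample string in the same shape as the question's text. The names `sample` and `pattern` are illustrative and not part of the original answer.

```python
import re

# Shortened sample in the same shape as the question's text
# (an assumption, not the full original string).
sample = (
    "D. DEPT. INIT\n E. OFFICER'S SIGNATURE\n N/A\n EQUIPMENT 1x23\n N/A\n"
    "FAULT REPAIR E. OFFICER'S SIGNATURE\n 00039501\n"
    " EQUIPMENT_SERVER_COMPUTER 12 85\n 1362\n"
)

# Anchor on the signature line, skip exactly one line, then consume any
# leading spaces/tabs before capturing the rest of the following line.
pattern = re.compile(r"OFFICER'S SIGNATURE\n[^\n]*\n[ \t]*(.+)")

matches = pattern.findall(sample)
print(matches)
```

Using `[ \t]*` rather than `\s*` keeps the skip on a single line, since `\s` would also consume newlines. Calling `str.strip()` on each result afterwards would be an equally reasonable choice.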


eqqqjvef  #2

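If you use Python's excellent PyPI regex module (roughly comparable to the PCRE regex engine, except that it also supports variable-length lookbehind), you can try matching the string with the following regular expression, which uses no capture group: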

import regex  # third-party module: pip install regex

regex.findall(r"\bOFFICER'S SIGNATURE\n [^ ]* \K.*", text)
  #=> ['EQUIPMENT 1x23', 'EQUIPMENT_SERVER_COMPUTER 12 85',
  #    'EQUIPMENT3_4d HTA CHIP']

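The \K directive (not supported by Python's standard re engine) resets the start of the match to the current position of the engine's string pointer and discards any previously consumed characters from the reported match.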
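
For readers limited to the standard library, here is a rough sketch of how the same idea could be approximated with `re`: the part that would follow `\K` is wrapped in a capture group instead. The sample string is a shortened, illustrative fragment, and the names `k_supported`, `sample`, `pattern` and `results` are assumptions made for this example.

```python
import re

# The standard-library engine rejects \K as a bad escape.
try:
    re.compile(r"SIGNATURE\n [^ ]* \K.*")
    k_supported = True
except re.error:
    k_supported = False

# Shortened, illustrative sample (not the question's full text).
sample = "E. OFFICER'S SIGNATURE\n X4549803078\n EQUIPMENT3_4d HTA CHIP\n X&6a\n"

# Same pattern, with \K swapped for a capture group around the wanted part.
# [^ ]* also consumes the newline, so it skips the whole intermediate line.
pattern = re.compile(r"\bOFFICER'S SIGNATURE\n [^ ]* (.*)")

results = [m.group(1) for m in pattern.finditer(sample)]
print(k_supported, results)
```

The trade-off is that the result now comes from group 1 rather than from the whole match, which is the difference the answer's "no capture group" remark refers to.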


bxpogfeg  #3

re.findall(r'EQUIPMENT[A-Z0-9_]*.*?(?=\n)', text)

['EQUIPMENT 1x23', 'EQUIPMENT_SERVER_COMPUTER 12 85', 'EQUIPMENT3_4d HTA CHIP']
