I have a body of text:
text = 'DEFINING PATHWAY /n \s 126 HI. DEPT. INIT\n E. OFFICER\'S SIGNATURE\n N/A\n EQUIPMENT 1x23\n N/A\ A. B. C. D. E. F. G. FAULT REPAIR12 _ 23 E. OFFICER\'S SIGNATURE\n 00039501\n EQUIPMENT_SERVER_COMPUTER 12 85\n 1362\n \n 7\n 1\n 2\n CV031\n \n 4. METER\nSECTION IV SUMMARY\n 46. SPEC PURPOSE A. B. C. D. E. F. G. H.\n38. MAINT. MAN\n 39. RATE\n 40. SUPERVISOR\n 41. PRI\n 42. T/A\nC. DIV. INIT\n D. DEPT. INIT\n E. OFFICER\'S SIGNATURE\n X4549803078\n EQUIPMENT3_4d HTA CHIP\n X&6a\n'
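I want to find the words and numbers two lines after OFFICER\'S SIGNATURE\n
each time, pulling the words and numbers up to the next \n.
For example, in OFFICER\'S SIGNATURE\n N/A\n EQUIPMENT 1x23\n
skip N/A\n
and find EQUIPMENT 1x23.
Expected matches in this text: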
matches = ['EQUIPMENT 1x23', 'EQUIPMENT_SERVER_COMPUTER 12 85', 'EQUIPMENT3_4d HTA CHIP']
Below are the unsuccessful patterns I have tried, all of which return an empty list:
equipment_pattern = r"OFFICER\'S SIGNATURE\n(?:. *?\n)*(.+?)\n"
equipment_pattern = r"OFFICER\'S SIGNATURE\n[^\n]*\n(.+?)\n"
equipment_pattern = r"OFFICER\'S SIGNATURE\n(?:[^\n]*\n)*(.+?)\n"
equipment_pattern = r"SIGNATURE\n[^\\n]*\\n((?:[^\\n]*\\n)*.+?)\\n"
equipment_pattern = r"OFFICER\'S SIGNATURE\\n([^\\n]+)"
equipment_pattern = r"OFFICER\'S SIGNATURE\n([^\n]*?)\n"
3 Answers
dxpyg8gm1#
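The pattern this answer refers to is not preserved on this page. The following is a minimal sketch that fits the description below, using only the standard re module. The exact pattern is an assumption, as is the [ \t]* that strips the leading space seen in the sample, and the text is a shortened excerpt of the string from the question.

```python
import re

# Shortened excerpt of the question's sample text (three records).
text = (
    "HI. DEPT. INIT\n E. OFFICER'S SIGNATURE\n N/A\n EQUIPMENT 1x23\n N/A\n"
    " FAULT REPAIR12 _ 23 E. OFFICER'S SIGNATURE\n 00039501\n"
    " EQUIPMENT_SERVER_COMPUTER 12 85\n 1362\n"
    " D. DEPT. INIT\n E. OFFICER'S SIGNATURE\n X4549803078\n"
    " EQUIPMENT3_4d HTA CHIP\n X&6a\n"
)

# 1. match the marker and its newline
# 2. .*\n skips exactly one line (N/A, an ID, ...)
# 3. [ \t]* drops leading blanks (assumption based on the sample)
# 4. (.*) is greedy and stops at the next newline, since . never matches \n
equipment_pattern = r"OFFICER'S SIGNATURE\n.*\n[ \t]*(.*)"

matches = re.findall(equipment_pattern, text)
print(matches)
# ['EQUIPMENT 1x23', 'EQUIPMENT_SERVER_COMPUTER 12 85', 'EQUIPMENT3_4d HTA CHIP']
```

Because the pattern contains a single capture group, re.findall returns only the captured line for each match.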
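This matches OFFICER'S SIGNATURE, then skips one more line, then captures the line after that. The trailing \n is not needed: removing the ? makes the final .* greedy, so it matches up to the next newline.
eqqqjvef2#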
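If you use Python's excellent PyPI regex module (roughly similar to the PCRE regex engine, except that it also supports variable-length lookbehind), you can try matching the string with the following regular expression (it uses no capture group):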
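The exact expression from this answer is not shown on this page, so the pattern below is an assumption written to match the description: it relies on \K instead of a capture group. The sketch uses the third-party regex package when it is installed and otherwise falls back to the standard re module with a capture group. The text is a shortened excerpt of the question's sample, and the [ \t]* that strips the leading space is also an assumption.

```python
import re

try:
    import regex  # third-party package from PyPI (pip install regex)
except ImportError:
    regex = None  # fall back to the standard library below

# Shortened excerpt of the question's sample text (three records).
text = (
    "HI. DEPT. INIT\n E. OFFICER'S SIGNATURE\n N/A\n EQUIPMENT 1x23\n N/A\n"
    " FAULT REPAIR12 _ 23 E. OFFICER'S SIGNATURE\n 00039501\n"
    " EQUIPMENT_SERVER_COMPUTER 12 85\n 1362\n"
    " D. DEPT. INIT\n E. OFFICER'S SIGNATURE\n X4549803078\n"
    " EQUIPMENT3_4d HTA CHIP\n X&6a\n"
)

if regex is not None:
    # Everything before \K is consumed but left out of the reported match,
    # so the whole match is just the equipment line.
    matches = regex.findall(r"OFFICER'S SIGNATURE\n.*\n[ \t]*\K.+", text)
else:
    # The standard re module has no \K; a capture group plays the same role.
    matches = re.findall(r"OFFICER'S SIGNATURE\n.*\n[ \t]*(.+)", text)

print(matches)
# ['EQUIPMENT 1x23', 'EQUIPMENT_SERVER_COMPUTER 12 85', 'EQUIPMENT3_4d HTA CHIP']
```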
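The \K directive (not supported by Python's standard re engine) resets the start of the match to the engine's current position in the string and discards any previously consumed characters.
bxpogfeg3#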