What is the best way to match optional whole words with Python regex?

Posted by 3zwtqj6y on 2023-03-04 in Python


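I use regular expressions a lot, but usually in the same few ways. Occasionally I run into a case where I want to capture strings that contain optional whole words. I came up with the approach below, but I suspect there is a better way and I just don't know what it is. An example is a string like this: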
For the purposes of this order, the sum of $5,476,958.00 is the estimated total costs of the initial unit well covered hereby as dry hole and for the purposes of this order, the sum of $12,948,821.00 is the estimated total costs of such initial unit well as a producing well
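My goal is to capture the two portions of the string that start with a dollar sign $ and end with the word dry or prod. In the example the whole word is producing, but sometimes it is a variation of that word, such as production, so matching prod is sufficient. The captured result should be: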
['$5,476,958.00 is the estimated total costs of the initial unit well covered hereby as dry', '$12,948,821.00 is the estimated total costs of such initial unit well as a prod']
I got there with a rather inelegant expression:
[val[0] for val in re.findall('(\$[0-9,\.]+[a-z ,]+total cost.*?(dry|prod)+)', line, flags=re.IGNORECASE)]
Is there a better, more correct way to achieve this?

regex

Source: https://stackoverflow.com/questions/75566389/what-is-the-best-way-to-match-optional-whole-words-with-python-regex

1 Answer


#1 bvhaajcl

We can use re.findall here:

import re

inp = "For the purposes of this order, the sum of $5,476,958.00 is the estimated total costs of the initial unit well covered hereby as dry hole and for the purposes of this order, the sum of $12,948,821.00 is the estimated total costs of such initial unit well as a producing well"
# No capturing groups in the pattern, so findall returns each full match as a string
matches = re.findall(r'\$\d{1,3}(?:,\d{3})*(?:\.\d+)?.*?\b(?:dry|prod)', inp)
print(matches)

This prints:

['$5,476,958.00 is the estimated total costs of the initial unit well covered hereby as dry',
 '$12,948,821.00 is the estimated total costs of such initial unit well as a prod']
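
A side note on why the version in the question needed val[0]: when a pattern contains more than one capturing group, re.findall returns one tuple per match (one element per group) instead of the matched string. The sketch below illustrates this on a shortened, made-up input string; the patterns are simplified for the illustration and are not the exact ones from the question or the answer.

```python
import re

# Shortened sample text, invented for this illustration.
text = "the sum of $5,476,958.00 is the estimated total costs as dry hole"

# Two capturing groups: findall returns a list of tuples,
# so the full text has to be picked out by index (the val[0] in the question).
with_groups = re.findall(r"(\$[0-9,.]+.*?(dry|prod))", text)

# Only a non-capturing (?:...) group: findall returns the whole match as a string.
without_groups = re.findall(r"\$[0-9,.]+.*?(?:dry|prod)", text)

print(with_groups)
print(without_groups)
```

Switching the alternation to a non-capturing group is what removes the need for the list comprehension.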

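Here is an explanation of the regex pattern used:

\$ matches the currency symbol $
\d{1,3} matches 1 to 3 digits
(?:,\d{3})* followed by zero or more thousands groups
(?:\.\d+)? followed by an optional decimal part
.*? lazily matches everything up to the nearest
\b(?:dry|prod) dry or prod at the start of a word (the \b word boundary lets the keyword be a prefix, as in producing)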
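
The breakdown above can also be kept next to the pattern itself by compiling it with re.VERBOSE, which ignores whitespace in the pattern and allows # comments. This is only a sketch of an equivalent spelling; the name AMOUNT_TO_KEYWORD is made up for the example.

```python
import re

# Same pattern as in the answer, written out with re.VERBOSE so that
# each piece carries its own comment. AMOUNT_TO_KEYWORD is a hypothetical name.
AMOUNT_TO_KEYWORD = re.compile(
    r"""
    \$                # literal dollar sign
    \d{1,3}           # one to three leading digits
    (?:,\d{3})*       # zero or more thousands groups
    (?:\.\d+)?        # optional decimal part
    .*?               # lazily consume as little text as possible
    \b(?:dry|prod)    # dry or prod at the start of a word
    """,
    re.VERBOSE,
)

inp = (
    "For the purposes of this order, the sum of $5,476,958.00 is the estimated "
    "total costs of the initial unit well covered hereby as dry hole and for the "
    "purposes of this order, the sum of $12,948,821.00 is the estimated total "
    "costs of such initial unit well as a producing well"
)

matches = AMOUNT_TO_KEYWORD.findall(inp)
print(matches)
```

If the full word is wanted (producing rather than prod), appending \w* after the final group should extend each match to the end of that word.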

Answered 2023-03-04
