Python regex: capturing multi-line groups

Posted by qcbq4gxm on 2023-01-06 in Python


I want to use a regex (Python 3) to extract the information belonging to each NAME_ group.

AB_ NAME_ 111 "fruit";
AB_ EX_ 111 first_fruit "banana";
AB_ EX_ 111 second_fruit_info "Do you like 
apple

or grape?";
AB_ EX_ 111 third_fruit "tomato";
AB_ NAME_ 120 "food";
AB_ NAME_ 130 "clothes";
AB_ EX_ 130 first_clothes "t-shirt";

The result I want is three groups:

AB_ NAME_ 111 "fruit";
AB_ EX_ 111 first_fruit "banana";
AB_ EX_ 111 second_fruit_info "Do you like 
apple

or grape?";
AB_ EX_ 111 third_fruit "tomato";

AB_ NAME_ 120 "food";

AB_ NAME_ 130 "clothes";
AB_ EX_ 130 first_clothes "t-shirt";

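The groups are split by their ID (the NAME_ ID). I would really appreciate any suggestions. Thank you.
I tried to capture each AB_ NAME_ record followed by zero or more AB_ EX_ records, as shown below, but it failed. I also tried the re.S and re.M flags, but they did not help.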

AB_ NAME_ \d+ .+;\n(AB_ EX_ \d+ (.|\n)+;\n)*
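
One likely reason this attempt over-matches is that `(.|\n)+` is greedy, so it can run past later AB_ NAME_ records until the last `;` followed by a newline. The sketch below first shows that over-match, then tries a tighter pattern with the same "NAME_ followed by zero or more EX_" shape. It assumes every value is a double-quoted string with no embedded double quotes, which the question does not state; `attempt`, `strict` and `strict_blocks` are illustrative names.

```python
import re

text = """AB_ NAME_ 111 "fruit";
AB_ EX_ 111 first_fruit "banana";
AB_ EX_ 111 second_fruit_info "Do you like
apple

or grape?";
AB_ EX_ 111 third_fruit "tomato";
AB_ NAME_ 120 "food";
AB_ NAME_ 130 "clothes";
AB_ EX_ 130 first_clothes "t-shirt";"""

# The pattern from the question: the greedy (.|\n)+ crosses record boundaries.
attempt = re.search(r"AB_ NAME_ \d+ .+;\n(AB_ EX_ \d+ (.|\n)+;\n)*", text)
print("AB_ NAME_ 120" in attempt.group(0))  # the first match swallows the next group

# Tighter variant (assumption: values are "..." with no inner double quotes).
# [^"]* may span several lines, and (?:...) keeps findall returning whole matches.
strict = re.compile(r'AB_ NAME_ \d+ "[^"]*";(?:\nAB_ EX_ \d+ \w+ "[^"]*";)*')
strict_blocks = strict.findall(text)
for block in strict_blocks:
    print(repr(block))
```

Note that `re.findall` returns the captured groups rather than the whole match when the pattern contains capturing groups, which is why the tighter variant uses only non-capturing groups.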

regex

Source: https://stackoverflow.com/questions/75013283/python-regex-capture-mullti-line-groups

1 answer


wnavrhmk1#

You should use re.DOTALL so that . also matches newline characters; then you can use findall() to get all the results, like this:

import re

text = """AB_ NAME_ 111 "fruit";
AB_ EX_ 111 first_fruit "banana";
AB_ EX_ 111 second_fruit_info "Do you like
apple

or grape?";
AB_ EX_ 111 third_fruit "tomato";
AB_ NAME_ 120 "food";
AB_ NAME_ 130 "clothes";
AB_ EX_ 130 first_clothes "t-shirt";"""

regex = r"AB_ NAME_.*?(?=AB_ NAME_|$)"

print(re.findall(regex, text, re.DOTALL))

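The regular expression pattern is: AB_ NAME_.*?(?=AB_ NAME_|$)
The lazy .*? consumes as little text as possible, and the lookahead (?=AB_ NAME_|$) stops it at the next AB_ NAME_ or at $. Because re.MULTILINE is not set, $ here matches only at the end of the whole string (or just before a final trailing newline), not at the end of every line.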
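
If the text AB_ NAME_ could ever appear inside a quoted value, the lookahead above would split there as well. A minimal sketch of a line-anchored variant follows, assuming every record starts at the beginning of a line; `block_pattern` and `split_into_groups` are illustrative names, not part of the original answer. With re.MULTILINE, ^ matches at each line start, so \Z is used in place of $ to mean the end of the whole string.

```python
import re

text = """AB_ NAME_ 111 "fruit";
AB_ EX_ 111 first_fruit "banana";
AB_ EX_ 111 second_fruit_info "Do you like
apple

or grape?";
AB_ EX_ 111 third_fruit "tomato";
AB_ NAME_ 120 "food";
AB_ NAME_ 130 "clothes";
AB_ EX_ 130 first_clothes "t-shirt";"""

# DOTALL lets . cross newlines; MULTILINE makes ^ match at every line start.
# \Z matches only at the very end of the string, even under MULTILINE.
block_pattern = re.compile(
    r"^AB_ NAME_ .*?(?=^AB_ NAME_ |\Z)",
    re.DOTALL | re.MULTILINE,
)

def split_into_groups(source):
    """Return a dict mapping each NAME_ ID to the text of its group."""
    groups = {}
    for block in block_pattern.findall(source):
        name_id = re.match(r"AB_ NAME_ (\d+)", block).group(1)
        groups[name_id] = block.strip()  # drop the trailing newline
    return groups

groups = split_into_groups(text)
for name_id, block in groups.items():
    print(name_id, repr(block))
```

Keying the result by ID is only one possible design; a plain list from findall() is enough if the order of the groups is all that matters.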

Answered 2023-01-06
