re.findall()'s behavior for patterns with a single capturing group followed by a quantifier [duplicate]

Posted by niknxzdl on 2023-11-20 in Other

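This question already has answers here:

re.findall behaves weird (3 answers)
Closed 15 days ago.
If the pattern has only one capturing group and that group is quantified, re.findall() does not seem to return the actual matches.
For example: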

import re

p1 = r"(apple)*"
t1 = "appleappleapple"

re.findall(p1, t1)  # returns ['apple', '']

whereas, with the same arguments,

[i.group() for i in re.finditer(p1, t1)]

produces the exact matches, i.e. ['appleappleapple', ''].
Another thing that confuses me is this behavior:

t2 = "appleapplebananaapplebanana"

re.findall(p1, t2)  # returns ['apple', '', '', '', '', '', '', 'apple', '', '', '', '', '', '', '']

Where exactly do these extra empty strings come from? And why does findall() capture them before the end of the input string?

regex

Source: https://stackoverflow.com/questions/77400426/re-findalls-behavior-for-patterns-with-a-single-capturing-group-followed-by-a

2 Answers

unhi4e5o #1

I believe @Deepak's answer does not fully address the question.
Let's look at the first code snippet:

import re

p1 = r"(apple)*"
t1 = "appleappleapple"

re.findall(p1, t1)  # returns ['apple', '']

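Let's clarify what we expect. I expected the output of the snippet above to be ['appleappleapple', '']. That is because findall should match greedily all the way to the end, and because it only returns non-overlapping matches, the only other match should be the empty string.

But why is the output different?

As the documentation mentions, if one or more groups are present in the pattern, findall returns the groups instead of the whole match. A repeated group keeps only its last iteration, which is why you get apple as the match rather than appleappleapple.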
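To make the difference between the whole match and the group visible, here is a minimal sketch using the question's own pattern and input. The variable names `pairs` and `whole` are only for illustration. It prints both `group(0)` and `group(1)` for each match, then shows that a non-capturing group `(?:...)` lets findall return the whole match again.

```python
import re

p1 = r"(apple)*"
t1 = "appleappleapple"

# group(0) is the whole match; group(1) is the last iteration of the
# repeated group, or None when the group did not take part in the match.
pairs = [(m.group(0), m.group(1)) for m in re.finditer(p1, t1)]
print(pairs)  # [('appleappleapple', 'apple'), ('', None)]

# With a non-capturing group the pattern has no groups to return,
# so findall falls back to the whole match.
whole = re.findall(r"(?:apple)*", t1)
print(whole)  # ['appleappleapple', '']
```

findall reports a group that did not participate as an empty string, which is why the second element of the original result is '' rather than None.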
Now, regarding the third snippet, I believe Deepak's answer does address it. For completeness, though, I will cover it here as well:

t2 = "appleapplebananaapplebanana"

re.findall(p1, t2)  # returns ['apple', '', '', '', '', '', '', 'apple', '', '', '', '', '', '', '']

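Because you used *, the pattern matches zero or more repetitions of the group. At every position where apple does not start, it therefore matches the empty string, and that is why you get all those empty strings.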
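One way to see where each empty string comes from is to record the span of every match. This is a rough sketch, assuming Python 3.7 or later (where an empty match may directly follow a non-empty one); the names `spans`, `non_empty` and `empty_positions` are illustrative only.

```python
import re

t2 = "appleapplebananaapplebanana"

# Record where each match starts and ends, along with the matched text.
spans = [(m.start(), m.end(), m.group(0)) for m in re.finditer(r"(apple)*", t2)]
non_empty = [s for s in spans if s[2]]
empty_positions = [s[0] for s in spans if not s[2]]

print(non_empty)        # [(0, 10, 'appleapple'), (16, 21, 'apple')]
print(empty_positions)  # [10, 11, 12, 13, 14, 15, 21, 22, 23, 24, 25, 26, 27]
```

The empty matches sit at every position inside each banana and at the very end of the string, which accounts for the six and seven empty strings in the findall result.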

agyaoht7 #2

Let's first try to understand how * works with (). This construct tries to match "zero or more occurrences of the preceding character or group".

import re

p1 = r"(banana)*"
t1 = "apple"
res = re.findall(p1, t1)  # returns ['', '', '', '', '', '']
print(len(t1))   # prints 5
print(len(res))  # prints 6

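The len results show where the empty strings come from. Since the pattern may match zero occurrences, it succeeds with an empty match at every position in the string. That gives 6 empty strings: one before each of the 5 characters, plus one at the end of the string.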
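The count can be checked by listing the match positions directly. This is a small sketch under the same assumptions as the snippet above; `positions` and `on_empty` are illustrative names.

```python
import re

t1 = "apple"

# Each match is empty, so its start and end offsets are equal.
positions = [m.span() for m in re.finditer(r"(banana)*", t1)]
print(positions)  # [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
print(len(positions) == len(t1) + 1)  # True

# Even an empty input has one position, so it still yields one empty match.
on_empty = re.findall(r"(banana)*", "")
print(on_empty)  # ['']
```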
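Now, how do we get rid of those empty strings? I played around with the pattern a bit, and this is what I found after removing the (). Note that without the parentheses, * applies only to the final e rather than to the whole word apple.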

p1 = r"apple*"
t1 = "appleappleapple"
res = re.findall(p1, t1) #returns ['apple', 'apple', 'apple']

Likewise, with the same pattern:

[i.group() for i in re.finditer(p1, t1)] # returns ['apple', 'apple', 'apple']

t2 = "appleapplebananaapplebanana"
re.findall(p1, t2) #returns ['apple', 'apple', 'apple']
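As a hedged alternative to dropping the parentheses, the sketch below first shows what r"apple*" really matches, then uses a non-capturing group with + so the quantifier still covers the whole word and empty matches are ruled out. The test string "appl appleee" and the names `loose` and `runs` are made up for illustration.

```python
import re

# r"apple*" means "appl" followed by zero or more "e",
# not a repeated "apple".
loose = re.findall(r"apple*", "appl appleee")
print(loose)  # ['appl', 'appleee']

t2 = "appleapplebananaapplebanana"

# (?:...) groups without capturing, so findall returns whole matches;
# + requires at least one repetition, so no empty strings appear.
runs = re.findall(r"(?:apple)+", t2)
print(runs)  # ['appleapple', 'apple']
```

Whether consecutive repetitions should come back as one run (as here) or as separate items (as with r"apple") depends on what the caller needs.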
