regex: What is the difference between re.search and re.match?

Posted by uklbhaso on 2022-11-18 in Other


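What is the difference between the search() and match() functions in the Python re module?
I've read the Python 2 documentation (Python 3 documentation), but I never seem to remember it. I keep having to look it up and re-learn it. I'm hoping that someone will answer it clearly with examples so that (perhaps) it will stick in my head. Or at least I'll have a better place to return to with my question, and it will take less time to re-learn it.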

regex

Source: https://stackoverflow.com/questions/180986/what-is-the-difference-between-re-search-and-re-match

9 answers


jdgnovmf1#

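re.match is anchored at the beginning of the string. That has nothing to do with newlines, so it is not the same as using ^ in the pattern.
As the re.match documentation says:
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding MatchObject instance. Return None if the string does not match the pattern; note that this is different from a zero-length match.
Note: If you want to locate a match anywhere in string, use search() instead.
re.search searches the entire string, as the documentation says:

Scan through string looking for a location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.

So if you need to match at the beginning of the string, or to match the entire string, use match. It is faster. Otherwise use search.
The documentation has a specific section on match vs. search that also covers multiline strings:
Python offers two different primitive operations based on regular expressions: match checks for a match only at the beginning of the string, while search checks for a match anywhere in the string (this is what Perl does by default).
Note that match may differ from search even when using a regular expression beginning with '^': '^' matches only at the start of the string, or in MULTILINE mode also immediately following a newline. The "match" operation succeeds only if the pattern matches at the start of the string regardless of mode, or at the starting position given by the optional pos argument regardless of whether a newline precedes it.
Now, enough talk. Time to see some example code: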

# example code:
import re

string_with_newlines = """something
someotherthing"""

print(re.match('some', string_with_newlines))  # matches
print(re.match('someother',
               string_with_newlines))  # won't match
print(re.match('^someother', string_with_newlines,
               re.MULTILINE))  # also won't match
print(re.search('someother',
                string_with_newlines))  # finds something
print(re.search('^someother', string_with_newlines,
                re.MULTILINE))  # also finds something

m = re.compile('thing$', re.MULTILINE)

print(m.match(string_with_newlines))  # no match
print(m.match(string_with_newlines, pos=4))  # matches
# The flag is already compiled into m; a second positional
# argument to search() would be read as pos, not as a flag.
print(m.search(string_with_newlines))  # also matches
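
The point about the optional pos argument can be checked in isolation. The following is a minimal sketch; the names text, plain and anchored are made up for illustration. It shows that match() anchors at pos, while a '^' in the pattern keeps referring to the real start of the string.

```python
import re

text = "dog"

# match() anchors at the position where scanning starts (pos), not only at index 0.
plain = re.compile("o")
print(plain.match(text))         # None: "o" is not at index 0
print(plain.match(text, 1))      # match object: "o" sits at index 1, where pos points

# "^" still means the real start of the string (or, with MULTILINE, the position
# right after a newline), so moving pos does not help a "^" pattern.
anchored = re.compile("^o")
print(anchored.search(text, 1))  # None
```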


7uhlpewt2#

search ⇒ finds something anywhere in the string and returns a match object.
match ⇒ finds something at the beginning of the string and returns a match object.
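
As a small illustration of that summary, here is a sketch that prints where each call found its match. The helper describe() and the sample string are hypothetical and not part of the re module.

```python
import re

def describe(result):
    """Return the matched span, or None when there was no match."""
    return result.span() if result else None

text = "say hello"

print(describe(re.search("hello", text)))  # (4, 9)
print(describe(re.match("hello", text)))   # None
print(describe(re.match("say", text)))     # (0, 3)
```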


hs1rzwqc3#

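match is much faster than search, so instead of regex.search("word") you can use regex.match((.*?)word(.*?)) and gain tons of performance if you are working with millions of samples.
This comment by @ivan_bilan under the accepted answer above got me wondering whether such a hack really speeds anything up, so let's find out how many tons of performance you actually gain.
I prepared the following test suite: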

import random
import re
import string
import time

LENGTH = 10
LIST_SIZE = 1000000

def generate_word():
    word = [random.choice(string.ascii_lowercase) for _ in range(LENGTH)]
    word = ''.join(word)
    return word

wordlist = [generate_word() for _ in range(LIST_SIZE)]

start = time.time()
[re.search('python', word) for word in wordlist]
print('search:', time.time() - start)

start = time.time()
[re.match('(.*?)python(.*?)', word) for word in wordlist]
print('match:', time.time() - start)

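I ran 10 measurements (1M, 2M, ..., 10M words) and plotted the results (plot not reproduced here).

The plot shows that searching for the pattern 'python' is faster than matching the pattern '(.*?)python(.*?)'.

Python is smart. Avoid trying to be smarter.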
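
For anyone who wants to repeat the comparison, here is a scaled-down sketch that uses timeit instead of manual time.time() bookkeeping. The list size, the fixed seed, the repeat count and the function names are assumptions chosen to keep the run short; they are not the settings behind the measurements above, and timings will vary by machine.

```python
import random
import re
import string
import timeit

# Assumed sizes, kept small so the sketch runs quickly.
LENGTH = 10
LIST_SIZE = 10_000

random.seed(0)  # fixed seed so repeated runs use the same words
wordlist = [
    "".join(random.choices(string.ascii_lowercase, k=LENGTH))
    for _ in range(LIST_SIZE)
]

# Precompile both patterns so only the matching itself is timed.
search_pattern = re.compile("python")
match_pattern = re.compile("(.*?)python(.*?)")

def run_search():
    return [search_pattern.search(word) for word in wordlist]

def run_match():
    return [match_pattern.match(word) for word in wordlist]

search_time = timeit.timeit(run_search, number=5)
match_time = timeit.timeit(run_match, number=5)
print(f"search: {search_time:.3f}s")
print(f"match:  {match_time:.3f}s")
```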


szqfcxe24#

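re.search searches for the pattern throughout the string, whereas re.match does not search for it: it tries the pattern only at the start of the string, and if the pattern does not match there, it fails.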


e4eetjau5#

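You can refer to the example below to understand how re.match and re.search work:

import re

a = "123abc"
t = re.match("[a-z]+", a)   # None: the string starts with digits
t = re.search("[a-z]+", a)  # match object for "abc"

re.match returns None, but re.search returns a match for abc.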


q8l4jmvw6#

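The difference is that re.match() misleads anyone accustomed to Perl, grep, or sed regular expression matching, and re.search() does not. :-)
More soberly, as John D. Cook remarks, re.match() "behaves as if every pattern has ^ prepended." In other words, re.match('pattern') is equivalent to re.search('^pattern') (as long as re.MULTILINE is not set). So it anchors the left side of the pattern. But it does not anchor the right side: that still requires a terminating $.
Frankly, given the above, I think re.match() should be deprecated. I would be interested to know the reasons it should be retained.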
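
The left-anchored but not right-anchored behaviour is easy to check. A minimal sketch with an arbitrary sample string follows; re.fullmatch() (available since Python 3.4) is shown for comparison, because it anchors both ends.

```python
import re

text = "abcdef"

print(re.match("abc", text))        # match object: only the start has to match
print(re.match("abc$", text))       # None: "$" requires the string to end after "abc"
print(re.fullmatch("abc", text))    # None: fullmatch() requires the whole string to match
print(re.fullmatch("abc.*", text))  # match object covering "abcdef"
```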


fiei3ece7#

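Much shorter:

search scans the whole string.
match checks only the beginning of the string.

The following example shows it:

>>> import re
>>> a = "123abc"
>>> print(re.match("[a-z]+", a))
None
>>> re.search("[a-z]+", a).group()
'abc'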


edqdpe6u8#

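re.match attempts to match the pattern at the beginning of the string. re.search attempts to match the pattern throughout the string until it finds a match.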
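
"Until it finds a match" means search() stops at the first (leftmost) match. A small sketch with a made-up sample string shows this next to re.findall(), which collects every match:

```python
import re

text = "a1b22c333"

first = re.search(r"\d+", text)
print(first.group(), first.span())  # 1 (1, 2)
print(re.findall(r"\d+", text))     # ['1', '22', '333']
print(re.match(r"\d+", text))       # None: the string does not start with a digit
```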


yzxexxkh9#

Quick answer

import re

re.search('test', ' test')      # returns a truthy match object (the search may start at any index)

re.match('test', ' test')       # returns None (matching is attempted only at index 0)
re.match('test', 'test')        # returns a truthy match object (match at index 0)
