regex: Return "Error" if the regex finds no match

Posted by oewdyzsn on 2023-01-27 in Other


I have a string:

link = "http://www.this_is_my_perfect_url.com/blah_blah/blah_blah?=trololo"

I have a function that returns the domain name from that URL, or '' if it finds none:

import re

def get_domain(url):
    domain_regex = re.compile(r"://(.*?)/|$")
    return re.findall(domain_regex, str(url))[0].replace('www.', '')

get_domain(link)

It returns:

this_is_my_perfect_url.com

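The |$ alternative makes it return '' if the regex does not match anything.
Is there a way to build a default value of Error into the regex itself, so that I don't have to do any check in the function?
That is, if link = "there_is_no_domain_in_here", the function should return Error instead of ''.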

regex

Source: https://stackoverflow.com/questions/56357454/return-error-if-no-match-found-by-regex

3 Answers


hmae6n7t1#

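As mentioned in the comments above, you cannot put anything in the regex that will do this for you. What you can do is check whether the result of re.findall, after the extra formatting is applied, is empty. If it is, no match was found, so return Error.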

import re
link = "http://www.this_is_my_perfect_url.com/blah_blah/blah_blah?=trololo"

def get_domain(url):
    # Raw string avoids invalid-escape warnings; ":" and "/" need no escaping
    domain_regex = re.compile(r"://(.*?)/|$")

    # Take the first match and strip 'www.' from it
    domain = re.findall(domain_regex, str(url))[0].replace('www.', '')

    # Return the domain, or 'Error' if the result is empty
    return domain or 'Error'

print(get_domain(link))
print(get_domain('there_is_no_domain_in_here'))

The output will be:

this_is_my_perfect_url.com
Error
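
To see why the "or 'Error'" fallback works, it may help to look at what re.findall actually returns for each input. The following is only a small sketch: it uses the same pattern as the function above, written as a raw string without the redundant backslashes (which should be equivalent), and the variable names are my own.

```python
import re

# Same pattern as in the function above, written as a raw string.
domain_regex = re.compile(r"://(.*?)/|$")

# With a URL, the first match captures the host; the |$ branch then adds
# one more (empty) match at the end of the string.
with_domain = domain_regex.findall(
    "http://www.this_is_my_perfect_url.com/blah_blah/blah_blah?=trololo"
)

# Without a URL, only the |$ branch matches, so the captured group is empty.
without_domain = domain_regex.findall("there_is_no_domain_in_here")

print(with_domain)     # ['www.this_is_my_perfect_url.com', '']
print(without_domain)  # ['']
```

Since the first element is an empty string when nothing matched, and an empty string is falsy, "or" falls through to 'Error'. Note that a URL with no slash after the host (for example "http://example.com") also ends up on the |$ branch with this pattern.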

Answered 2023-01-27

ma8fv8wu2#

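Just my two cents: combining a lazy quantifier (.*?) with an alternation (|$) is very inefficient. You can improve your expression considerably: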

://[^/]+

In addition, as of Python 3.8 you can use the walrus operator, like this:

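import re

def find_domain(your_string):
    if (m := re.search("://[^/]+", your_string)) is not None:
        # found something: return the matched text (including the leading "://")
        return m.group(0)
    else:
        return "Error"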

And no, with a regex alone you cannot get something out of a string that simply is not there.
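
Putting the two suggestions together, a version of the asker's function might look roughly like the sketch below. This is my own assumption of how the pieces fit, not something tested against real-world URLs: the capture group, the DOMAIN_RE name and the handling of "www." are additions that the pattern above does not contain. It needs Python 3.8 or later because of the walrus operator.

```python
import re

# The pattern suggested above, plus a capture group so the "://" is left out.
# DOMAIN_RE is a hypothetical name chosen for this sketch.
DOMAIN_RE = re.compile(r"://([^/]+)")

def get_domain(url):
    """Return the host part of url without a leading 'www.', or 'Error'."""
    if (m := DOMAIN_RE.search(str(url))) is not None:
        host = m.group(1)
        # Strip only a leading "www." rather than every occurrence.
        return host[4:] if host.startswith("www.") else host
    return "Error"

print(get_domain("http://www.this_is_my_perfect_url.com/blah_blah/blah_blah?=trololo"))
print(get_domain("there_is_no_domain_in_here"))
```

Because [^/]+ does not require a trailing slash, this also handles a URL such as "http://example.com", which the original pattern would miss.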

Answered 2023-01-27

xpszyzbs3#

Why not use urlparse to get the domain?

# Python 2:
# from urlparse import urlparse
# Python 3:
from urllib.parse import urlparse

def get_domain(url):
    parsed_uri = urlparse(url)
    domain = parsed_uri.netloc
    return domain or "ERROR"

url = 'there_is_no_domain_in_here'
print(get_domain(url))
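
One thing to keep in mind is that netloc still contains the "www." prefix, and that urlparse only recognises a network location when the string contains "//". A possible variant that also strips the prefix, as the function in the question does, is sketched below. The startswith check is my own addition and the inputs are only examples.

```python
from urllib.parse import urlparse

def get_domain(url):
    # netloc is an empty string when the input has no "//" part.
    netloc = urlparse(str(url)).netloc
    if not netloc:
        return "ERROR"
    # Drop a leading "www." only; leave the rest of the host untouched.
    return netloc[4:] if netloc.startswith("www.") else netloc

print(get_domain("http://www.this_is_my_perfect_url.com/blah_blah/blah_blah?=trololo"))
print(get_domain("there_is_no_domain_in_here"))
print(get_domain("www.example.com/path"))  # no scheme, so no netloc
```

With this sketch, a scheme-less input such as "www.example.com/path" is treated as having no domain, which may or may not be what you want.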

Answered 2023-01-27
