Python: How do I get a specific part of any URL using urlparse()?

Posted by epfja78i on 2023-01-24 in Python


I have a URL like this:

url = 'https://grabagun.com/firearms/handguns/semi-automatic-handguns/glock-19-gen-5-polished-nickel-9mm-4-02-inch-barrel-15-rounds-exclusive.html'

When I use the urlparse() function, I get the following result:

>>> url = urlparse(url) 
>>> url.path
'/firearms/handguns/semi-automatic-handguns/glock-19-gen-5-polished-nickel-9mm-4-02-inch-barrel-15-rounds-exclusive.html'

Is it possible to get something like this:
path1 = "firearms"
path2 = "handguns"
path3 = "semi-automatic-handguns"
I don't want any text that ends with .html.

python

Source: https://stackoverflow.com/questions/74880162/how-to-get-specific-part-of-any-url-using-urlparse

4 Answers


hkmswyz61#

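Some of your paths have a single / and some have //, so if you want to apply this directly to the full URL, first normalize them all to the same form. For url.path, you can do it directly like this: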

url = '/firearms/handguns/semi-automatic-handguns/glock-19-gen-5-polished-nickel-9mm-4-02-inch-barrel-15-rounds-exclusive.html'

url = url.split('/')
url = list(filter(None, url))  # remove empty elements
url.pop()
print(url)

Output list:

['firearms', 'handguns', 'semi-automatic-handguns']
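To illustrate the point about // versus /, here is a minimal sketch. The URL below is a made-up variation of the one in the question, with a doubled slash added on purpose; it is only an assumption for the demo. It shows that splitting on '/' produces empty strings, which is why the filter step matters.

```python
from urllib.parse import urlparse

# Hypothetical URL with a doubled slash, made up for this illustration
url = 'https://grabagun.com/firearms//handguns/semi-automatic-handguns/example-item.html'

# Splitting on '/' yields empty strings for the leading and the doubled slash
raw = urlparse(url).path.split('/')
segments = [part for part in raw if part]  # keep only non-empty pieces
segments.pop()  # drop the last piece (the page name), as in the answer's code
print(raw)
print(segments)
```

Note that pop() removes the last piece unconditionally, so this only makes sense when the URL is known to end with a page name.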

Part two

If you want them as variables, simply loop over the list and create the variables:

for n, val in enumerate(url):
    globals()["path%d"%n] = val

print(path1)

Output (enumerate() starts counting at 0, so path0 is 'firearms' and path1 is 'handguns'):

handguns
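Writing into globals() works, but a dictionary is usually easier to inspect and does not touch the module namespace. A small sketch, assuming the list from the first part; the name `named` is only illustrative, and `start=1` is used so that the numbering matches the path1, path2, path3 asked for in the question:

```python
segments = ['firearms', 'handguns', 'semi-automatic-handguns']

# Number the keys from 1 so that path1 is the first segment
named = {f"path{n}": val for n, val in enumerate(segments, start=1)}

print(named["path1"])
print(named["path3"])
```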

2023-01-24

hc2pp10m2#

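# url is the ParseResult returned by urlparse(); skip the empty string before the leading '/'
path_list = [part for part in url.path.split('/') if part]

# Drop the last part when it is the .html page name
if path_list and ".html" in path_list[-1]:
    path_list = path_list[:-1]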

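This gives you a list with each part of the path as one entry, excluding the last part if it contains .html.
You can adjust it depending on your needs or on how specific or general your use case is.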
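One possible adjustment: `".html" in ...` also matches when .html appears in the middle of a part. If you only care about the ending, str.endswith() is stricter, and it accepts a tuple of suffixes. A rough sketch; the helper name `drop_page_segment` and the extra '.htm' suffix are my own assumptions, not something from the question:

```python
def drop_page_segment(path_list, suffixes=('.html', '.htm')):
    """Return path_list without its last item when that item ends with one of suffixes."""
    if path_list and path_list[-1].endswith(suffixes):
        return path_list[:-1]
    return path_list

# Page name at the end: it is removed
trimmed = drop_page_segment(['firearms', 'handguns', 'item.html'])
# '.html' appears inside the last part but is not its ending: nothing is removed
untouched = drop_page_segment(['firearms', 'html-guides', 'my.html.d'])
print(trimmed)
print(untouched)
```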

赞(0）回复(0）举报 2023-01-24

mrwjdhj33#

You can put them all into a list by splitting the path on /:

url.path.split('/')

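If you want them in path1, path2, and so on, you can assign the values from the list to variables. Because the path starts with /, the first item of the split is an empty string, so the slice starts at index 1.

path1, path2, path3 = url.path.split('/')[1:4]

I only took the first 3 path parts here. If you don't want the .html text, you can always get the index of the last value and use it in a list slice, like this: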

paths = url.path.split('/')
html_text_index = len(paths)  # default: keep every part
if '.html' in paths[-1]:
    html_text_index = len(paths) - 1  # index of the last value
no_html_paths = paths[:html_text_index]
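If the number of path parts can vary, starred unpacking may be more forgiving than a fixed slice. A minimal sketch using the URL from the question; the name `rest` is just illustrative, and the unpacking still raises ValueError when there are fewer than three parts:

```python
from urllib.parse import urlparse

url = 'https://grabagun.com/firearms/handguns/semi-automatic-handguns/glock-19-gen-5-polished-nickel-9mm-4-02-inch-barrel-15-rounds-exclusive.html'

# Drop the empty string that comes from the leading '/'
segments = [s for s in urlparse(url).path.split('/') if s]

# The first three parts get names; everything else is collected in rest
path1, path2, path3, *rest = segments
print(path1, path2, path3)
print(rest)
```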

2023-01-24

dgiusagp4#

A simple way to solve this is:

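from urllib.parse import urlparse

# Strip the leading "/" and keep every part that does not end with ".html"
path = urlparse(url).path[1:]
splittedpath = [sp for sp in path.split("/") if not sp.endswith(".html")]
"""
['firearms', 'handguns', 'semi-automatic-handguns']
"""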

You can access the items like this:

print(splittedpath[0]) # 0,1,2... 
# firearms

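What we do here is remove the first character "/" of the path with urlparse(url).path[1:], split the path string at every "/" using .split("/"), and check whether each resulting piece ends with ".html"; if it does not, we keep it.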
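As an alternative to splitting by hand, the standard library's pathlib can take the path apart. This is only a sketch of a different technique, not what the answer above does: PurePosixPath.parts returns the components with the root '/' first, and .suffix gives the extension of the last component.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

url = 'https://grabagun.com/firearms/handguns/semi-automatic-handguns/glock-19-gen-5-polished-nickel-9mm-4-02-inch-barrel-15-rounds-exclusive.html'

path = PurePosixPath(urlparse(url).path)
parts = path.parts[1:]  # parts[0] is the root '/', so skip it
if path.suffix == '.html':
    parts = parts[:-1]  # drop the page name
print(list(parts))
```

Unlike the list comprehension above, this only looks at the last component, so a directory name ending in .html in the middle of the path would be kept.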

2023-01-24
