Python - The most efficient way to get a HEAD request in Python 3

Posted by jjjwad0x on 2023-08-08 in Python


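I found this code, and it seems reliable and efficient to me, but unfortunately it targets Python 2 and uses urllib2, while everyone says requests is faster. What is the Python 3 equivalent of the code below (or something more efficient or more reliable)?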

#!/usr/bin/env python
#-*- coding:utf-8 -*-

import sys
import urllib2

# This script uses HEAD requests (with fallback in case of 405)
# to follow the redirect path up to the real URL
# (c) 2012 Filippo Valsorda - FiloSottile
# Released under the GPL license

class HeadRequest(urllib2.Request):
    def get_method(self):
        return "HEAD"

class HEADRedirectHandler(urllib2.HTTPRedirectHandler):
    """
    Subclass the HTTPRedirectHandler to make it use our
    HeadRequest also on the redirected URL
    """
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        if code in (301, 302, 303, 307):
            newurl = newurl.replace(' ', '%20')
            newheaders = dict((k,v) for k,v in req.headers.items()
                              if k.lower() not in ("content-length", "content-type"))
            return HeadRequest(newurl,
                               headers=newheaders,
                               origin_req_host=req.get_origin_req_host(),
                               unverifiable=True)
        else:
            raise urllib2.HTTPError(req.get_full_url(), code, msg, headers, fp)

class HTTPMethodFallback(urllib2.BaseHandler):
    """
    Fallback to GET if HEAD is not allowed (405 HTTP error)
    """
    def http_error_405(self, req, fp, code, msg, headers):
        fp.read()
        fp.close()

        newheaders = dict((k,v) for k,v in req.headers.items()
                          if k.lower() not in ("content-length", "content-type"))
        return self.parent.open(urllib2.Request(req.get_full_url(),
                                         headers=newheaders,
                                         origin_req_host=req.get_origin_req_host(),
                                         unverifiable=True))

# Build our opener
opener = urllib2.OpenerDirector()
for handler in [urllib2.HTTPHandler, urllib2.HTTPDefaultErrorHandler,
                HTTPMethodFallback, HEADRedirectHandler,
                urllib2.HTTPErrorProcessor, urllib2.HTTPSHandler]:
    opener.add_handler(handler())

response = opener.open(HeadRequest(sys.argv[1]))

print(response.geturl())

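By the way, a HEAD request isn't actually what I need. I only want to know whether a link is broken (on some sites, if you give them a bad code, they redirect you back to the site's home page, and I want my code to recognize that too). A HEAD request was the most efficient solution that came to mind, so if you know a better way, I'd appreciate that as well.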

python-3.x

Source: https://stackoverflow.com/questions/46484194/python-the-most-efficient-way-to-get-head-request-in-python-3

1 Answer

f8rj6qna1#

Take a look at Requests: http://docs.python-requests.org/en/master/
To perform a HEAD request, just do:

import requests

r = requests.head('http://www.example.com')

You can then inspect the returned object for whatever you need. For example, the status code:

print(r.status_code)

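The script in the question also follows redirects and falls back to GET when the server answers HEAD with a 405. A minimal sketch of the same behavior with Requests might look like the following. The function name `resolve_url`, the optional `session` parameter, and the 10-second timeout are illustrative choices, not part of the original answer. Note that `requests.head` does not follow redirects by default, so `allow_redirects=True` is passed explicitly.

```python
import requests


def resolve_url(url, session=None, timeout=10):
    """Follow redirects with HEAD and return (final_url, status_code).

    Falls back to GET when the server rejects HEAD with 405, mirroring
    the fallback in the question's urllib2 script.
    `session` is a hypothetical hook for reusing a requests.Session.
    """
    http = requests.Session() if session is None else session

    # HEAD requests in Requests do not follow redirects unless asked to.
    response = http.head(url, allow_redirects=True, timeout=timeout)

    if response.status_code == 405:
        # stream=True defers the body download; closing the response
        # releases the connection without reading the body.
        response = http.get(url, allow_redirects=True, stream=True,
                            timeout=timeout)
        response.close()

    # response.url is the URL reached after any redirects were followed.
    return response.url, response.status_code
```

Calling `resolve_url('http://www.example.com')` returns the URL reached after redirects together with its status code. Passing one shared `requests.Session` when checking many links lets Requests reuse connections to the same host.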

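Update: If you want to check whether a page is alive, you need to perform a GET request. I have seen cases where a HEAD request returns a 200 response while a GET request to the same URL returns a 500.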
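For the broken-link check described in the question, one possible sketch uses GET as suggested in the update and treats a redirect to the site's home page as a broken link. The helper names `landed_on_homepage` and `link_is_broken`, the `session` parameter, and the rule "the final URL has an empty path and no query string" are assumptions made for illustration. Sites differ in how they handle bad links, so the heuristic may need adjusting.

```python
from urllib.parse import urlsplit

import requests


def landed_on_homepage(requested_url, final_url):
    """Return True if a non-root URL ended up at the root of a site."""
    requested = urlsplit(requested_url)
    final = urlsplit(final_url)
    asked_for_root = requested.path in ("", "/") and not requested.query
    at_root = final.path in ("", "/") and not final.query
    return at_root and not asked_for_root


def link_is_broken(url, session=None, timeout=10):
    """Heuristic: network failure, error status, or homepage redirect."""
    http = requests.Session() if session is None else session
    try:
        # GET rather than HEAD, per the update above. stream=True defers
        # the body download so only the status line and headers are read.
        response = http.get(url, allow_redirects=True, stream=True,
                            timeout=timeout)
    except requests.RequestException:
        # DNS failure, refused connection, timeout, too many redirects...
        return True
    try:
        if response.status_code >= 400:
            return True
        return landed_on_homepage(url, response.url)
    finally:
        # Release the connection without downloading the body.
        response.close()
```

With `stream=True` the response body is not fetched unless it is accessed, so the cost stays close to that of a HEAD request while still exercising the server's GET handling.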
