Scrapy: authenticating with an HTTP proxy

oprakyz7  posted on 2022-11-09 in Other

I can set an HTTP proxy with request.meta['proxy'], but how do I authenticate with that proxy?
Specifying the user and password like this does not work:

request.meta['proxy'] = 'http://user:pass@123.456.2323:2222'

From looking around, I probably have to send request.headers['Proxy-Authorization'], but what format should the value be in?

wwtsj6pe #1

The username and password are joined as "username:password", and that string is base64-encoded.

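import base64
from random import choice

# Set the location of the proxy.
# Each line of proxies.txt has the form user:pass@ip:port
proxy_string = choice(self._get_proxies_from_file('proxies.txt'))
# Split on the last '@' so a password containing '@' still parses
proxy_items = proxy_string.rsplit('@', 1)
request.meta['proxy'] = "http://%s" % proxy_items[1]

# Set up basic authentication for the proxy.
# base64.encodestring was removed in Python 3.9 and appended a newline;
# b64encode works on bytes and returns bytes, so encode and decode around it.
user_pass = base64.b64encode(proxy_items[0].encode('utf-8')).decode('ascii')
request.headers['Proxy-Authorization'] = 'Basic ' + user_pass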
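
To make the header format concrete, here is a minimal standalone sketch of the same idea. The helper name `split_proxy` and the sample address are made up for illustration and are not part of Scrapy; the function only shows how a `user:pass@ip:port` string maps to a proxy URL and a `Proxy-Authorization` value.

```python
import base64


def split_proxy(proxy_string):
    """Split 'user:pass@ip:port' into (proxy_url, proxy_authorization_value).

    Hypothetical helper for illustration only.
    """
    # Split on the last '@' so a password that itself contains '@' still parses
    user_pass, host_port = proxy_string.rsplit("@", 1)
    # Basic auth: base64 of the raw "user:pass" bytes, prefixed with "Basic "
    token = base64.b64encode(user_pass.encode("utf-8")).decode("ascii")
    return "http://" + host_port, "Basic " + token


proxy_url, auth_value = split_proxy("user:pass@127.0.0.1:2222")
print(proxy_url)   # value for request.meta['proxy']
print(auth_value)  # value for request.headers['Proxy-Authorization']
```

The two returned values would then be assigned to `request.meta['proxy']` and `request.headers['Proxy-Authorization']` as in the snippet above.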

r1zk6ea1 #2

The w3lib module has a very handy function for exactly this use case.

from w3lib.http import basic_auth_header

request.meta["proxy"] = "http://192.168.1.1:8050"
request.headers["Proxy-Authorization"] = basic_auth_header(proxy_user, proxy_pass)

This is also mentioned in a blog article by Zyte (the maintainers of Scrapy).
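
One way to apply this to every request is a small downloader middleware. The following is only a sketch under stated assumptions: the class name `ProxyAuthMiddleware` and the setting names `PROXY_URL`, `PROXY_USER` and `PROXY_PASS` are hypothetical, and the `except ImportError` fallback exists purely so the snippet runs even where w3lib is not installed.

```python
import base64

try:
    from w3lib.http import basic_auth_header
except ImportError:
    # Fallback so the sketch runs without w3lib; returns bytes like w3lib does.
    def basic_auth_header(username, password):
        token = base64.b64encode(f"{username}:{password}".encode("latin-1"))
        return b"Basic " + token


class ProxyAuthMiddleware:
    """Hypothetical downloader middleware that routes requests via one proxy."""

    def __init__(self, proxy_url, proxy_user, proxy_pass):
        self.proxy_url = proxy_url
        # Build the header once; it is identical for every request
        self.auth_header = basic_auth_header(proxy_user, proxy_pass)

    @classmethod
    def from_crawler(cls, crawler):
        # PROXY_URL / PROXY_USER / PROXY_PASS are assumed custom settings
        s = crawler.settings
        return cls(s.get("PROXY_URL"), s.get("PROXY_USER"), s.get("PROXY_PASS"))

    def process_request(self, request, spider):
        request.meta["proxy"] = self.proxy_url
        request.headers["Proxy-Authorization"] = self.auth_header
        return None  # let Scrapy continue processing the request
```

To use something like this, the class would be listed in the project's `DOWNLOADER_MIDDLEWARES` setting and the three assumed settings defined in `settings.py`.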
