Is it possible to get √ through this unchanged, or am I asking too much?
import urllib.request
path = 'html'
links = 'links'
with open(links, 'r', encoding='UTF-8') as links:
    for link in links:  # for each link in the file
        print(link)
        with urllib.request.urlopen(link) as linker:  # get the html
            print(linker)
            with open(path, 'ab') as f:  # append the html to the file 'html'
                f.write(linker.read())
Link
https://myanimelist.net/anime/27899/Tokyo_Ghoul_√A
Output
File "PYdown.py", line 7, in <module>
with urllib.request.urlopen(link) as linker:
File "/usr/lib64/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib64/python3.6/urllib/request.py", line 526, in open
response = self._open(req, data)
File "/usr/lib64/python3.6/urllib/request.py", line 544, in _open
'_open', req)
File "/usr/lib64/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/usr/lib64/python3.6/urllib/request.py", line 1392, in https_open
context=self._context, check_hostname=self._check_hostname)
File "/usr/lib64/python3.6/urllib/request.py", line 1349, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "/usr/lib64/python3.6/http/client.py", line 1254, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib64/python3.6/http/client.py", line 1265, in _send_request
self.putrequest(method, url, **skips)
File "/usr/lib64/python3.6/http/client.py", line 1132, in putrequest
self._output(request.encode('ascii'))
UnicodeEncodeError: 'ascii' codec can't encode character '\u221a' in position 29: ordinal not in range(128)
2 Answers
p4tfgftt1#
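You need to quote the Unicode characters in the URL. You have a file containing the list of URLs to open, so split each URL (using urllib.parse.urlsplit()), quote (using urllib.parse.quote()) the host and each part of the path (to split the path you can use pathlib.PurePosixPath.parts), then reassemble the URL (using urllib.parse.urlunsplit()).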
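The steps in this answer can be sketched roughly as follows. This is a minimal illustration, not the answerer's original code: the helper name quote_url is hypothetical, the host is assumed to be plain ASCII and is left untouched, and the whole path is quoted in one call because urllib.parse.quote() leaves "/" alone by default. Passing "%" as a safe character is an assumption that keeps already-encoded links from being encoded twice.

```python
import urllib.parse


def quote_url(url):
    """Percent-encode non-ASCII characters in the path of a URL.

    Hypothetical helper: assumes the host is already ASCII and only
    the path needs quoting.
    """
    # strip() drops the trailing newline that comes with each line of a file
    parts = urllib.parse.urlsplit(url.strip())
    # "/" keeps the path separators, "%" keeps existing escapes intact
    quoted_path = urllib.parse.quote(parts.path, safe="/%")
    # SplitResult is a namedtuple, so _replace() swaps in the new path
    return urllib.parse.urlunsplit(parts._replace(path=quoted_path))


print(quote_url("https://myanimelist.net/anime/27899/Tokyo_Ghoul_√A\n"))
# https://myanimelist.net/anime/27899/Tokyo_Ghoul_%E2%88%9AA
```

In the question's loop, the result of quote_url(link) would be passed to urllib.request.urlopen() in place of the raw line.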
gopyfrb32#
To get Python to handle √, I had to convert √ to %E2%88%9A in the URL instead of having Python read √ as itself. Credit: @Olvin Right
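For reference, %E2%88%9A is simply the percent-encoded form of the three UTF-8 bytes of √ (U+221A). A small sketch using only the standard library shows the round trip; the variable name encoded is illustrative.

```python
import urllib.parse

# The UTF-8 encoding of U+221A is three bytes
print("√".encode("utf-8"))             # b'\xe2\x88\x9a'

# quote() percent-encodes each of those bytes
encoded = urllib.parse.quote("√")
print(encoded)                          # %E2%88%9A

# unquote() reverses the conversion
print(urllib.parse.unquote(encoded))    # √
```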