How do I extract the URL and specific columns from this page using Python?

Posted by sd2nnvve on 2022-10-23 in Python

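https://training.lczero.org/networks/?show_all=1
I want to extract the columns named Number, Run, Network, Elo, and Games from this website. I can do this with pandas, but pd.read_html() does not extract the href values needed to download the data. I tried BeautifulSoup but could not get it to work. I managed to get all the URLs, but I also need the other columns to make sense of them. Can anyone help?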

pandas

Source: https://stackoverflow.com/questions/74163380/how-do-i-extract-the-url-and-specific-columns-from-this-page-using-python

1 Answer

xxls0lw81#

Try:

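from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = "https://training.lczero.org/networks/?show_all=1"
response = requests.get(url, timeout=30)
response.raise_for_status()  # stop early on an HTTP error
soup = BeautifulSoup(response.content, "html.parser")

# Parse the first table on the page; newer pandas versions expect a file-like object.
df = pd.read_html(StringIO(str(soup)))[0]

# Collect the download link from every table cell that directly contains an <a>.
# This assumes exactly one such link per row, so the list length matches the DataFrame.
df["links"] = [
    "https://training.lczero.org" + a["href"] for a in soup.select("td > a")
]

print(df.head())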

Prints:

   Number  Run   Network     Elo  Games  Blocks  Filters                        Time  Ordo Elo                                                                                                         links
0  805799    1  a13e6d41  141.26  12533      15      512  2022-10-22 12:33:33 +00:00         0  https://training.lczero.org/get_network?sha=a13e6d412e4d7a113ca604647a6f56845ad280b5584ede96ca6a7658dba7f897
1  805798    1  d6eea775  138.51  63008      15      512  2022-10-22 11:57:32 +00:00         0  https://training.lczero.org/get_network?sha=d6eea77581d45a0e3bc46203baa10eb94b7e345e15c246f0d18b98b9d5d425f6
2  805797    1  cdffe453  133.00  65478      15      512  2022-10-22 11:20:34 +00:00       133  https://training.lczero.org/get_network?sha=cdffe45321e8a843eabc7c6ee71254647b31b5a8798440035ee2b222acc3162a
3  805796    1  6271053e  131.00  66486      15      512  2022-10-22 10:43:30 +00:00       131  https://training.lczero.org/get_network?sha=6271053e90de21c67a25ba23981d8f03e888a4f7afe543f736a057ebb5d07fec
4  805795    1  0b03a5b0  136.00  63894      15      512  2022-10-22 10:07:32 +00:00       136  https://training.lczero.org/get_network?sha=0b03a5b0dbc019e936f075e6f5eacc603d888970e56bb12c6e747b05fda09b86
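
The approach above relies on there being exactly one link per table row. If some row had no link, or more than one, the list of links would no longer line up with the DataFrame. A minimal sketch of a row-by-row alternative is shown here, which also keeps only the columns the question asks for. The sample HTML, the shortened sha values, and the names SAMPLE_HTML, WANTED, and parse_rows are assumptions made for illustration; the real page layout may differ.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical sample that imitates the table layout; the real page may differ.
SAMPLE_HTML = """
<table>
  <thead>
    <tr><th>Number</th><th>Run</th><th>Network</th><th>Elo</th><th>Games</th><th>Blocks</th></tr>
  </thead>
  <tbody>
    <tr><td>805799</td><td>1</td><td><a href="/get_network?sha=a13e6d41">a13e6d41</a></td><td>141.26</td><td>12533</td><td>15</td></tr>
    <tr><td>805798</td><td>1</td><td>d6eea775</td><td>138.51</td><td>63008</td><td>15</td></tr>
  </tbody>
</table>
"""

# Columns requested in the question.
WANTED = ["Number", "Run", "Network", "Elo", "Games"]


def parse_rows(html, base_url, wanted=WANTED):
    """Return one dict per table row with the wanted columns plus its link."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    headers = [th.get_text(strip=True) for th in table.find_all("th")]
    rows = []
    for tr in table.find_all("tr"):
        cells = tr.find_all("td")
        if not cells:  # the header row contains only <th> cells
            continue
        record = dict(zip(headers, (td.get_text(strip=True) for td in cells)))
        row = {name: record.get(name) for name in wanted}
        # Take the first link found in this row; None when the row has no link.
        link = tr.find("a", href=True)
        row["link"] = urljoin(base_url, link["href"]) if link else None
        rows.append(row)
    return rows


rows = parse_rows(SAMPLE_HTML, "https://training.lczero.org")
for row in rows:
    print(row)
```

Passing the result to pd.DataFrame(rows) should give a frame comparable to the one above. All values are strings at this point, so numeric columns may need a conversion such as pd.to_numeric.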

Answered 2022-10-23
