Python script to extract a table from a URL and output it to CSV

k4ymrczo  posted on 2023-06-27  in  Python

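I want to get the table from this website: https://caniwin.com/poker/omahahilopreALL.php
I want to write a Python script that fetches this data and puts it into a CSV so that I can sort by WinHi %.
The script I currently have does this: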

import requests
import csv
from bs4 import BeautifulSoup

# Fetch the HTML content from the website
url = 'https://caniwin.com/poker/omahahilopreALL.php'
response = requests.get(url)
html_content = response.text

# Parse the HTML
soup = BeautifulSoup(html_content, 'html.parser')

# Find the table
table = soup.find('table')

print(table)

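This prints the table just fine. The problem is that, since I am using LibreOffice, when I try to parse the table and put it into a comma-separated file, the result looks janky or does not work.
For example, this script does not output the data in a way that lets me sort by the value I want: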

import requests
import csv
from bs4 import BeautifulSoup

# Fetch the HTML content from the website
url = 'https://caniwin.com/poker/omahahilopreALL.php'
response = requests.get(url)
html_content = response.text

# Parse the HTML
soup = BeautifulSoup(html_content, 'html.parser')

# Find the table
table = soup.find('table')

# Extract table data
table_data = []
for row in table.find_all('tr'):
    row_data = []
    for cell in row.find_all(['td']):
        row_data.append(cell.text.strip())
    table_data.append(row_data)

# Output table to CSV file
filename = 'output.csv'
with open(filename, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(table_data)

print(f"Table data has been saved to {filename}")
y4ekin9u  #1

You can use pandas to create a DataFrame and save it to CSV:

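import requests
import pandas as pd
from io import StringIO
from bs4 import BeautifulSoup

url = 'https://caniwin.com/poker/omahahilopreALL.php'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

# Rename the first row's <td> cells to <th> so pandas picks them up as column headers
for td in soup.table.tr.find_all('td'):
    td.name = 'th'
# Wrap the HTML in StringIO: recent pandas versions deprecate passing a literal HTML string
df = pd.read_html(StringIO(str(soup.table)))[0]

print(df.head())
df.to_csv('data.csv', index=False)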

Prints:

   Rank   Hole Cards  Overall  WinHi %  TieHi %  WinLo %  TieLo %  Occur %  Cumul %
0     1  Ax Ay 3x 2y  30.7361  18.9909   0.3723  28.0416  14.9648   0.0044   0.0044
1     2  Ax Ay 4x 2y  29.0354  19.1242   0.5388  25.2638  13.4865   0.0044   0.0088
2     3  Ax A- 3x 2-  27.7786  14.6640   0.3909  28.0942  15.0578   0.0088   0.0177
3     4  Ax A- 3- 2x  27.7640  14.6603   0.3902  28.0779  15.0054   0.0088   0.0266
4     5  Ax Ay 5x 2y  27.3725  19.2939   0.6131  22.5001  11.7472   0.0044   0.0310

and saves data.csv (screenshot from LibreOffice):
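
Since the goal in the question is to sort by WinHi %, here is a minimal sketch of sorting the saved CSV numerically using only the standard library. The helper name `sort_csv_by_column` is hypothetical (it is not part of the original answer), and the sample rows are just the five rows printed above. In practice you would open the real data.csv instead of the in-memory sample.

```python
import csv
import io

# Sample rows copied from the df.head() output above; in practice you would
# open the data.csv file written by df.to_csv() instead of this string.
SAMPLE_CSV = """\
Rank,Hole Cards,Overall,WinHi %,TieHi %,WinLo %,TieLo %,Occur %,Cumul %
1,Ax Ay 3x 2y,30.7361,18.9909,0.3723,28.0416,14.9648,0.0044,0.0044
2,Ax Ay 4x 2y,29.0354,19.1242,0.5388,25.2638,13.4865,0.0044,0.0088
3,Ax A- 3x 2-,27.7786,14.6640,0.3909,28.0942,15.0578,0.0088,0.0177
4,Ax A- 3- 2x,27.7640,14.6603,0.3902,28.0779,15.0054,0.0088,0.0266
5,Ax Ay 5x 2y,27.3725,19.2939,0.6131,22.5001,11.7472,0.0044,0.0310
"""


def sort_csv_by_column(src, dst, column, descending=True):
    """Hypothetical helper: copy CSV rows from src to dst, sorted numerically by column."""
    reader = csv.DictReader(src)
    # Convert each value to float so the sort is numeric rather than alphabetical.
    rows = sorted(reader, key=lambda row: float(row[column]), reverse=descending)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return rows


# Sort the sample by WinHi %, highest first, and print the resulting CSV text.
out = io.StringIO()
sorted_rows = sort_csv_by_column(io.StringIO(SAMPLE_CSV), out, "WinHi %")
print(out.getvalue())
```

If you would rather stay in pandas, calling `df.sort_values('WinHi %', ascending=False)` before `df.to_csv(...)` should give the same ordering.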
