I need to use Selenium with Python to store table data as map-like key-value pairs

Posted by webghufk on 2021-07-14 in Java

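I need to use Selenium with Python to format the table data below as shown. The data needs to be stored in a map (a dict) so that it can be compared with data from another table.
Table data: https://www.w3schools.com/html/html_tables.asp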

{"Country": ["Germany", "Mexico", "Austria", "UK", "Canada", "Italy"],
 "Company": ["Alfreds Futterkiste", "Centro comercial Moctezuma", and so on..],
 "Contact": ["Maria Anders", "Francisco Chang", and so on..]
}

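I tried the code below, but got the following output, which shows only the first value for each column.
Can anyone tell me how to do this?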

[('Company', 'Alfreds Futterkiste'), ('Contact', 'Maria Anders'), ('Country', 'Germany')]
from selenium import webdriver

header = []
body = []
driver = webdriver.Chrome()
driver.get("https://www.w3schools.com/html/html_tables.asp")
table = driver.find_elements_by_css_selector("table#customers tbody tr th")
tbody = driver.find_elements_by_css_selector("table#customers tbody tr td")
for row in table:
    header.append(row.text)
for t in tbody:
    body.append(t.text)
result = zip(header, body)
result_list = list(result)
print(result_list)
driver.quit()
Answer #1 (smdncfj3)

Use WebDriverWait() to wait until the table is visible, then use the following logic to build a dictionary from the table data.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

dictitem = {}
driver = webdriver.Chrome()
driver.get("https://www.w3schools.com/html/html_tables.asp")
# Wait up to 5 seconds for the table; a clickable element is also visible.
WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.ID, "customers")))
# find_elements(By..., ...) replaces the find_elements_by_* helpers removed in Selenium 4.3.
headers = driver.find_elements(By.CSS_SELECTOR, "table#customers tbody tr th")
for i, th in enumerate(headers):
    # XPath positions are 1-based, so column i maps to td[i + 1].
    cells = driver.find_elements(By.XPATH, "//table[@id='customers']//tbody//tr//td[{}]".format(i + 1))
    dictitem[th.text] = [cell.text for cell in cells]

print(dictitem)
driver.quit()

Output:

{'Company': ['Alfreds Futterkiste', 'Centro comercial Moctezuma', 'Ernst Handel', 'Island Trading', 'Laughing Bacchus Winecellars', 'Magazzini Alimentari Riuniti'], 'Country': ['Germany', 'Mexico', 'Austria', 'UK', 'Canada', 'Italy'], 'Contact': ['Maria Anders', 'Francisco Chang', 'Roland Mendel', 'Helen Bennett', 'Yoshi Tannamuri', 'Giovanni Rovelli']}
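For context on why the question's attempt printed only the first row: zip() stops at its shortest input, so three headers paired with a flat list of cells yields just three pairs. The sketch below uses hard-coded stand-in lists (an assumption, covering only the first two rows) in place of the Selenium results, and shows that a strided slice over the flat cell list recovers each column without any further browser queries.

```python
# Stand-in data: the header texts and the flat, row-major list of cell texts
# that the question's two CSS selectors would return (first two rows only).
header = ["Company", "Contact", "Country"]
body = [
    "Alfreds Futterkiste", "Maria Anders", "Germany",
    "Centro comercial Moctezuma", "Francisco Chang", "Mexico",
]

# zip() stops at the shorter input, so only the first three cells are paired.
truncated = list(zip(header, body))

# Every len(header)-th cell belongs to the same column, so a strided slice
# starting at the column index collects that column.
n = len(header)
columns = {name: body[i::n] for i, name in enumerate(header)}

print(truncated)
print(columns)
```

This relies on every row having exactly one cell per header; a table with merged or missing cells would need row-by-row handling instead.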

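Option 2: you can also use pandas.

import pandas as pd

# read_html needs an HTML parser installed (lxml, or beautifulsoup4 with html5lib).
# It returns one DataFrame per <table> on the page; [0] selects the first one.
df = pd.read_html("https://www.w3schools.com/html/html_tables.asp")[0]
# orient="list" yields {column: [values]}; the default maps each column to {row_index: value}.
print(df.to_dict(orient="list"))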
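The question also mentions comparing the result with another table. Once both tables are dictionaries of column lists, plain == already works; the sketch below adds a small helper that reports which columns differ. The name diff_tables and the sample data are hypothetical, chosen only for illustration.

```python
def diff_tables(expected, actual):
    """Return {column: (expected_values, actual_values)} for columns that differ.

    Hypothetical helper; a column missing from one side is reported as None.
    """
    differences = {}
    for column in sorted(set(expected) | set(actual)):
        left = expected.get(column)
        right = actual.get(column)
        if left != right:
            differences[column] = (left, right)
    return differences


# Made-up sample data in the same shape as the dictionary built from the table.
table_a = {"Country": ["Germany", "Mexico"], "Contact": ["Maria Anders", "Francisco Chang"]}
table_b = {"Country": ["Germany", "Austria"], "Contact": ["Maria Anders", "Francisco Chang"]}

print(table_a == table_b)             # list comparison is order-sensitive
print(diff_tables(table_a, table_b))
```

If row order should not matter, sorting each list (or comparing sets) before the comparison is one option.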
