Scrape table using Scrapy on Python

Posted by yjghlzjz on 2022-11-09 in Python


After some POST/GET requests using Scrapy, I arrived at a page that contains a table.
I want to scrape this table.
The first two header columns look like this, in HTML and as XPath:
1- (html)

<input type="submit" name="ctl00$cphAppMain$grdReportSales$ctl01$btntext_1" value="Adresse" id="ctl00_cphAppMain_grdReportSales_ctl01_btntext_1" title="Sorter på Adresse" class="greenbutton" style="height:25px;width:190px;">

1- (xpath) //*[@id="ctl00_cphAppMain_grdReportSales_ctl01_btntext_1"]
2- (html)

<input type="submit" name="ctl00$cphAppMain$grdReportSales$ctl01$btntext_2" value="Eierform" id="ctl00_cphAppMain_grdReportSales_ctl01_btntext_2" title="Sorter på Eierform" class="greenbutton" style="height:25px;width:60px;">

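2- (xpath) //*[@id="ctl00_cphAppMain_grdReportSales_ctl01_btntext_2"]
The text I need to scrape from the headers is not the text inside the tag, but the content of the "value" attribute.
The first two rows in the first column look like this:
1- (html)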

<td align="left" style="background-color:#d2d2d2;">Barlindveien 21, 3032 DRAMMEN</td>

1- (xpath) //*[@id="ctl00_cphAppMain_grdReportSales"]/tbody/tr[2]/td[1]
2- (html)

<td align="left" style="background-color:#d2d2d2;">Lindeveien 22, 3055 KROKSTADELVA</td>

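2- (xpath) //*[@id="ctl00_cphAppMain_grdReportSales"]/tbody/tr[3]/td[1]
The text I need to scrape from the rows is the text inside the HTML tag.
I tried using XPath to scrape it, but all I get is empty lists.
Any help? Thanks!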

scrapy

Source: https://stackoverflow.com/questions/74346163/scrape-table-using-scrapy-on-python

1 Answer


zqry0prt1#

Assuming the XPaths you shared in the question are correct:
To get the value attribute:

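from selenium.webdriver.common.by import By
# driver is assumed to be a Selenium WebDriver that has already loaded the page
getValueOfElement = driver.find_element(By.XPATH, '//*[@id="ctl00_cphAppMain_grdReportSales_ctl01_btntext_2"]')
getValueOfElement.get_attribute("value")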

To get the text:

getTextOfElement = driver.find_element(By.XPATH, '//*[@id="ctl00_cphAppMain_grdReportSales"]/tbody/tr[2]/td[1]')
getTextOfElement.text

If you want to extract multiple values, you can use find_elements instead of find_element and loop over the returned list with a for loop.
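The find_elements approach might look like the sketch below. It assumes `driver` is an already-initialised Selenium WebDriver that has loaded the table page; `collect_first_column` is a hypothetical helper name, and the XPath is adapted from the question by dropping the row index so that it matches every row.

```python
def collect_first_column(driver):
    """Return the text of the first cell of every table row that has one."""
    # "xpath" is the string value of Selenium's By.XPATH constant.
    cells = driver.find_elements(
        "xpath", '//*[@id="ctl00_cphAppMain_grdReportSales"]/tbody/tr/td[1]'
    )
    # find_elements returns a (possibly empty) list of WebElement objects.
    return [cell.text for cell in cells]
```

If the header row also uses td cells, its first cell would be included in the result and may need to be skipped.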

Answered on 2022-11-09
