Python: using XPath gives empty output

Posted by c3frrgcw on 2023-01-24 in Python


I want to get the address, but I get an empty result. What am I doing wrong in my XPath? This is the page link: https://www.findtruckservice.com/page/cummins-sales-and-service-farmington-nm-430653

Code trials:

import scrapy
from scrapy import Selector
from scrapy_selenium import SeleniumRequest
from scrapy.http import Request


class TestSpider(scrapy.Spider):
    name = 'test'

    def start_requests(self):
        yield SeleniumRequest(
            url="https://www.findtruckservice.com/search/?city=Florida%2C+CO&mainCat=1&subCat=Truck+Repair&lat=37.0731&lon=-106.247&cat_field=Mobile+Repair+-+Truck+Repair",
            wait_time=3,
            screenshot=True,
            callback=self.parse,
            dont_filter=True,
        )

    def parse(self, response):
        books = response.xpath("//h3//a//@href").extract()
        for book in books:
            url = response.urljoin(book)
            yield Request(url, callback=self.parse_book)

    def parse_book(self, response):
        address = response.xpath("//div[1][@class='threecol align_left card']//div//text()").get()
        yield {
            'address': address
        }


Source: https://stackoverflow.com/questions/75214389/using-xpath-gives-empty-output

2 Answers


#1 vaj7vani

To print the desired text from the website, you need to induce WebDriverWait for visibility_of_element_located(). You can use either of the following locator strategies:

Using XPATH and the text attribute:

driver.get("https://www.findtruckservice.com/page/cummins-sales-and-service-farmington-nm-430653")
print(WebDriverWait(driver, 5).until(EC.visibility_of_element_located((By.XPATH, "//h4[@class='sec-title' and text()='CONTACT']//following::div[@class='container']"))).text)

Using XPATH and get_attribute("textContent"):

driver.get("https://www.findtruckservice.com/page/cummins-sales-and-service-farmington-nm-430653")
print(WebDriverWait(driver, 5).until(EC.visibility_of_element_located((By.XPATH, "//h4[@class='sec-title' and text()='CONTACT']//following::div[@class='container']"))).get_attribute("textContent"))

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
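
As a rough illustration of what an explicit wait does, the sketch below polls a condition until it returns a truthy value or a timeout expires. It is not Selenium's implementation: `wait_until`, `WaitTimeout`, and `element_is_visible` are hypothetical names invented for this example, and the "element" is a stand-in that only appears on the third poll.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never returns a truthy value in time."""


def wait_until(condition, timeout=5.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


# Hypothetical stand-in for a page element that only "appears" on the third poll.
attempts = {"count": 0}


def element_is_visible():
    attempts["count"] += 1
    return "CONTACT block" if attempts["count"] >= 3 else None


found = wait_until(element_is_visible, timeout=2.0, poll_interval=0.05)
print(found)
```

The practical point is that the lookup is retried rather than attempted once, which matters when the page fills in content after the initial load.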

Console output:

Cummins Sales and Service
1101 N Troy King Rd
Farmington, NM
505-327-7331 (primary)
505-326-2948 (fax)
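
The output above contains the business name and phone numbers as well as the address. If only the address is wanted, one possible approach is to split the block by line. The sketch below assumes the layout shown above (name on the first line, phone lines starting with an NNN-NNN-NNNN number); `split_contact_block` and `PHONE_RE` are hypothetical names, and other pages on the site may be laid out differently.

```python
import re

# Sample text copied from the console output above.
contact_text = """Cummins Sales and Service
1101 N Troy King Rd
Farmington, NM
505-327-7331 (primary)
505-326-2948 (fax)"""

# Assumed phone format: NNN-NNN-NNNN at the start of a line.
PHONE_RE = re.compile(r"^\d{3}-\d{3}-\d{4}")


def split_contact_block(text):
    """Split a contact block into name, address, and phone lines."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return {"name": "", "address": "", "phones": []}
    name, rest = lines[0], lines[1:]
    # Everything after the name that is not a phone line is treated as address.
    address_lines = [line for line in rest if not PHONE_RE.match(line)]
    phones = [line for line in rest if PHONE_RE.match(line)]
    return {"name": name, "address": ", ".join(address_lines), "phones": phones}


contact = split_contact_block(contact_text)
print(contact["address"])  # 1101 N Troy King Rd, Farmington, NM
```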

References

Links to useful documentation:

- The get_attribute() method
- The text attribute returns the text of the element.
- Difference between text and innerHTML using Selenium

Answered 2023-01-24

#2 az31mfrm

Try the following:

[...]

address = ' '.join([x.strip() for x in response.xpath("//div[@class='threecol align_left card'][1]/div[@class='container']/text()").extract()])
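
A likely reason this works where `.get()` did not: `.get()` returns only the first matching text node, and in indented HTML that node is often just whitespace. Extracting every text node, stripping each one, and joining the results avoids that. The sketch below shows the idea on plain strings. `sample_nodes` is a hypothetical list shaped like what `.extract()` might return for the container div, and `join_text_nodes` is an illustrative helper that also drops empty pieces so the result has no doubled spaces.

```python
def join_text_nodes(nodes):
    """Strip each text node, drop the empty ones, and join the rest with spaces."""
    cleaned = (node.strip() for node in nodes)
    return " ".join(part for part in cleaned if part)


# Hypothetical text nodes; the real page may yield different ones.
sample_nodes = [
    "\n        ",
    "1101 N Troy King Rd",
    "\n        ",
    "Farmington, NM",
    "\n    ",
]

# Taking only the first node, as .get() does, leaves nothing after stripping.
first_only = sample_nodes[0].strip()
joined = join_text_nodes(sample_nodes)

print(repr(first_only))  # ''
print(joined)            # 1101 N Troy King Rd Farmington, NM
```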

Answered 2023-01-24
