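I'm trying to get the address, but I get an empty result. What am I doing wrong in my XPath? This is the page link: https://www.findtruckservice.com/page/cummins-sales-and-service-farmington-nm-430653
Snapshot of the address:
Code trials: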
import scrapy
from scrapy import Selector
from scrapy_selenium import SeleniumRequest
from scrapy.http import Request


class TestSpider(scrapy.Spider):
    name = 'test'

    def start_requests(self):
        yield SeleniumRequest(
            url="https://www.findtruckservice.com/search/?city=Florida%2C+CO&mainCat=1&subCat=Truck+Repair&lat=37.0731&lon=-106.247&cat_field=Mobile+Repair+-+Truck+Repair",
            wait_time=3,
            screenshot=True,
            callback=self.parse,
            dont_filter=True
        )

    def parse(self, response):
        books = response.xpath("//h3//a//@href").extract()
        for book in books:
            url = response.urljoin(book)
            yield Request(url, callback=self.parse_book)

    def parse_book(self, response):
        address = response.xpath("//div[1][@class='threecol align_left card']//div//text()").get()
        yield {
            'address': address
        }
2 Answers
vaj7vani1#
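To print the desired text from the website, you need to induce WebDriverWait for visibility_of_element_located(). You can use one of the following locator strategies:
Using get_attribute("textContent")
References
Links to useful documentation:
The text attribute returns the text of the element.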
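To illustrate why a textContent-style read can succeed where a query for the first text node comes back blank, here is a minimal sketch. It uses only Python's standard library (html.parser) as a stand-in, not Selenium or Scrapy. The markup in SAMPLE_HTML and the TextNodeCollector helper are hypothetical, and the real page's structure may well differ. The assumption being illustrated is that the first text node inside a container is often just the whitespace before the first child tag, while textContent concatenates every descendant text node.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for the address card.
# The real page's HTML is not shown in the question and may differ.
SAMPLE_HTML = (
    '<div class="threecol align_left card">\n'
    "  <b>Address</b>\n"
    "  <span>123 Example St</span>\n"
    "  <span>Farmington, NM</span>\n"
    "</div>"
)


class TextNodeCollector(HTMLParser):
    """Collect every text node in document order, whitespace included."""

    def __init__(self):
        super().__init__()
        self.nodes = []

    def handle_data(self, data):
        # Called for each run of text between tags, even if it is only whitespace.
        self.nodes.append(data)


def text_nodes(html):
    """Return the list of text nodes found in the given HTML fragment."""
    parser = TextNodeCollector()
    parser.feed(html)
    parser.close()
    return parser.nodes


nodes = text_nodes(SAMPLE_HTML)

# Roughly what "take only the first text node" sees:
# the newline and indentation before the first child tag.
first_node = nodes[0]

# A textContent-style read: concatenate all text nodes, then collapse whitespace.
full_text = " ".join("".join(nodes).split())

print(repr(first_node))
print(full_text)
```

If the real page behaves like this sample, reading the whole element's text (and normalizing whitespace) would return the address, whereas taking only the first text node would look empty.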
az31mfrm2#
Try the following: