Scrapy: extract text from a span without an id

Posted by cpjpxq1n on 2022-10-22 in Python


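I am writing a spider for this site: link. I am trying to get the prices, but I have not been able to: every price selector returns nothing. I can get the title, but not the prices, because the span has no class. I don't understand why nothing comes back, since the same XPath works in the browser.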

imgCss = response.xpath("(//img[contains(@class, 'vtex-product-summary-2-x-imageNormal')]/@src)[2]").get()
title = response.xpath("(//article)[3]//span[contains(@class, 'vtex-product-summary-2-x-productBrand')]/text()").get()
discount = response.xpath("(//article)[3]//span[contains(@class, 'currencyContainer--summary txt-price-responsive')]//text()").get()
price = response.xpath("(//article)[3]//span[contains(@class, 'currencyContainer--summary t-heading-2-s')]//text()").get()


Source: https://stackoverflow.com/questions/74158510/scrapy-extract-text-from-span-without-id

1 answer


nvbavucw1#

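The information you are looking for is available in the page source. You can parse it as JSON and extract what you need. You can start with the code below and pull the relevant fields out of the dictionary.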

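import json

import scrapy


class TestSpider(scrapy.Spider):
    name = 'test'
    allowed_domains = ['elektra.mx']

    def start_requests(self):
        yield scrapy.Request(url="https://www.elektra.mx/telefonia/celulares")

    def parse(self, response):
        # The product list is embedded as JSON-LD in the second
        # application/ld+json element of the page source.
        data = response.xpath("(//*[@type='application/ld+json'])[2]/text()").get()
        if data is None:
            # The block is missing (e.g. the page layout changed); nothing to parse.
            return
        data_json = json.loads(data)

        # Each list element wraps one product dictionary under the "item" key.
        for product in data_json.get("itemListElement", []):
            yield product.get("item")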
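
As a follow-up to the spider above, here is a minimal sketch of how the price might be pulled out of the parsed dictionary, and how the right JSON-LD block could be chosen by its `@type` instead of by position. The helper names `find_item_list` and `iter_prices` are hypothetical, the sample data is made up for illustration, and the `offers` / `price` / `lowPrice` keys are assumed from the schema.org vocabulary rather than confirmed against the real page, so check the actual output first.

```python
import json
from typing import Any, Iterable, Iterator, Optional


def find_item_list(raw_blocks: Iterable[str]) -> Optional[dict]:
    """Return the first JSON-LD block whose @type is "ItemList", or None."""
    for raw in raw_blocks:
        try:
            block = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip blocks that are not valid JSON
        if isinstance(block, dict) and block.get("@type") == "ItemList":
            return block
    return None


def iter_prices(item_list: dict) -> Iterator[dict]:
    """Yield {"name": ..., "price": ...} for each product in the list.

    Assumes each product keeps its price under "offers", either as
    "price" (Offer) or "lowPrice" (AggregateOffer).
    """
    for entry in item_list.get("itemListElement", []):
        item = entry.get("item") or {}
        offers = item.get("offers") or {}
        if isinstance(offers, list):
            # Some pages publish a list of offers; take the first one.
            offers = offers[0] if offers else {}
        price: Any = offers.get("price", offers.get("lowPrice"))
        yield {"name": item.get("name"), "price": price}


# Hypothetical sample data for illustration only; not taken from the real site.
SAMPLE_BLOCKS = [
    '{"@type": "BreadcrumbList", "itemListElement": []}',
    json.dumps({
        "@type": "ItemList",
        "itemListElement": [
            {"@type": "ListItem", "position": 1,
             "item": {"@type": "Product", "name": "Phone A",
                      "offers": {"@type": "AggregateOffer", "lowPrice": 1999.0}}},
            {"@type": "ListItem", "position": 2,
             "item": {"@type": "Product", "name": "Phone B",
                      "offers": {"@type": "Offer", "price": 2499.0}}},
        ],
    }),
]

if __name__ == "__main__":
    item_list = find_item_list(SAMPLE_BLOCKS)
    if item_list is not None:
        for row in iter_prices(item_list):
            print(row)
```

Inside the spider, the list of raw blocks could come from `response.xpath("//*[@type='application/ld+json']/text()").getall()`, which avoids depending on the block being the second one on the page.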

Answered 2022-10-22
