Extracting ids with Scrapy

lc8prwob  posted on 2023-08-05  in Other

I want to extract the ids, but my spider does not give me any usable values. Can you suggest a solution?


import scrapy
from scrapy.http import Request
from selenium import webdriver
from scrapy.http import HtmlResponse
import time
from scrapy_selenium import SeleniumRequest
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule

class BarSpider(scrapy.Spider):
    name = 'bar'
      
    def start_requests(self):
        keywords=['Alsace']
        for keyword in keywords:
            amaz_url=f"https://www.pagesjaunes.fr/annuaire/region/{keyword}/reparateur-electromenager"
            yield scrapy.Request(url=amaz_url,callback=self.parse,meta={'keyword':keyword})

    def parse(self,response):
        keyword=response.meta['keyword']
        for link in response.xpath("//li[@class='bi bi-generic']//@id"):
            yield{
                'id':link
                }


wpcxdonn  #1

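You are iterating over the selectors rather than over their values, so each yielded 'id' is a Selector object instead of a string.
Change this line:

for link in response.xpath("//li[@class='bi bi-generic']//@id"):

to:

for link in response.xpath("//li[@class='bi bi-generic']//@id").getall():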


Source: Scrapy documentation
