Scrapy CSS selector returns an empty list

Posted by fykwrbwg on 2022-11-09 in Other

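Hi, I'm new to Scrapy and to web scraping in general, and I'm having a hard time scraping this site: https://www.webuycars.co.za/buy-a-car
My goal is to scrape car data such as name and price from the page.
I started with: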

scrapy shell "https://www.webuycars.co.za/buy-a-car"

Then I ran:

fetch("http://localhost:8050/render.html?url=https://www.webuycars.co.za/buy-a-car")

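I'm using Splash with Scrapy because I concluded that the page is built with JavaScript. I then tried a few selectors, but past a certain point in the page's HTML (the part I assume is generated by JavaScript) I start getting empty results. For example: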

response.css("div.col-lg-3.col-md-4.col-sm-6.mt-3").getall()
[]
response.css("div.result-item-title").getall() 
[]
response.css("div.result-item-title").get()
response.css(".result-item-title").getall()
[]

Getting the title seems to work, but nothing else I have tried does:

response.css("title::text").get()
'WeBuyCars | Sell Cars For Cash | Free Online Vehicle Valuations'

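I have been running these queries in the shell to make sure I get results before writing the spider and integrating it into my program. I set my user agent in the settings file. I also looked through all of the page's sources for a JSON file containing the data I need, but found none. I'm not sure what else to try. I have been stuck on this for a long time and would appreciate any help.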

monwx1rj (answer #1)

You can get all of the data from the API response:

import json
import scrapy

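class CarsSpider(scrapy.Spider):
    """Fetch car listings from the site's JSON search API."""

    name = 'car'
    # Search payload posted to the API endpoint.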
    body = {"to":24,"size":24,"type":"All","filter_type":"all","subcategory":None,"q":"","Make":None,"Roadworthy":None,"Auctions":[],"Model":None,"Variant":None,"DealerKey":None,"FuelType":None,"BodyType":None,"Gearbox":None,"AxleConfiguration":None,"Colour":None,"FinanceGrade":None,"Priced_Amount_Gte":0,"Priced_Amount_Lte":0,"MonthlyInstallment_Amount_Gte":0,"MonthlyInstallment_Amount_Lte":0,"auctionDate":None,"auctionEndDate":None,"auctionDurationInSeconds":None,"Kilometers_Gte":0,"Kilometers_Lte":0,"Priced_Amount_Sort":"","Bid_Amount_Sort":"","Kilometers_Sort":"","Year_Sort":"","Auction_Date_Sort":"","Auction_Lot_Sort":"","Year":[],"Price_Update_Date_Sort":"","Online_Auction_Date_Sort":"","Online_Auction_In_Progress":""}

    def start_requests(self):
        # POST the search payload as a JSON string to the API endpoint.
        yield scrapy.Request(
            url='https://website-elastic-api.webuycars.co.za/api/search',
            callback=self.parse,
            body=json.dumps(self.body),
            method="POST")

    def parse(self, response):
        # The endpoint returns JSON; the records are under the 'data' key.
        payload = json.loads(response.body)

        for record in payload['data']:
            yield {
                'Title': record['OnlineDescription']
            }
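
The question also asks for fields such as the price. Below is a minimal sketch of pulling several fields out of one page of the API response, assuming the response is a JSON object with a `data` list as in the spider above. Only `OnlineDescription` is confirmed by this answer; the `Price` and `Year` keys are hypothetical placeholders and should be replaced with the keys actually present in the response.

```python
import json

# Maps output column -> key in each API record.
# 'OnlineDescription' comes from the answer; 'Price' and 'Year' are
# hypothetical key names used only for illustration.
FIELD_MAP = {
    "Title": "OnlineDescription",
    "Price": "Price",
    "Year": "Year",
}


def extract_items(raw_body, field_map=FIELD_MAP):
    """Return one dict per record, using None for keys that are absent."""
    payload = json.loads(raw_body)
    items = []
    for record in payload.get("data", []):
        items.append({out: record.get(src) for out, src in field_map.items()})
    return items


if __name__ == "__main__":
    sample = json.dumps({"data": [{"OnlineDescription": "2020 Datsun GO 1.2 MID"}]})
    print(extract_items(sample))
```

Using `.get` rather than indexing means a record that lacks one of the keys yields `None` for that column instead of raising `KeyError`. Inside the spider, `parse` could yield each dict returned by `extract_items(response.body)`.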

Output from running the spider:

{'Title': '2022 Citroen C3 Aircross 1.2T Puretech Sine Auto'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2022 Toyota Hilux 2.4 Gd-6 RB Raider Pick Up Double Cab'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2020 Datsun GO 1.2 MID'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2013 Hyundai i10 1.25 Gls/fluid Auto'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2020 Suzuki S-Presso 1.0 GL+ AMT'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2019 SYM Symphony JET 14 200'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2019 Nissan Micra 1.2 Active Visia'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2021 Suzuki Super Carry 1.2i Pick Up Single Cab'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2022 Suzuki AN UB 125 (burgman)'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2022 Honda XRL XR 125l'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2022 Toyota Hilux 2.4 Gd-6 RB Raider Pick Up Double Cab'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2022 Land Rover Defender 110 D300 SE X-Dynamic (221 KW)'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2020 Suzuki S-Presso 1.0 GL'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2022 Big Boy TSR 250'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2022 Hyundai Atos/Atoz 1.1 Motion AMT'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2019 Fiat Panda 900t Lounge'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2017 Chevrolet Spark 1.2 Campus/curve 5-Door'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2020 Crosby Adventure Bike 400cc'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2022 Renault Kwid 1.0 Climber 5-Door'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2019 Suzuki Swift 1.2 GLX'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2022 Volkswagen Polo Classic GP 1.4 Comfortline'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2020 Renault Kwid 1.0 Climber 5-Door Auto'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2022 SYM Crox X-Pro 125'}
2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
{'Title': '2019 Yamaha YZ 450 FX'}
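
The request in the spider returns a single page of results. Below is a hedged sketch of building payloads for further pages. It assumes `size` is the page length and `to` is the end offset of the requested window; this is inferred only from `"to": 24, "size": 24` in the payload, not from any API documentation, so check it against the requests the site itself sends. The helper name `page_payload` is made up for this example.

```python
import copy


def page_payload(base_body, page, size=24):
    """Return a copy of base_body that requests the given zero-based page.

    Assumption: 'to' is the end offset of the result window, so page 0
    uses to=24, page 1 uses to=48, and so on.
    """
    body = copy.deepcopy(base_body)  # leave the shared payload untouched
    body["size"] = size
    body["to"] = (page + 1) * size
    return body


if __name__ == "__main__":
    base = {"to": 24, "size": 24, "type": "All", "q": ""}
    for page in range(3):
        print(page_payload(base, page)["to"])
```

In the spider, each payload would be passed through `json.dumps` as the `body` of another POST request, for example until a response comes back with an empty `data` list.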
