Scraping JSON data with Scrapy

x4shl7ld  posted on 2022-11-09  in  Other

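I am trying to scrape the website below, and I have got as far as fetching the response body. I would like to know how to access the other details in the response, such as name, rating, title, and description. How do I read those keys from the JSON?
Code: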

import scrapy
import json
from pprint import pprint

class nykacr(scrapy.Spider):
    name = 'nykaa'
    allowed_domains=['nykaa.com']
    start_urls = ["https://www.nykaa.com/gateway-api/products/683166/reviews?pageNo=1&filters=DEFAULT&domain=nykaa"]

    def parse(self,response):
        datas = json.loads(response.body)
z9smfwbn  #1

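You only need to get the reviewData field, which sits under the top-level response key, and iterate over it like a list.
For example: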

import scrapy

class nykacr(scrapy.Spider):
    name = 'nykaa'
    allowed_domains=['nykaa.com']
    start_urls = ["https://www.nykaa.com/gateway-api/products/683166/reviews?pageNo=1&filters=DEFAULT&domain=nykaa"]

    def parse(self,response):
        for item in response.json()["response"]["reviewData"]:
            yield {
                "id": item["id"],
                "childId": item["childId"],
                "title": item["title"],
                "description": item["description"],
                "name": item["name"],
                "createdOn": item["createdOn"],
                "reviewCreationText": item["reviewCreationText"],
                "likeCount": item["likeCount"],
                "rating": item["rating"],
                "isLikedByUser": item["isLikedByUser"],
                "isBuyer": item["isBuyer"],
            }
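
If some reviews lack a key, indexing with item["..."] raises KeyError. As a minimal sketch of a more tolerant variant, the helper below reads the same response -> reviewData structure with dict.get, so missing fields become None. The sample payload, the FIELDS tuple, and the extract_reviews name are assumptions made for illustration; only the nesting and the field names come from the answer above. Note that response.json() needs Scrapy 2.2 or later; on older versions, json.loads(response.text) should behave the same way.

```python
import json

# Hypothetical sample payload: only the "response" -> "reviewData" nesting
# and the field names are taken from the answer; the values are made up.
SAMPLE_BODY = json.dumps({
    "response": {
        "reviewData": [
            {"id": 1, "name": "Asha", "rating": 5,
             "title": "Great", "description": "Works well"},
            {"id": 2, "name": "Ravi", "rating": 3,
             "title": "Okay"},  # "description" deliberately missing
        ]
    }
})

# Subset of the fields used in the answer (illustrative choice).
FIELDS = ("id", "name", "rating", "title", "description")


def extract_reviews(body):
    """Yield one dict per review, using None for any missing field."""
    data = json.loads(body)  # accepts str or bytes
    reviews = data.get("response", {}).get("reviewData", [])
    for item in reviews:
        yield {field: item.get(field) for field in FIELDS}


reviews = list(extract_reviews(SAMPLE_BODY))
for review in reviews:
    print(review)
```

Inside the spider, parse could then be reduced to `yield from extract_reviews(response.body)`, assuming the live API returns the same structure.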
