Scrapy with query strings and variables

Posted by 91zkwejq on 2022-11-09 in Other

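I'm working on improving my Scrapy skills, and I've run into a new problem involving query strings and variables.
1) The query string appears to need two inputs (storeInRadiusQuery & cache): see the request headers with the API URL.
2) When I open Params, I see the 2 query strings grouped into a JSON object. This JSON has 3 keys (operationName, query and variables).
In my other Scrapy projects the query format was much simpler, but here I don't know how to handle the variables.
I tried Scrapy's FormRequest, without success: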

data = {
        "operationName":"storeInRadiusQuery",
        "variables":{"currentLocation":"50.4376478855132,2.82123986359978","service":[],"storeChain":[],"deliveryTypes":[],"date":[],"__typename":"storeLocatorFilters"},
        "query":"query storeInRadiusQuery($currentLocation: String!, $service: [String], $storeChain: [String], $deliveryTypes: [String], $date: [String]) {\n  viewer {\n    storesInRadius(currentLocation: $currentLocation, services: $service, storeChaine: $storeChain, deliveryTypes: $deliveryTypes, date: $date, radius: 20, isStoreLocator: true) {\n      source {\n        ...StoresMapStoreItemType\n        ...StoreLocatorList\n        store_location\n        sort\n        __typename\n      }\n      __typename\n    }\n    __typename\n  }\n}\n\nfragment StoreLocatorList on StoreItemType {\n  store_id\n  store_name\n  street\n  zip_code\n  city\n  seo_url\n  day_0\n  day_0_morning_open_time\n  day_0_morning_close_time\n  day_0_afternoon_open_time\n  day_0_afternoon_close_time\n  day_1\n  day_1_morning_open_time\n  day_1_morning_close_time\n  day_1_afternoon_open_time\n  day_1_afternoon_close_time\n  day_2\n  day_2_morning_open_time\n  day_2_morning_close_time\n  day_2_afternoon_open_time\n  day_2_afternoon_close_time\n  day_3\n  day_3_morning_open_time\n  day_3_morning_close_time\n  day_3_afternoon_open_time\n  day_3_afternoon_close_time\n  day_4\n  day_4_morning_open_time\n  day_4_morning_close_time\n  day_4_afternoon_open_time\n  day_4_afternoon_close_time\n  day_5\n  day_5_morning_open_time\n  day_5_morning_close_time\n  day_5_afternoon_open_time\n  day_5_afternoon_close_time\n  day_6\n  day_6_morning_open_time\n  day_6_morning_close_time\n  day_6_afternoon_open_time\n  day_6_afternoon_close_time\n  __typename\n}\n\nfragment StoresMapStoreItemType on StoreItemType {\n  store_id\n  store_name\n  store_location\n  zip_code\n  street\n  city\n  seo_url\n  __typename\n}\n"}

url = "https://www.monoprix.fr/api/graphql?storeInRadiusQuery&cache"

yield scrapy.FormRequest(url,
                         method='POST',
                         body=json.dumps(data),
                         headers={'Content-Type': 'application/json'},
                         callback=self.parse)

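I have read this post on how to handle query strings, but I don't understand how to put the query string into a dictionary correctly.
What I want to do here is vary the current location and radius parameters to get a list of stores.
If you have any ideas... thanks!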

pcww981p


The link below shows how to replicate a GraphQL request correctly: https://scrapfly.io/blog/web-scraping-graphql-with-python/
Doing this in Scrapy is similar to what the link describes:

import json

import scrapy

query = """
    Copy the query from the browser developer tools and paste it here.
    Remove any escaped newlines and format it properly.
"""

json_data = {
    "query": query,
    # Placeholder names and values; use the variables your query declares.
    "variables": {
        "variable1": "abc",
        "variable2": "abc",
        "variable3": "abc",
    },
}

# Inside a spider method; url is the GraphQL endpoint.
yield scrapy.Request(
    url=url,
    method="POST",
    body=json.dumps(json_data),
    headers={"Content-Type": "application/json"},
)
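Applied to the request from the question, the same pattern might look like the sketch below. It is not a tested scraper for that site. The query is a trimmed copy of the one in the question (the StoreLocatorList fragment is left out for brevity), and the `__typename` entry from the captured variables is omitted on the assumption that the server does not need it. The names `STORE_QUERY`, `API_URL`, `build_payload` and `make_request` are hypothetical helpers introduced for illustration. Because the radius appears as a literal (`radius: 20`) in the query text rather than as a variable, the sketch changes it by editing the query string; whether the server accepts other radius values is an assumption.

```python
import json

# Trimmed copy of the query from the question (StoreLocatorList fragment omitted).
STORE_QUERY = """
query storeInRadiusQuery($currentLocation: String!, $service: [String], $storeChain: [String], $deliveryTypes: [String], $date: [String]) {
  viewer {
    storesInRadius(currentLocation: $currentLocation, services: $service, storeChaine: $storeChain, deliveryTypes: $deliveryTypes, date: $date, radius: 20, isStoreLocator: true) {
      source {
        ...StoresMapStoreItemType
        store_location
        sort
        __typename
      }
      __typename
    }
    __typename
  }
}

fragment StoresMapStoreItemType on StoreItemType {
  store_id
  store_name
  store_location
  zip_code
  street
  city
  seo_url
  __typename
}
"""

# Endpoint taken from the question.
API_URL = "https://www.monoprix.fr/api/graphql?storeInRadiusQuery&cache"


def build_payload(latitude, longitude, radius=20):
    """Return the JSON-serialisable body for one storeInRadiusQuery call."""
    # The radius is a literal in the query text, so it is changed by
    # rewriting the query string (assumption: the server allows this).
    query = STORE_QUERY.replace("radius: 20", f"radius: {int(radius)}")
    return {
        "operationName": "storeInRadiusQuery",
        "query": query,
        "variables": {
            # The API expects "latitude,longitude" as a single string.
            "currentLocation": f"{latitude},{longitude}",
            "service": [],
            "storeChain": [],
            "deliveryTypes": [],
            "date": [],
        },
    }


def make_request(latitude, longitude, radius=20, callback=None):
    """Build the POST request; meant to be yielded from a spider method."""
    import scrapy  # imported here so build_payload can be used without Scrapy

    return scrapy.Request(
        url=API_URL,
        method="POST",
        body=json.dumps(build_payload(latitude, longitude, radius)),
        headers={"Content-Type": "application/json"},
        callback=callback,
    )
```

Inside a spider this could be used as `yield make_request("50.4376478855132", "2.82123986359978", radius=10, callback=self.parse)`. In the callback, `response.json()` returns the decoded body; GraphQL servers normally place results under a top-level `data` key, so the stores would be expected somewhere under `data.viewer.storesInRadius`, though the exact shape should be checked against a real response.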
