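I'm trying to get better at Scrapy, and I've run into a new problem involving query strings and variables.
1) The query string seems to take two inputs (storeInRadiusQuery & cache): Here is the request headers with the API url
2) When I open Params, I see the 2 query strings grouped in JSON format. That JSON has 3 keys (operationName, query and variables).
In my other Scrapy projects the query format was much simpler, but here I don't know how to handle the variables.
I tried Scrapy's FormRequest approach, without success: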
data = {
"operationName":"storeInRadiusQuery",
"variables":{"currentLocation":"50.4376478855132,2.82123986359978","service":[],"storeChain":[],"deliveryTypes":[],"date":[],"__typename":"storeLocatorFilters"},
"query":"query storeInRadiusQuery($currentLocation: String!, $service: [String], $storeChain: [String], $deliveryTypes: [String], $date: [String]) {\n viewer {\n storesInRadius(currentLocation: $currentLocation, services: $service, storeChaine: $storeChain, deliveryTypes: $deliveryTypes, date: $date, radius: 20, isStoreLocator: true) {\n source {\n ...StoresMapStoreItemType\n ...StoreLocatorList\n store_location\n sort\n __typename\n }\n __typename\n }\n __typename\n }\n}\n\nfragment StoreLocatorList on StoreItemType {\n store_id\n store_name\n street\n zip_code\n city\n seo_url\n day_0\n day_0_morning_open_time\n day_0_morning_close_time\n day_0_afternoon_open_time\n day_0_afternoon_close_time\n day_1\n day_1_morning_open_time\n day_1_morning_close_time\n day_1_afternoon_open_time\n day_1_afternoon_close_time\n day_2\n day_2_morning_open_time\n day_2_morning_close_time\n day_2_afternoon_open_time\n day_2_afternoon_close_time\n day_3\n day_3_morning_open_time\n day_3_morning_close_time\n day_3_afternoon_open_time\n day_3_afternoon_close_time\n day_4\n day_4_morning_open_time\n day_4_morning_close_time\n day_4_afternoon_open_time\n day_4_afternoon_close_time\n day_5\n day_5_morning_open_time\n day_5_morning_close_time\n day_5_afternoon_open_time\n day_5_afternoon_close_time\n day_6\n day_6_morning_open_time\n day_6_morning_close_time\n day_6_afternoon_open_time\n day_6_afternoon_close_time\n __typename\n}\n\nfragment StoresMapStoreItemType on StoreItemType {\n store_id\n store_name\n store_location\n zip_code\n street\n city\n seo_url\n __typename\n}\n"}
url = "https://www.monoprix.fr/api/graphql?storeInRadiusQuery&cache"
yield scrapy.FormRequest(url,
                         method='POST',
                         body=json.dumps(data),
                         headers={'Content-Type': 'application/json'},
                         callback=self.parse)
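I have read this post on how to handle query strings, but I don't know how to put the query string into a dictionary correctly.
What I want to do here is change the current location and radius parameters to get a list of stores.
If you have any ideas... thanks!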
1 Answer
pcww981p1 #1
The link below shows how to replicate a GraphQL request correctly: https://scrapfly.io/blog/web-scraping-graphql-with-python/
Doing this in Scrapy works much the same way as described at that link.
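As a rough illustration of how the location and radius from the question could be varied, here is a minimal sketch. The names `build_payload` and `build_scrapy_request` are hypothetical helpers, and `QUERY` is a shortened stand-in for the full query string from the question. In the original query the radius is a hard-coded literal (`radius: 20`) rather than a variable, so the sketch simply rewrites that literal in the query text. Whether the server accepts a modified query is an assumption that has not been verified.

```python
import json

API_URL = "https://www.monoprix.fr/api/graphql?storeInRadiusQuery&cache"

# Shortened stand-in for the query in the question (assumption: in real use,
# paste the complete query string, including both fragments).
QUERY = (
    "query storeInRadiusQuery($currentLocation: String!, $service: [String], "
    "$storeChain: [String], $deliveryTypes: [String], $date: [String]) {\n"
    "  viewer {\n"
    "    storesInRadius(currentLocation: $currentLocation, services: $service, "
    "storeChaine: $storeChain, deliveryTypes: $deliveryTypes, date: $date, "
    "radius: 20, isStoreLocator: true) {\n"
    "      source { store_id store_name city }\n"
    "    }\n"
    "  }\n"
    "}\n"
)


def build_payload(lat, lon, radius=20, query=QUERY):
    """Return the GraphQL JSON payload for a given location and radius."""
    return {
        "operationName": "storeInRadiusQuery",
        "variables": {
            # The API expects "latitude,longitude" as a single string.
            "currentLocation": f"{lat},{lon}",
            "service": [],
            "storeChain": [],
            "deliveryTypes": [],
            "date": [],
            "__typename": "storeLocatorFilters",
        },
        # The radius is a literal inside the query text, so swap it there.
        "query": query.replace("radius: 20", f"radius: {radius}"),
    }


def build_scrapy_request(payload, callback=None):
    """Wrap the payload in a Scrapy request (requires Scrapy to be installed)."""
    # Imported here so build_payload can be used and tested without Scrapy.
    from scrapy.http import JsonRequest

    # JsonRequest serializes `data` to JSON, sets the JSON Content-Type header
    # and defaults to POST when `data` is given.
    return JsonRequest(API_URL, data=payload, callback=callback)


if __name__ == "__main__":
    print(json.dumps(build_payload(50.4376478855132, 2.82123986359978, radius=10), indent=2))
```

Inside a spider, something like `yield build_scrapy_request(build_payload(lat, lon, radius), callback=self.parse)` would send the request, and the callback could then read the result with `json.loads(response.text)`.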