Is there any way to set a custom format structure for the JSON exported from Scrapy? If so, how?

vsikbqxv posted on 2022-11-09 in Other

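I am completely new to Python and Scrapy, and I find the Scrapy documentation hard to follow 😢😢. I built a spider for my school project, and it successfully scrapes the data I want, but the problem is the format of the JSON export. This is just a mock of what my code looks like: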

def parse_links(self, response):
    products = response.css('qwerty')
    for product in products:
        yield {
            'Title': response.xpath('/html/head/title/text()').get(),
            'URL': response.url,
            'Product': response.css('product').getall(),
            'Manufacturer': response.xpath('Manufacturer').getall(),
            'Description': response.xpath('Description').getall(),
            'Rating': response.css('rating').getall(),
        }

The JSON export looks like this:
[{"Title": "x", "URL": "https://y.com", "Product": ["a", "e"], "Manufacturer": ["b", "f"], "Description": ["c", "g"], "Rating": ["d", "h"]}]
To be precise this is how it looks now.
But I want the data exported in this format:
[{"Products": [{"Title":"x","URL":"https://y.com", "Links":[{"Product":"a","Manufacturer":"b","Description":"c","Rating":"d"},{"Product":"e","Manufacturer":"f","Description":"g","Rating":"h"}]}]}]
This is how I want the data.
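
For reference, the reshaping asked for here amounts to pairing the n-th entry of each list into one record. Below is a minimal sketch in plain Python, assuming the four lists are parallel (same length, same order). The function name `regroup` is made up for illustration and is not part of Scrapy.

```python
import json

def regroup(flat_items):
    """Turn flat items (parallel lists) into the nested "Products"/"Links" layout."""
    products = []
    for item in flat_items:
        # zip() pairs the n-th entry of each list; it stops at the shortest list.
        links = [
            {"Product": p, "Manufacturer": m, "Description": d, "Rating": r}
            for p, m, d, r in zip(
                item["Product"], item["Manufacturer"],
                item["Description"], item["Rating"],
            )
        ]
        products.append({"Title": item["Title"], "URL": item["URL"], "Links": links})
    return [{"Products": products}]

# The flat export shown in the question.
flat = json.loads(
    '[{"Title": "x", "URL": "https://y.com", "Product": ["a", "e"], '
    '"Manufacturer": ["b", "f"], "Description": ["c", "g"], "Rating": ["d", "h"]}]'
)
nested = regroup(flat)
print(json.dumps(nested))
```

Run on the sample data, this should produce the nested structure shown above. The same function could also be applied to an already exported file by loading it with `json.load` first.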
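I tried a few things from the web, but nothing worked, and I couldn't find any explanatory documentation on the Scrapy site. What is provided is not easy to understand for a beginner like me, as I said earlier, so any help would be great. I built the scraper quite easily, but I have been stuck on this for a day. FYI, I am not using any custom pipelines or items.
Thank you in advance, and have a nice day.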

mgdq6dx1 #1

Try building the JSON like this:

def parse_links(self, response):
    products = response.css('qwerty')
    for product in products:
        # getall() returns lists, so "Links" holds one dict whose values are lists.
        links = [{
            "Product": response.css('product').getall(),
            "Manufacturer": response.xpath('Manufacturer').getall(),
            "Description": response.xpath('Description').getall(),
            "Rating": response.css('rating').getall(),
        }]
        title_dict = {
            "Title": response.xpath('/html/head/title/text()').get(),
            "URL": response.url,  # use the page URL instead of a hard-coded one
            "Links": links,
        }
        # Scrapy's JSON export collects every yielded item into one outer array.
        yield {"Products": [title_dict]}
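
To get one entry per product inside "Links", the lists returned by `getall()` would still need to be combined row by row before yielding. The sketch below is one possible way to do that, not the only one. `build_item` and `FIELDS` are hypothetical names, and `zip_longest` is used on the assumption that a field may be missing for some products, in which case it is filled with `None`.

```python
from itertools import zip_longest

FIELDS = ("Product", "Manufacturer", "Description", "Rating")

def build_item(title, url, columns):
    """Hypothetical helper: `columns` maps each field name to its list of values."""
    # Walk the lists in lockstep; shorter lists are padded with None.
    rows = zip_longest(*(columns[f] for f in FIELDS), fillvalue=None)
    links = [dict(zip(FIELDS, row)) for row in rows]
    return {"Products": [{"Title": title, "URL": url, "Links": links}]}

# Inside the spider this might be called as follows
# (the selectors are the placeholders from the question):
#     yield build_item(
#         response.xpath('/html/head/title/text()').get(),
#         response.url,
#         {
#             "Product": response.css('product').getall(),
#             "Manufacturer": response.xpath('Manufacturer').getall(),
#             "Description": response.xpath('Description').getall(),
#             "Rating": response.css('rating').getall(),
#         },
#     )

# Stand-alone demo with one "Description" missing.
item = build_item(
    "x",
    "https://y.com",
    {
        "Product": ["a", "e"],
        "Manufacturer": ["b", "f"],
        "Description": ["c"],
        "Rating": ["d", "h"],
    },
)
print(item)
```

Padding with `None` keeps the remaining fields aligned only when the missing values are at the end of a list. If values can be missing in the middle, extracting the fields per product element (relative to each `product` selector) would be the safer route.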
