Scrapy response is returning 'None' even with correct syntax

Posted by gudnpqoy on 2022-11-09 in Other

I am trying to put the name of each item into a dictionary, as follows:

import scrapy

class TerrorSpider(scrapy.Spider):
    name = 'terror'
    start_urls = ['http://books.toscrape.com/catalogue/category/books/travel_2/index.html']

    def parse(self, response):
        for filme in response.css('h3 a'):
            yield {
                'name': filme.css('h3 a::text').get()
            }

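I really don't know why it returns 'None' in the 'name' field (the request itself returns status code 200).
I would like to get data similar to what the following code produces: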

import scrapy

class ImdbSpider(scrapy.Spider):
    name = 'imdb'
    start_urls = ['https://www.imdb.com/chart/top/?ref_=nv_mv_250']

    def parse(self, response):
        for filmes in response.css('.titleColumn'):
            yield{
                'names' : filmes.css('.titleColumn a::text').get(),
                'years' : filmes.css('.secondaryInfo ::text').get()[1:-1],
                'notes' : response.css('strong ::text').get() 
            }

That one works fine, and the code has the same structure.

scrapy

Source: https://stackoverflow.com/questions/74269338/scrapy-response-is-returning-none-even-with-correct-syntax

1 Answer

eit6fx6z1#

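You are applying a selector to an element that you have already selected. In other words, all you need to do is call .css('::text').get() on the filme variable. There is no need to repeat the h3 and a tags, because the outer selector has already matched them. Inside the loop, filme is the a element itself, so a nested 'h3 a' query looks for an h3 at or below that a element, finds none, and .get() returns None.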

import scrapy

class TerrorSpider(scrapy.Spider):
    name = 'terror'
    start_urls = ['http://books.toscrape.com/catalogue/category/books/travel_2/index.html']

    def parse(self, response):
        for filme in response.css('h3 a'):
            yield {
                'name': filme.css('::text').get()
            }

You can also do the following:

...

def parse(self, response):
    for filme in response.css('h3 a::text').getall():
        yield {'name': filme}

Or even:

def parse(self, response):
    yield from ({'name': filme} for filme in response.css('h3 a::text').getall())
