I'm trying to scrape multiple author names with Scrapy in Python, but because the inner divs and the CSS class differ from author to author, I get this error: AttributeError: 'SelectorList' object has no attribute 'response'
class NewsSpider(scrapy.Spider):
    name = "travelandleisure"

    def start_requests(self):
        url = input("Enter the article url: ")
        yield scrapy.Request(url, callback=self.parse_dir_contents)

    def parse_dir_contents(self, response):
        try:
            Authoro = response.css('div.comp mntl-bylines__group--author mntl-bylines__group mntl-block')
            Author = []
            for item in Authoro.response.css('div.comp mntl-bylines__item mntl-attribution__item::text'):
                Authoro.append(item)
            for item in Authoro.response.css('div.comp mntl-bylines__item mntl-attribution__item mntl-attribution__item--has-date::text'):
                Authoro.append(item)
        except IndexError:
            Author = "NULL"
        yield {
            'Category': Category,
            'Headlines': Headlines,
            'Author': Author,
        }
Here is the link to the site; see the HTML code for the authors: https://www.travelandleisure.com/travel-news/where-can-americans-travel-right-now-a-country-by-country-guide
1 Answer
Here is one way to get the authors from that page:
Scrapy documentation: https://docs.scrapy.org/en/latest/