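I want to scrape all the monitor items from the site https://www.startech.com.bd. The problem is that when I run my spider, it returns only 60 results. Here is my code, which does not work correctly: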
    import scrapy
    import time


    class StartechSpider(scrapy.Spider):
        name = 'startech'
        allowed_domains = ['startech.com.bd']
        start_urls = ['https://www.startech.com.bd/monitor/']

        def parse(self, response):
            monitors = response.xpath("//div[@class='p-item']")
            for monitor in monitors:
                item = monitor.xpath(".//h4[@class = 'p-item-name']/a/text()").get()
                price = monitor.xpath(".//div[@class = 'p-item-price']/span/text()").get()
                yield {
                    'item': item,
                    'price': price
                }

            next_page = response.xpath("//ul[@class = 'pagination']/li/a/@href").get()
            print(next_page)
            if next_page:
                yield response.follow(next_page, callback=self.parse)
Any help is much appreciated!
1 Answer
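//ul[@class = 'pagination']/li/a/@href
matches every link in the pagination list at once (10 items/pages), so .get() returns only the first of them. You need to select exactly one link: the one that points to the next page. The following XPath expression gets the correct pagination. Code: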
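To illustrate the difference between "the first pagination link" and "the next-page link", here is a minimal standard-library sketch. The SAMPLE markup, the 'NEXT' link label, and the helper names first_link and next_link are all assumptions made for the illustration; they are not copied from the site or from the answer. In the spider itself, an equivalent selector might look like //ul[@class = 'pagination']/li/a[contains(text(), 'NEXT')]/@href, provided the site really labels that link "NEXT".

```python
import xml.etree.ElementTree as ET

# Hypothetical pagination fragment as seen on page 2; the real markup may differ.
SAMPLE = """
<ul class="pagination">
  <li><a href="/monitor?page=1">PREV</a></li>
  <li><a href="/monitor?page=1">1</a></li>
  <li><span>2</span></li>
  <li><a href="/monitor?page=3">3</a></li>
  <li><a href="/monitor?page=3">NEXT</a></li>
</ul>
"""


def first_link(pagination):
    """Mimic the original selector plus .get(): take the first href in the list."""
    link = pagination.find("./li/a")
    return link.get("href") if link is not None else None


def next_link(pagination, label="NEXT"):
    """Return the href of the link whose text is the assumed next-page label."""
    for link in pagination.findall("./li/a"):
        if (link.text or "").strip().upper() == label:
            return link.get("href")
    return None  # no next-page link: last page reached


root = ET.fromstring(SAMPLE)
print(first_link(root))  # the PREV link, which leads back to a page already seen
print(next_link(root))   # the link that actually advances the crawl
```

With the first-link approach the spider can end up following a link back to a page it has already visited, which Scrapy's duplicate filter then drops, so the crawl stops early. Selecting by the link's label (or by a class on the next-page element, if the site provides one) avoids that.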
Output: