I am running Scrapy as a standalone script, like this:
if __name__ == "__main__":
    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    s = get_project_settings()
    process = CrawlerProcess(s)
    process.crawl(MySpider)
    process.start()
My scraper consumes a huge amount of memory, so I thought of using these two custom settings:
SCHEDULER_DISK_QUEUE = "scrapy.squeue.PickleFifoDiskQueue"
SCHEDULER_MEMORY_QUEUE = "scrapy.squeue.FifoMemoryQueue"
But after adding these two custom settings, running my standalone spider produces this error:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/twisted/internet/defer.py", line 1696, in _inlineCallbacks
result = context.run(gen.send, result)
File "/usr/local/lib/python3.9/dist-packages/scrapy/crawler.py", line 118, in crawl
yield self.engine.open_spider(self.spider, start_requests)
ModuleNotFoundError: No module named 'scrapy.squeue'
Do you know what is wrong here?
1 Answer
ModuleNotFoundError: No module named 'scrapy.squeue'

You have a typo: the queue classes live in the scrapy.squeues package (plural), not scrapy.squeue.
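The corrected settings look like this (a minimal sketch; these two class names are Scrapy's built-in FIFO queue implementations, and the rest of your script stays unchanged):

```python
# Corrected module path: the queue classes live in "scrapy.squeues"
# (plural), the same package Scrapy's default settings point at.
SCHEDULER_DISK_QUEUE = "scrapy.squeues.PickleFifoDiskQueue"
SCHEDULER_MEMORY_QUEUE = "scrapy.squeues.FifoMemoryQueue"
```

Note that the scheduler only uses the disk queue when a JOBDIR is configured; without one, requests stay in the memory queue regardless of SCHEDULER_DISK_QUEUE.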