I am trying to scrape a dynamic web page using Scrapy and Playwright. I installed Scrapy and Playwright, but when I try to run my spider I get this error:
ImportError: cannot import name 'PageCoroutine' from 'scrapy_playwright.page' (C:\Ali\DataCamp\Web Scraping in Python\Scrapy\venv\lib\site-packages\scrapy_playwright\page.py)
Here is my code (it is just test code):
import scrapy
from scrapy_playwright.page import PageCoroutine


class PwspiderSpider(scrapy.Spider):
    name = 'pwspider'

    def start_requests(self):
        yield scrapy.Request("https://shoppable-campaign-demo.netlify.app/#/", meta=dict(playwright=True, playwright_include_page=True, playwright_page_coroutine=[PageCoroutine('wait_for_selector', 'div#productListing')]))

    async def parse(self, response):
        yield {'text': response.text}
I even added DOWNLOAD_HANDLERS and TWISTED_REACTOR to the settings file.
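For reference, the two settings mentioned above usually look like the following in settings.py. This is a sketch based on the values the scrapy-playwright README documents, not the asker's actual file, which is not shown in the post.

```python
# settings.py (sketch; the asker's real settings file is not shown in the post)

# Route both HTTP and HTTPS downloads through the scrapy-playwright handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright needs Twisted's asyncio-based reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
```

These settings only enable the download handler; they do not affect the ImportError, which is raised when the spider module is imported.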
1 Answer
PageCoroutine is deprecated. Use PageMethod with the playwright_page_methods meta key instead. Working code example:
Output: