Websites behave differently with scrapy-playwright than with Playwright alone

Posted by kkbh8khc on 2022-11-23 in Other


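I am trying to log in to a web page using scrapy-playwright, because I want good integration with Scrapy. I cannot log in with scrapy-playwright: after submitting the form I am redirected to a page that does not exist. I also tried sending a POST request instead of clicking the button, and that does not work either.
However, if I do the same thing using only Playwright, it works perfectly. Is a website opened differently by scrapy-playwright than by Playwright alone? Does anyone know how to solve this with scrapy-playwright?
scrapy-playwright code: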

import os

import scrapy
from scrapy_playwright.page import PageMethod

# The two functions below are methods of the spider class.
def start_requests(self):
    yield scrapy.Request(
        url=self.url,
        meta=dict(
            playwright=True,
            playwright_include_page=True,  # expose the Page object to the callback
            # Wait for the account dropdown before the callback runs.
            playwright_page_methods=[
                PageMethod('wait_for_selector', 'a[data-toggle=dropdown]'),
            ],
        ),
        callback=self.sign_in,
    )

async def sign_in(self, response):
    page = response.meta['playwright_page']
    # Dismiss any news pop-ups that are still visible.
    while await page.is_visible("button[class='close close-news']"):
        await page.click("button[class='close close-news']")
    # Decline the cookie consent banner.
    await page.click('button#declineAllConsentSummary')
    # Open the account dropdown, fill in the credentials, and submit.
    await page.click('div.my-account-sub > a[data-toggle=dropdown]', timeout=10000)
    await page.fill('input#j_username_header', os.getenv(self.usernameKey), timeout=10000)
    await page.fill('input#j_password_header', os.getenv(self.passwordKey), timeout=10000)
    await page.click('button#responsiveMyAccLoginGA')

Playwright-only code:

import os

from playwright.async_api import async_playwright

async def test_async_playwright(self):
    async with async_playwright() as playwright:
        # Headed Chromium with a fresh browser context.
        browser = await playwright.chromium.launch(headless=False)
        context = await browser.new_context(base_url=self.url)
        page = await context.new_page()

        await page.goto(self.url, wait_until='commit')
        # Dismiss any news pop-ups that are still visible.
        while await page.is_visible("button[class='close close-news']"):
            await page.click("button[class='close close-news']")
        # Decline the cookie consent banner.
        await page.click('button#declineAllConsentSummary')
        # Open the account dropdown, fill in the credentials, and submit.
        await page.wait_for_selector('a[data-toggle=dropdown]')
        await page.click('div.my-account-sub > a[data-toggle=dropdown]', timeout=5000)
        await page.fill('input#j_username_header', os.getenv(self.usernameKey), timeout=5000)
        await page.fill('input#j_password_header', os.getenv(self.passwordKey), timeout=5000)
        await page.click('button#responsiveMyAccLoginGA')
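
One difference visible between the two snippets is that the Playwright-only version launches with headless=False, while the scrapy-playwright version relies on whatever the project settings specify. A second possible difference, which is an assumption and not something confirmed for this site, is request headers: by default scrapy-playwright replaces the browser's headers with Scrapy's own (including Scrapy's User-Agent), and some sites respond differently to that. The following is a minimal sketch of settings that bring the scrapy-playwright run closer to the Playwright-only one. The name custom_settings and the choice of Chromium are illustrative; the setting keys are the ones documented by scrapy-playwright.

```python
# Hypothetical settings sketch: place in settings.py, or assign to the
# spider's custom_settings attribute. Not verified against the site in question.
custom_settings = {
    # Route HTTP(S) downloads through scrapy-playwright.
    "DOWNLOAD_HANDLERS": {
        "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
        "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    },
    # scrapy-playwright requires the asyncio-based Twisted reactor.
    "TWISTED_REACTOR": "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
    # Match the Playwright-only script: headed Chromium.
    "PLAYWRIGHT_BROWSER_TYPE": "chromium",
    "PLAYWRIGHT_LAUNCH_OPTIONS": {"headless": False},
    # Assumption: let the browser send its own headers instead of Scrapy's.
    "PLAYWRIGHT_PROCESS_REQUEST_HEADERS": None,
}
```

If the login still redirects to a missing page with these settings, headers and headless mode are probably not the cause, and the remaining differences (for example, the order in which the selector wait and the pop-up dismissal run) are worth comparing next.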


Source: https://stackoverflow.com/questions/72375388/websites-using-scrapy-playwright-and-only-playwright-work-differently