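I am trying to log in to a web page with scrapy-playwright because I want the login to integrate well with Scrapy. The login fails with scrapy-playwright: after submitting, I am redirected to a page that does not exist. I also tried sending a POST request instead of clicking the button, and that did not work either.
However, when I do the same thing with plain Playwright, it works perfectly... Is the site opened differently by scrapy-playwright than by Playwright alone? Does anyone know how to solve this with scrapy-playwright?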
scrapy-playwright code:
import os

import scrapy
from scrapy_playwright.page import PageMethod

# Spider methods (the enclosing class is not shown in the post)
def start_requests(self):
    yield scrapy.Request(
        url = self.url,
        meta = dict(
            playwright = True,
            playwright_include_page = True,
            # Wait until the account dropdown is present before the callback runs
            playwright_page_methods = [PageMethod('wait_for_selector', 'a[data-toggle=dropdown]')],
        ),
        callback = self.sign_in,
    )

async def sign_in(self, response):
    page = response.meta['playwright_page']
    # Dismiss any news pop-ups that are still visible
    while await page.is_visible("button[class='close close-news']"):
        await page.click("button[class='close close-news']")
    # Decline the cookie consent banner
    await page.click('button#declineAllConsentSummary')
    # Open the account dropdown, fill in the credentials, and submit
    await page.click('div.my-account-sub > a[data-toggle=dropdown]', timeout=10000)
    await page.fill('input#j_username_header', os.getenv(self.usernameKey), timeout=10000)
    await page.fill('input#j_password_header', os.getenv(self.passwordKey), timeout=10000)
    await page.click('button#responsiveMyAccLoginGA')
Playwright code:
import os

from playwright.async_api import async_playwright

async def test_async_playwright(self):
    async with async_playwright() as playwright:
        browser = await playwright.chromium.launch(headless=False)
        context = await browser.new_context(base_url=self.url)
        page = await context.new_page()
        await page.goto(self.url, wait_until='commit')
        # Dismiss any news pop-ups that are still visible
        while await page.is_visible("button[class='close close-news']"):
            await page.click("button[class='close close-news']")
        # Decline the cookie consent banner
        await page.click('button#declineAllConsentSummary')
        await page.wait_for_selector('a[data-toggle=dropdown]')
        # Open the account dropdown, fill in the credentials, and submit
        await page.click('div.my-account-sub > a[data-toggle=dropdown]', timeout=5000)
        await page.fill('input#j_username_header', os.getenv(self.usernameKey), timeout=5000)
        await page.fill('input#j_password_header', os.getenv(self.passwordKey), timeout=5000)
        await page.click('button#responsiveMyAccLoginGA')
1 answer
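As a possible workaround: if the redirect (to the broken page) only happens after the token/cookie has been granted, you can then navigate to the normal site URL yourself, and you should be logged in.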
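A minimal sketch of that workaround, assuming the session cookie is already set by the time the broken redirect finishes. The helper name `login_then_reload` and its parameters are hypothetical, not part of scrapy-playwright; `page` is expected to be a Playwright page such as `response.meta['playwright_page']`. Waiting for `"networkidle"` is only one guess at when the redirect is done and may need adjusting for the site in question.

```python
async def login_then_reload(page, site_url, submit_selector):
    """Submit the login form, then go back to a known-good URL.

    Hypothetical helper: `page` is assumed to behave like a Playwright
    Page (click, wait_for_load_state, goto, content).
    """
    # Submit the form; the site may redirect to a broken page afterwards.
    await page.click(submit_selector)
    # Let the redirect settle so the login cookies have a chance to be set.
    await page.wait_for_load_state("networkidle")
    # Ignore where the redirect landed and load the normal site URL instead.
    await page.goto(site_url)
    # Return the HTML of the page, which should now be the logged-in view.
    return await page.content()
```

In the `sign_in` callback from the question, this could replace the final click, for example `html = await login_then_reload(page, self.url, 'button#responsiveMyAccLoginGA')`, followed by `await page.close()` once the page is no longer needed.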