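I am trying to scrape a JavaScript-driven website with Selenium. When Beautiful Soup parses what Selenium retrieved, I get an HTML page that says: "Cookies must be enabled to view this page." If anyone can help me get past this obstacle, I would greatly appreciate it. Here is my code: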
# import libraries and specify URL
import lxml
import pandas as pd
from bs4 import BeautifulSoup
import html5lib
from selenium import webdriver
import urllib.request
import csv
url = "https://racing.hkjc.com/racing/information/English/Racing/LocalResults.aspx?RaceDate=2020/06/09"
#new chrome session
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--incognito")
chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-blink-features=AutomationControlled")
driver = webdriver.Chrome(executable_path='/Users/susanwhite/PycharmProjects/Horse Racing/chromedriver', chrome_options=chrome_options)
# Set an implicit wait of up to 10 seconds for element lookups
driver.implicitly_wait(time_to_wait=10)
# Load the web page
driver.get(url)
cookies = driver.get_cookies()
# Parse HTML code and grab tables with Beautiful Soup
soup = BeautifulSoup(driver.page_source, 'html5lib')
print(soup)
2 Answers
Answer 1 (pgvzfuti)
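Try removing this line:
chrome_options.add_argument("--incognito")
There is no need for it, because by default Selenium does not save cookies or any other information from websites between sessions.
Answer 2 (r1wp621o)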
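Removing the line below solved the problem for me, but it disables headless mode, so the browser window will be visible.
chrome_options.add_argument("--headless")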
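The two suggestions above can be combined into one sketch: build the Chrome flags without --incognito and --headless, then check whether the returned page is still the cookie notice. The helper names (build_chrome_arguments, looks_like_cookie_wall, fetch_page_source) are hypothetical, the matched text "cookies must be enabled" is an assumption about the page's wording, and the driver call uses the Selenium 4 style (Service, options=) rather than the Selenium 3 keywords in the question. It has not been verified against the site itself.

```python
# Sketch only: helper names are illustrative and not from the original posts.

# Assumed wording of the notice page; adjust to the text the site actually returns.
COOKIE_WALL_TEXT = "cookies must be enabled"


def build_chrome_arguments(headless=False, incognito=False):
    """Return the Chrome command-line flags to add to ChromeOptions."""
    args = ["--disable-blink-features=AutomationControlled"]
    if incognito:
        args.append("--incognito")
    if headless:
        args.append("--headless")
    return args


def looks_like_cookie_wall(html):
    """Return True if the page source appears to be the 'cookies required' page."""
    return COOKIE_WALL_TEXT in html.lower()


def fetch_page_source(url, driver_path, headless=False):
    """Load url in Chrome and return the rendered HTML."""
    # Imported here so the helpers above work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    for arg in build_chrome_arguments(headless=headless):
        options.add_argument(arg)

    # Selenium 4 style; the question uses the older executable_path/chrome_options keywords.
    driver = webdriver.Chrome(service=Service(driver_path), options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()
```

To use it, call fetch_page_source(url, driver_path) with the URL and chromedriver path from the question, then pass the result to looks_like_cookie_wall before handing it to BeautifulSoup. If the check still returns True with a visible window, the cause is likely something other than these two flags.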