I wrote a Python Selenium script to filter out URLs that contain specific elements. Almost everything works fine, but I keep getting unhandled exceptions that stop my script:
selenium.common.exceptions.WebDriverException: Message: unknown error: net::ERR_NAME_NOT_RESOLVED
and
selenium.common.exceptions.WebDriverException: Message: unknown error: net::ERR_CONNECTION_TIMED_OUT
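I tried handling this inside the loop. That works for other exceptions, such as NoSuchElementException, but the problem is with WebDriverException, and I can't solve it. I also added `continue` to the loop, but that failed too. I am reading the list of URLs from a CSV file.
Here is my code: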
from logging import exception
from time import sleep
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException
from selenium.common.exceptions import WebDriverException
import time
from fake_useragent import UserAgent
import csv
import selenium.common.exceptions

options = Options()
ua = UserAgent()
userAgent = ua.random
options.add_argument(f'user-agent={userAgent}')
options.add_argument("--headless")

def csv_url_reader(url_obj):
    reader = csv.DictReader(url_obj, delimiter=',')
    for line in reader:
        rawUrls = line["URL"]
        print(rawUrls)
        chromedriver = ("chromedriver")
        driver = webdriver.Chrome(chromedriver, options=options)
        driver.set_window_size(1920, 1080)
        driver.get(rawUrls)
        driver.execute_script("window.scrollTo(0,document.body.scrollHeight)")
        try:
            name = driver.find_element(By.ID, 'author')
            email = driver.find_element(By.ID, 'email')
            print("PASSED, ALL REQUIRED ELEMENTS FOUND")
            filterAll = driver.current_url
            with open("HAS_ALL_ELEMENTS.txt", "a") as r:
                print(filterAll, file=r)
        except WebDriverException or NoSuchElementException or Exception:
            #print('Exception:',exception)
            print("NONE OF THE ELEMENTS FOUND, ERROR!")
            nothingFound = driver.current_url
            with open("NO_ELEMENTS.txt", "a") as n:
                print(nothingFound, file=n)
            continue

if __name__ == "__main__":
    with open ("RAW_URLs.csv") as url_obj:
        reader = csv.reader(url_obj)
        csv_url_reader(url_obj)
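What I really want is an unstoppable script: if any exception occurs, it should skip that URL, move on to the next one, and keep running. I tried many of the solutions available on Stack Overflow, but none of them worked for me.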
1 Answer
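My guess is that the WebDriverException is raised by the driver.get(rawUrls) line, which sits outside the try block and is therefore never caught. Move that line inside the try/except block.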
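A minimal sketch of that change, under some assumptions: the helper names `has_required_elements` and `filter_urls` are hypothetical, the driver is passed in as a parameter, and the locator strategy is written as the string `"id"` (the value of `By.ID`) so the sketch needs no Selenium import. It catches `Exception`, which also covers `WebDriverException` and its subclass `NoSuchElementException`.

```python
def has_required_elements(driver, url):
    """Return True if the page at `url` has both the 'author' and 'email' elements.

    `driver` is expected to behave like a Selenium WebDriver.
    """
    try:
        # Navigation is now inside the try block, so errors such as
        # net::ERR_NAME_NOT_RESOLVED are caught instead of ending the script.
        driver.get(url)
        driver.find_element("id", "author")  # "id" is the value of By.ID
        driver.find_element("id", "email")
        return True
    except Exception:
        # Any failure (navigation error, missing element, ...) means "skip this URL".
        return False


def filter_urls(driver, urls):
    """Split `urls` into (found, missing) without stopping on errors."""
    found, missing = [], []
    for url in urls:
        if has_required_elements(driver, url):
            found.append(url)
        else:
            missing.append(url)
    return found, missing
```

The URL read from the CSV is recorded rather than `driver.current_url`, since the latter may not be meaningful after a failed navigation. Reusing a single driver for the whole loop, instead of creating one per URL as in the question, is a further assumption of this sketch. Note also that `except WebDriverException or NoSuchElementException or Exception` names only `WebDriverException`, because `or` returns its first truthy operand; listing several exception types requires a tuple, for example `except (WebDriverException, NoSuchElementException):`.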