Can't send keys using multithreading with Selenium

Posted by wlzqhblo on 2023-01-17 in Other


I am trying to use a multithreading strategy with Selenium. In short, I am trying to fill an input field with ids.
This is my script:

from concurrent.futures import ThreadPoolExecutor
from selenium.webdriver.common.by import By
import numpy as np
import sys
from selenium import webdriver

def driver_setup():
    path = "geckodriver.exe"
    options = webdriver.FirefoxOptions()
    options.add_argument('--incognito')
    # options.add_argument('--headless')
    driver = webdriver.Firefox(options=options, executable_path=path)
    return driver

def fetcher(id, driver):
    print(id) #this works
    
    # this doesnt work
    driver.get(
        "https://www.roboform.com/filling-test-all-fields")
    driver.find_element(By.XPATH, '//input[@name="30_user_id"]').send_keys(id)
    time.sleep(2)
    print(i, " sent")
    #return data

def crawler(ids):
    for id in ids:
        print(i)
        results = fetcher(id, driver_setup())

drivers = [driver_setup() for _ in range(4)]

ids = list(range(0,50)) # generates ids
print(ids)
chunks = np.array_split(np.array(ids),4) #splits the id list into 4 chunks

with ThreadPoolExecutor(max_workers=4) as executor:
    bucket = executor.map(crawler, chunks)
    #results = [item for block in bucket for item in block]

[driver.quit() for driver in drivers]

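Everything works except the send_keys method. Both print() calls work, so the ids do seem to reach both functions. Oddly, I get no error message (PyCharm only reports that the process finished with exit code 0), so I don't know what I'm doing wrong.
Any idea what is missing?
I used this example, in case it helps: https://blog.devgenius.io/multi-threaded-web-scraping-with-selenium-dbcfb0635e83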

selenium

Source: https://stackoverflow.com/questions/75127220/cant-send-keys-using-multithreading-with-selenium

2 Answers


2ul0zpep1#

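When using threading, watch out for exceptions: the executor captures them and only re-raises them when you read the results. For example, change your code to the tweaked version below (do not change any other lines):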

with ThreadPoolExecutor(max_workers=4) as executor:
    bucket = executor.map(crawler, chunks)
    # executor.map returns a lazy iterator of results; iterating over it re-raises worker exceptions
    for e_buck in bucket:  # added for the demo
        print(e_buck)

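Running it surfaces exceptions such as:

1. i is not defined: see the print(i, " sent") statement in fetcher and the print(i) statement in crawler.
2. Once you fix that, the next error is in the send_keys(id) call: id is of type numpy.int64. Cast it to a string with str(), i.e. send_keys(str(id)).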
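The type problem in item 2 comes from the chunking step: np.array_split returns NumPy arrays, so each id is a NumPy integer scalar rather than a plain int. As an alternative to casting at the call site, here is a minimal sketch of a pure-Python splitter that keeps the ids as ordinary int values. The helper name split_into_chunks is hypothetical and not part of the original code; it only aims to produce the same chunk sizes as np.array_split for a flat list.

```python
def split_into_chunks(items, n):
    """Split a list into n nearly equal chunks, keeping the original element types.

    The first len(items) % n chunks receive one extra element.
    """
    base_size, extra = divmod(len(items), n)
    chunks = []
    start = 0
    for index in range(n):
        size = base_size + (1 if index < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    ids = list(range(0, 50))
    chunks = split_into_chunks(ids, 4)
    print([len(chunk) for chunk in chunks])
    print(type(chunks[0][0]))
```

Calling str(id) before send_keys is still a reasonable safeguard, since it works for either kind of integer.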
So after the fixes your code will look like this:

from concurrent.futures import ThreadPoolExecutor
import time

import numpy as np
from selenium import webdriver
from selenium.webdriver.common.by import By

def driver_setup():
    path = "geckodriver.exe"
    options = webdriver.FirefoxOptions()
    options.add_argument('--incognito')
    # options.add_argument('--headless')
    driver = webdriver.Firefox(options=options, executable_path=path)
    return driver

def fetcher(id, driver):
    print(id) #this works
    
    # this doesnt work - it will work now :)
    driver.get(
        "https://www.roboform.com/filling-test-all-fields")
    driver.find_element(By.XPATH, '//input[@name="30_user_id"]').send_keys(str(id))
    time.sleep(2)
    print(id, " sent")
    #return data

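def crawler(ids):
    driver = driver_setup()  # one browser per worker, reused for every id in the chunk
    try:
        for id in ids:
            print(id)
            fetcher(id, driver)
    finally:
        driver.quit()  # close the browser even if an id fails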

#drivers = [driver_setup() for _ in range(4)]

ids = list(range(0,50)) # generates ids
print(ids)
chunks = np.array_split(np.array(ids),4) #splits the id list into 4 chunks

with ThreadPoolExecutor(max_workers=4) as executor:
    bucket = executor.map(crawler, chunks)
    # executor.map returns a lazy iterator of results; iterating over it re-raises worker exceptions
    for e_buck in bucket:  # added for the demo
        print(e_buck)  # crawler returns nothing, so this prints None once the bugs are fixed
        # On the first run this raised: i is not defined (see print(i, " sent") in fetcher and print(i) in crawler).
        # After fixing that, the next error was in send_keys(id): id is a numpy.int64, so cast it with send_keys(str(id)).
    #results = [item for block in bucket for item in block]

#[driver.quit() for driver in drivers]
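
The reason the original script exited with code 0 and no traceback can be reproduced without a browser. The sketch below uses a stand-in crawler that always fails (the explicit RuntimeError is an assumption for illustration, replacing the NameError from the question) and compares calling executor.map without reading its results against iterating over them.

```python
from concurrent.futures import ThreadPoolExecutor


def crawler(ids):
    # Stand-in for the real crawler: it fails on the first id it sees.
    for id_ in ids:
        raise RuntimeError(f"could not process id {id_}")


def run_without_consuming(chunks):
    """Call map() but never read the results: the worker error stays hidden."""
    with ThreadPoolExecutor(max_workers=2) as executor:
        executor.map(crawler, chunks)
    return "finished without error"


def run_and_consume(chunks):
    """Iterate over the results: the worker's exception is re-raised here."""
    with ThreadPoolExecutor(max_workers=2) as executor:
        try:
            for _ in executor.map(crawler, chunks):
                pass
        except RuntimeError as exc:
            return f"caught: {exc}"
    return "finished without error"


if __name__ == "__main__":
    chunks = [[1, 2], [3, 4]]
    print(run_without_consuming(chunks))
    print(run_and_consume(chunks))
```

If you need to know which chunk failed, executor.submit returns one Future per call, and future.result() or future.exception() can then be checked for each of them.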

2023-01-17

myzjeezk2#

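You may be calling send_keys() too early, before the <input> field has fully rendered.
Solution
Ideally, to send a character sequence to the element you should use WebDriverWait with element_to_be_clickable(). You can use any of the following locator strategies: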

Using NAME:

WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.NAME, "30_user_id"))).send_keys(id)

Using CSS_SELECTOR:

WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[name='30_user_id']"))).send_keys(id)

Using XPATH:

WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@name='30_user_id']"))).send_keys(id)

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
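
To make the idea behind an explicit wait concrete, here is a minimal pure-Python sketch of the polling pattern: call a condition repeatedly until it returns something truthy or a timeout expires. This is an illustration only, not Selenium's implementation. The helper wait_until and its parameters are hypothetical names, and the real WebDriverWait additionally passes the driver to the condition and ignores selected exceptions while polling.

```python
import time


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


if __name__ == "__main__":
    ready_at = time.monotonic() + 0.2  # pretend the field renders after 0.2 s

    def field_is_ready():
        return "input element" if time.monotonic() >= ready_at else None

    print(wait_until(field_is_ready, timeout=2.0, poll_interval=0.05))
```

Compared with a fixed time.sleep(2), polling returns as soon as the element is ready and fails loudly when it never appears.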

2023-01-17
