"Failed to read marionette port" with Selenium + Firefox + geckodriver + Python

Posted by q3qa4bjr on 2023-08-02 in Python

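I am running a Python script from PHP that does some scraping. The script runs fine from the Ubuntu terminal, but when it is launched through PHP the Apache log shows the following traceback: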

df=BUscrap(sys.argv[1])
  File "/var/www/scrapbot.chambeala.com/./realScrapper.py", line 261, in BUscrap
    driver=webdriver.Firefox(service=service, options=options)
  File "/var/www/myenv/lib/python3.10/site-packages/selenium/webdriver/firefox/webdriver.py", line 68, in __init__
    super().__init__(command_executor=executor, options=self.options)
  File "/var/www/myenv/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 206, in __init__
    self.start_session(capabilities)
  File "/var/www/myenv/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 291, in start_session
    response = self.execute(Command.NEW_SESSION, caps)["value"]
  File "/var/www/myenv/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 346, in execute
    self.error_handler.check_response(response)
  File "/var/www/myenv/lib/python3.10/site-packages/selenium/webdriver/remote/errorhandler.py", line 245, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: Failed to read marionette port

This is the part of the Python script where I configure and start geckodriver:

def BUscrap(limit=""):
    limit=int(limit)
    print("Capturando nuevos trabajos...")
    options=webdriver.FirefoxOptions()
    service = Service(executable_path='/usr/local/bin/geckodriver')
    options.add_argument('--headless')
    options.add_argument('--no-sandbox')
    options.add_argument('--disable-dev-shm-usage')
    options.add_argument(f'--proxy-server={proxy_server_url}')
    options.add_argument('user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.50 Safari/537.36')
    driver=webdriver.Firefox(service=service, options=options)
    driver.implicitly_wait(10)


This is the PHP call:

<?php
$cmd = escapeshellcmd("./realScrapper.py " .$_POST["ligas"]);
if(isset($_POST['ejecutar'])) {
    echo '<div class="output">';
    while (@ ob_end_flush());
    $proc = popen($cmd, 'r');
    echo '<pre>';
    while (!feof($proc))
    {
        echo fread($proc, 4096);
        @ flush();
    }
    echo '</pre>';
    $_POST = array();
    echo '</div>';
}
?>


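Firefox and geckodriver both have execute permissions, and the PHP code works correctly with other scripts. Firefox was not installed through snap; it was installed from the Debian package.
The versions are Python 3.10.6, Selenium 4.10.0, Firefox 107.0.1, and geckodriver 0.33.0.
Let me know if you need more information.
I am trying to run a Python scraping script from PHP, but the web page goes idle and then stops.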

mpgws1up1#

The following arguments are specific to ChromeDriver and do not apply to GeckoDriver, so you can remove them:

options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
options.add_argument(f'--proxy-server={proxy_server_url}')

Also replace:

options.add_argument('--headless')


with:

options.add_argument('--headless=new')


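as described in "DeprecationWarning: headless property is deprecated, instead use add_argument('--headless') or add_argument('--headless=new') on Selenium 4.8.0 Python", and then run your test. Note that `--headless=new` selects Chrome's newer headless mode; Firefox's own flag is `-headless` (`--headless` is also accepted), so if Firefox does not start headless with the new value, keep the original argument.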
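
Since the removed arguments also carried the proxy and user-agent settings, here is a minimal sketch of how those could be expressed in Firefox terms instead, using `set_preference` on `FirefoxOptions`. The helper names `firefox_proxy_prefs` and `build_firefox_options` are hypothetical, and the sketch assumes `proxy_server_url` from the question has the form `http://host:port`. It applies the same host and port to HTTP and HTTPS traffic, which may or may not match your proxy setup.

```python
from urllib.parse import urlsplit


def firefox_proxy_prefs(proxy_url):
    """Translate a proxy URL (assumed form: http://host:port) into
    Firefox manual-proxy preferences."""
    parts = urlsplit(proxy_url)
    if not parts.hostname or not parts.port:
        raise ValueError(f"proxy URL needs a host and a port: {proxy_url!r}")
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.http": parts.hostname,
        "network.proxy.http_port": parts.port,
        "network.proxy.ssl": parts.hostname,
        "network.proxy.ssl_port": parts.port,
    }


def build_firefox_options(proxy_url, user_agent):
    """Build headless FirefoxOptions without any Chrome-only arguments."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")
    # Firefox takes the user agent from a preference, not a command-line argument.
    options.set_preference("general.useragent.override", user_agent)
    for name, value in firefox_proxy_prefs(proxy_url).items():
        options.set_preference(name, value)
    return options
```

The returned object can be passed as `options=` to `webdriver.Firefox(service=service, options=...)` exactly as in the question's code. This only addresses the option handling; it does not by itself guarantee that Firefox can start under the Apache user.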
