selenium 将多线程Selify脚本部署到Heroku

sshcrbum  于 2022-11-10  发布在  其他
关注(0)|答案(1)|浏览(121)

我有一个多线程的Selify脚本,如下所示:

def price_scrape():
    options1 = webdriver.ChromeOptions()
    options1.add_experimental_option('excludeSwitches', ['enable-logging'])
    options1.add_argument("window-size=1920x1480")
    options1.add_argument("disable-dev-shm-usage")
    options1.add_argument("--headless")
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options1)

     #Something to scrape

def price_scrape_2():
    options1 = webdriver.ChromeOptions()
    options1.add_experimental_option('excludeSwitches', ['enable-logging'])
    options1.add_argument("window-size=1920x1480")
    options1.add_argument("disable-dev-shm-usage")
    options1.add_argument("--headless")
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options1)

        #somethin else to scrape

def price_scrape_3():
    options1 = webdriver.ChromeOptions()
    options1.add_experimental_option('excludeSwitches', ['enable-logging'])
    options1.add_argument("window-size=1920x1480")
    options1.add_argument("disable-dev-shm-usage")
    options1.add_argument("--headless")
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options1)

        #somethin else to scrape

if __name__ =="__main__":
        t1 = threading.Thread(target=price_scrape)
        t2 = threading.Thread(target=price_scrape_2)
        t3 = threading.Thread(target=price_scrape_3)
        t1.start()
        t2.start()
        t3.start()

        t1.join()
        t2.join()
        t3.join()

我的问题是,当我部署到Heroku并运行heroku logs --tail时,我收到的错误消息如下:

2022-11-02T02:57:41.047999+00:00 app[worker.1]: Traceback (most recent call last):
2022-11-02T02:57:41.048043+00:00 app[worker.1]: Thread-1 (price_scrape)  File "/app/.heroku/python/lib/python3.10/site-packages/webdriver_manager/core/archive.py", line 39, in __extract_zip
2022-11-02T02:57:41.048054+00:00 app[worker.1]: :
2022-11-02T02:57:41.048073+00:00 app[worker.1]: Traceback (most recent call last):
2022-11-02T02:57:41.048110+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/site-packages/webdriver_manager/core/archive.py", line 39, in __extract_zip
2022-11-02T02:57:41.048232+00:00 app[worker.1]: archive.extractall(to_directory)
2022-11-02T02:57:41.048256+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/zipfile.py", line 1645, in extractall
2022-11-02T02:57:41.048661+00:00 app[worker.1]: archive.extractall(to_directory)
2022-11-02T02:57:41.048713+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/zipfile.py", line 1645, in extractall
2022-11-02T02:57:41.048880+00:00 app[worker.1]: self._extract_member(zipinfo, path, pwd)
2022-11-02T02:57:41.048964+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/zipfile.py", line 1700, in _extract_member
2022-11-02T02:57:41.049794+00:00 app[worker.1]: shutil.copyfileobj(source, target)
2022-11-02T02:57:41.049808+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/shutil.py", line 195, in copyfileobj
2022-11-02T02:57:41.049846+00:00 app[worker.1]: self._extract_member(zipinfo, path, pwd)
2022-11-02T02:57:41.049960+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/zipfile.py", line 1700, in _extract_member
2022-11-02T02:57:41.049965+00:00 app[worker.1]: buf = fsrc_read(length)
2022-11-02T02:57:41.049967+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/zipfile.py", line 925, in read
2022-11-02T02:57:41.050291+00:00 app[worker.1]: data = self._read1(n)
2022-11-02T02:57:41.050294+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/zipfile.py", line 993, in _read1
2022-11-02T02:57:41.050918+00:00 app[worker.1]: data += self._read2(n - len(data))
2022-11-02T02:57:41.051014+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/zipfile.py", line 1028, in _read2
2022-11-02T02:57:41.051474+00:00 app[worker.1]: shutil.copyfileobj(source, target)    raise EOFError
2022-11-02T02:57:41.051510+00:00 app[worker.1]: EOFError
2022-11-02T02:57:41.051512+00:00 app[worker.1]:
2022-11-02T02:57:41.051514+00:00 app[worker.1]:
2022-11-02T02:57:41.051515+00:00 app[worker.1]: During handling of the above exception, another exception occurred:
2022-11-02T02:57:41.051515+00:00 app[worker.1]:
2022-11-02T02:57:41.051519+00:00 app[worker.1]: Traceback (most recent call last):
2022-11-02T02:57:41.051530+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
2022-11-02T02:57:41.051559+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/shutil.py", line 195, in copyfileobj
2022-11-02T02:57:41.052007+00:00 app[worker.1]: buf = fsrc_read(length)
2022-11-02T02:57:41.052026+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/zipfile.py", line 925, in read
2022-11-02T02:57:41.052230+00:00 app[worker.1]: self.run()
2022-11-02T02:57:41.052279+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/threading.py", line 953, in run
2022-11-02T02:57:41.052390+00:00 app[worker.1]: data = self._read1(n)
2022-11-02T02:57:41.052401+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/zipfile.py", line 993, in _read1
2022-11-02T02:57:41.052971+00:00 app[worker.1]: self._target(*self._args,**self._kwargs)
2022-11-02T02:57:41.052995+00:00 app[worker.1]: File "/app/recycle.py", line 152, in price_scrape_2
2022-11-02T02:57:41.053040+00:00 app[worker.1]: data += self._read2(n - len(data))
2022-11-02T02:57:41.053043+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/zipfile.py", line 1028, in _read2
2022-11-02T02:57:41.053261+00:00 app[worker.1]: driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options1)
2022-11-02T02:57:41.053429+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/site-packages/webdriver_manager/chrome.py", line 39, in install
2022-11-02T02:57:41.053644+00:00 app[worker.1]: driver_path = self._get_driver_path(self.driver)
2022-11-02T02:57:41.053653+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/site-packages/webdriver_manager/core/manager.py", line 31, in _get_driver_path
2022-11-02T02:57:41.054487+00:00 app[worker.1]: raise EOFError
2022-11-02T02:57:41.054487+00:00 app[worker.1]: EOFError
2022-11-02T02:57:41.054488+00:00 app[worker.1]:
2022-11-02T02:57:41.054488+00:00 app[worker.1]: During handling of the above exception, another exception occurred:
2022-11-02T02:57:41.054489+00:00 app[worker.1]:
2022-11-02T02:57:41.054489+00:00 app[worker.1]: Traceback (most recent call last):
2022-11-02T02:57:41.054490+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
2022-11-02T02:57:41.054490+00:00 app[worker.1]: binary_path = self.driver_cache.save_file_to_cache(driver, file)
2022-11-02T02:57:41.054491+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/site-packages/webdriver_manager/core/driver_cache.py", line 46, in save_file_to_cache   
2022-11-02T02:57:41.054491+00:00 app[worker.1]: files = archive.unpack(path)
2022-11-02T02:57:41.054491+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/site-packages/webdriver_manager/core/archive.py", line 30, in unpack
2022-11-02T02:57:41.054492+00:00 app[worker.1]: self.run()
2022-11-02T02:57:41.054492+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/threading.py", line 953, in run
2022-11-02T02:57:41.054612+00:00 app[worker.1]: self._target(*self._args,**self._kwargs)
2022-11-02T02:57:41.054623+00:00 app[worker.1]: File "/app/recycle.py", line 35, in price_scrape
2022-11-02T02:57:41.054841+00:00 app[worker.1]: return self.__extract_zip(directory)driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options1)2022-11-02T02:57:41.054841+00:00 app[worker.1]:
2022-11-02T02:57:41.054859+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/site-packages/webdriver_manager/core/archive.py", line 41, in __extract_zip
2022-11-02T02:57:41.054867+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/site-packages/webdriver_manager/chrome.py", line 39, in install
2022-11-02T02:57:41.054992+00:00 app[worker.1]: if e.args[0] not in [26, 13] and e.args[1] not in [
2022-11-02T02:57:41.055115+00:00 app[worker.1]: IndexError: tuple index out of range
2022-11-02T02:57:41.055982+00:00 app[worker.1]: driver_path = self._get_driver_path(self.driver)
2022-11-02T02:57:41.056133+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/site-packages/webdriver_manager/core/manager.py", line 31, in _get_driver_path
2022-11-02T02:57:41.056431+00:00 app[worker.1]: binary_path = self.driver_cache.save_file_to_cache(driver, file)
2022-11-02T02:57:41.056471+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/site-packages/webdriver_manager/core/driver_cache.py", line 46, in save_file_to_cache   
2022-11-02T02:57:41.060561+00:00 app[worker.1]: files = archive.unpack(path)
2022-11-02T02:57:41.060586+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/site-packages/webdriver_manager/core/archive.py", line 30, in unpack
2022-11-02T02:57:41.062080+00:00 app[worker.1]: return self.__extract_zip(directory)
2022-11-02T02:57:41.062082+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.10/site-packages/webdriver_manager/core/archive.py", line 41, in __extract_zip
2022-11-02T02:57:41.062198+00:00 app[worker.1]: if e.args[0] not in [26, 13] and e.args[1] not in [
2022-11-02T02:57:41.062221+00:00 app[worker.1]: IndexError: tuple index out of range

我不知道这是什么意思,所以我不能真正解决这个问题。
我的脚本在本地运行得很好,但它总是给我这个错误。我刚刚开始使用Heroku,所以这一切对我来说都是新的。请帮帮我。

3pmvbmvn

3pmvbmvn1#

Search This:IndexError:元组索引超出范围

相关问题