Chrome: My Python code to web scrape and download five images is not working, using Blender (3D) as the IDE

58wvjzkj posted on 2023-09-28 in Go

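I am running this code to web scrape and download 5 images from Google. What should happen is that, after I run the code, the Chrome browser appears, the code causes the mouse to click on an image and download it, then scrolls down to another image, clicks and downloads it, and so on, up to five times. Instead, the Chrome browser appears for a second, closes, and nothing else happens, although the code does not throw any Python errors.
I am using the Blender 3D modeling software as my IDE because I hope to use Python code to make Blender add-ons in the future (a Blender add-on is a bit like an extension in Google Chrome: a small piece of software you can install into Blender to extend its functionality). That is why there are extra import lines at the top of my code.
A related item is that I get this warning:
E:\GLOBAL ASSETS\SCRIPTING\Web Scraping Images\web-scraper.blend\web-scraper.py:21: DeprecationWarning: executable_path has been deprecated, please pass in a Service object
This is the only other information in the console after I run the code:
DevTools listening on ws://127.0.0.1:52643/devtools/browser/ea448f70-0066-4d50-bfb8-8671528789b8
Any help would be appreciated.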

import bpy
import subprocess
import sys
import os
import cv2
import random
from random import randrange
from PIL import Image #make sure both pil from c:\users\mjoe6\appdata\local\packages\pythonsoftwarefoundation.python.3.10_qbz5n2kfra8p0\localcache\local-packages\python310\site-packages is in blender pip3.exe folder
from selenium import webdriver
from selenium.webdriver.common.by import By
import requests
import io
import time

# path to python.exe
python_exe = os.path.join(sys.prefix, 'bin', 'python.exe')
py_lib = os.path.join(sys.prefix, 'lib', 'site-packages','pip')

PATH = "E:\\GLOBAL ASSETS\\SCRIPTING\\Web Scraping Images\\chromedriver.exe"
wd = webdriver.Chrome(PATH)

def get_images_from_google(wd, delay, max_images):
    def scroll_down(wd):
        wd.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(delay)
        
        url = "https://www.google.com/search?q=cats+2019+IMDb&sxsrf=ALiCzsZmBIp-JZmZv23v6ORoc0VL2NRuxg:1654543304286&source=lnms&tbm=isch&sa=X&ved=2ahUKEwiRjquPxpn4AhUymo4IHbC6Cx0Q_AUoAnoECAEQBA&biw=1536&bih=714&dpr=1.25#imgrc=ixH-uoDQFN_gpM"
        wd.get(url)
        
        image_urls = set()
        while len (image_urls) < max_images:
            scroll_down(wd)
            thumbnails = wd.find_elements(By.CLASS_NAME, "Q4LuWd")
            for img in thumbnails[len(image_urls):max_images]:
                try:
                    img.click()
                    time.sleep.delay
                except:
                    continue    
                images = wd.find_elements(By.CLASS_NAME, "n3VNCb")
                for image in images:
                    if image.get_attribute('src') and 'http' in image.get_attribute('src'):
                        image_urls.add(image.get_attribute('src'))
                        print(f"Found {len (image_urls)}")
        return image_urls
        
def download_image(download_path, url, file_name):
    try:
        image_content = requests.get(url).content
        image_file = io.BytesIO(image_content)
        image = Image.open(image_file)
        file_path = download_path + file_name
        with open(file_path, "wb") as f:
            image.save(f, "PNG")
        
        print("Success")
    except Exception as e:
        print('FAILED -', e)    

urls = get_images_from_google(wd, 1, 5)
print(urls)
wd.quit()
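
The deprecation warning quoted in the question refers to passing the chromedriver path directly to `webdriver.Chrome`. In Selenium 4 the path is wrapped in a `Service` object instead. The following is a minimal sketch of that change; the helper name `make_chrome_driver` is hypothetical and is not part of the original code, and the Selenium imports are placed inside the function only so the sketch can be loaded without Selenium installed.

```python
def make_chrome_driver(driver_path):
    """Create a Chrome driver by passing the driver path through a Service object.

    `make_chrome_driver` is a hypothetical helper name used for illustration.
    """
    # Imported lazily so this sketch can be loaded without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    # Selenium 4 style: the path goes into Service, and Service goes into Chrome.
    service = Service(executable_path=driver_path)
    return webdriver.Chrome(service=service)


# Usage, with the path from the question:
# wd = make_chrome_driver("E:\\GLOBAL ASSETS\\SCRIPTING\\Web Scraping Images\\chromedriver.exe")
```

This only addresses the warning; it does not by itself explain why no images are collected.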

google-chrome

Source: https://stackoverflow.com/questions/72536540/my-python-code-to-web-scrape-and-download-five-images-is-not-working-using-ble

1 Answer

hc2pp10m1#

The error is probably in these lines.

images = wd.find_elements(By.CLASS_NAME, "n3VNCb")

should be

images = wd.find_elements(By.CLASS_NAME, "iPVvYb")

Also, in this line

if image.get_attribute('src') and 'http' in image.get_attribute('src'):

http should be changed to https:

if image.get_attribute('src') and 'https' in image.get_attribute('src'):
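
Besides the class name, the code as shown in the question has two further problems that would produce the described behavior (this may partly be a copy artifact). Everything after `time.sleep(delay)` is indented inside `scroll_down`, so `get_images_from_google` only defines the helper and returns `None`. Also, `time.sleep.delay` raises an `AttributeError` that the bare `except` silently swallows. The sketch below restructures the loop under those assumptions. The function name `collect_image_urls`, the `url` parameter and the `max_rounds` guard are hypothetical additions, and the class names `Q4LuWd` and `iPVvYb` are taken from the question and this answer; Google changes them over time, so they may no longer match.

```python
import time

try:
    from selenium.webdriver.common.by import By
    CLASS_NAME = By.CLASS_NAME
except ImportError:
    # Fallback so the sketch loads without Selenium; same value as By.CLASS_NAME.
    CLASS_NAME = "class name"

THUMB_CLASS = "Q4LuWd"  # thumbnail class from the question (may be outdated)
FULL_CLASS = "iPVvYb"   # enlarged-image class from this answer (may be outdated)


def collect_image_urls(wd, url, delay, max_images, max_rounds=10):
    """Click thumbnails and collect the https URLs of the enlarged images.

    `max_rounds` is a hypothetical guard so the loop ends even if nothing matches.
    """
    def scroll_down():
        wd.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(delay)

    # These statements now sit at function level, not inside scroll_down.
    wd.get(url)
    image_urls = set()
    for _ in range(max_rounds):
        if len(image_urls) >= max_images:
            break
        scroll_down()
        thumbnails = wd.find_elements(CLASS_NAME, THUMB_CLASS)
        for thumb in thumbnails[len(image_urls):max_images]:
            try:
                thumb.click()
                time.sleep(delay)  # the question had `time.sleep.delay`
            except Exception:
                continue
            for image in wd.find_elements(CLASS_NAME, FULL_CLASS):
                src = image.get_attribute("src")
                if src and "https" in src:
                    image_urls.add(src)
    return image_urls
```

With a real driver this would be called as `collect_image_urls(wd, url, 1, 5)`. Each returned URL could then be passed to the `download_image` function from the question, which the original script defines but never calls.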

2023-09-28
