Saving images while crawling a website in Selenium

Posted by 6jjcrrmo on 2022-11-29 in Other

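I want to download images like the ones that can be found on this page.
I need to download all of the images, each one exactly once.
Here is the code I am using: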

links = []
wait = WebDriverWait(driver, 5)
all_images = wait.until(
    EC.presence_of_all_elements_located((By.XPATH, "//div[contains(@class,'swiper-button-next swiper-button-white')]")))

for image in all_images:
    a = image.get_attribute('style')
    b = a.split("(")[1].split(")")[0].replace('"', '')
    links.append(b)

all_images = wait.until(
    EC.presence_of_all_elements_located((By.XPATH, "//div[contains(@class,'swiper-slide swiper-slide-visible swiper-slide-active swiper-slide-thumb-active')]")))

for image in all_images:
    a = image.get_attribute('style')
    b = a.split("(")[1].split(")")[0].replace('"', '')
    links.append(b)

all_images = wait.until(
    EC.presence_of_all_elements_located((By.XPATH, "//div[contains(@class,'swiper-slide swiper-slide-visible')]")))

for image in all_images:
    a = image.get_attribute('style')
    b = a.split("(")[1].split(")")[0].replace('"', '')
    links.append(b)

index = 1
for i in range(len(links)//2 + 1):
    with open(title.replace(' ', '-') + str(index) + '.jpg', 'wb') as file:
        im = requests.get(links[i])
        file.write(im.content)
        print('Saving image.. ', title + str(index))
    index += 1

The problem is that this saves some images repeatedly and does not save others, and I don't know how to fix it.

selenium

Source: https://stackoverflow.com/questions/69242320/saving-images-while-crawling-website-in-selenium

1 Answer

wsxa1bj11#

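You are using the wrong locators.
Also, presence_of_all_elements_located does not wait for all elements; it returns as soon as at least one matching element is present.
Moreover, a presence condition only waits for the element to exist in the DOM, which may not be enough. It is better to use visibility_of_element_located instead.
I think the following code will work better: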

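import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
links = []
wait = WebDriverWait(driver, 20)
wait.until(EC.visibility_of_element_located((By.XPATH, "//div[contains(@class,'swiper-slide')]")))
time.sleep(0.5)
# find_elements_by_xpath was removed in Selenium 4.3; find_elements(By.XPATH, ...) replaces it.
all_images = driver.find_elements(By.XPATH, "//div[contains(@class,'swiper-slide')]")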
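
To get each image exactly once, the collected URLs still need to be de-duplicated before downloading, since a broad locator can match the same slide (or two slides with the same picture) more than once. The following is only a minimal sketch, not the definitive fix. It assumes, as the split("(") logic in the question suggests, that each slide carries its image as url(...) inside the style attribute. The helper names extract_url, unique_image_urls and save_images are hypothetical, and urllib from the standard library stands in for requests so the sketch has no third-party dependencies.

```python
import re
from pathlib import Path
from urllib.request import urlopen

# Matches the address inside a CSS value such as: background-image: url("https://...");
_URL_IN_STYLE = re.compile(r"url\(\s*['\"]?(.*?)['\"]?\s*\)")


def extract_url(style):
    """Return the URL found inside url(...) in a style string, or None if there is none."""
    match = _URL_IN_STYLE.search(style or "")
    return match.group(1) if match else None


def unique_image_urls(styles):
    """Collect URLs from style strings, dropping duplicates but keeping first-seen order."""
    seen = set()
    urls = []
    for style in styles:
        url = extract_url(style)
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def save_images(urls, title, directory=".", fetch=None):
    """Write every URL to <title><n>.jpg exactly once and return the saved paths.

    `fetch` is a hypothetical hook (URL -> bytes) so the download step can be
    swapped out, e.g. for requests.get(url).content.
    """
    if fetch is None:
        fetch = lambda url: urlopen(url).read()
    saved = []
    # Iterate over the whole list; enumerate supplies the 1-based file index.
    for index, url in enumerate(urls, start=1):
        path = Path(directory) / f"{title.replace(' ', '-')}{index}.jpg"
        path.write_bytes(fetch(url))
        saved.append(str(path))
    return saved
```

With the elements located above, this could be used as unique_image_urls(el.get_attribute('style') for el in all_images), followed by save_images(urls, title). Looping over the de-duplicated list directly also avoids the range(len(links)//2 + 1) guess in the question, which depends on how many duplicates happened to be collected.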

Answered on 2022-11-29
