numpy: More efficient way of comparing images in Python

Posted by gfttwv5a on 2023-01-02 in Python


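Problem:
I have about 10,000 images that I need to compare with each other. My current program compares about 60 images per second, but at that rate it would take almost 9 days of runtime to finish. I tried using C++, but the code ended up taking 3 times as long as the Python version.
Question:
Is there a faster or more efficient way to compare images? I am fine with using other languages and other libraries.
Code: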

from PIL import Image
from PIL import ImageChops
import math, operator
from functools import reduce
import os

def rmsdiff(image_1, image_2):
    h = ImageChops.difference(image_1, image_2).histogram()
    return math.sqrt(reduce(operator.add, map(lambda h, i: i%256*(h**2), h, range(len(h)))) / (float(image_1.size[0]) * image_1.size[1]))

current = 0
try:
    dire = "C:\\Users\\Nikola\\Downloads\\photos"
    photos = os.listdir(dire)
    for idx, val in enumerate(photos):
        if val == "":
            start = idx
            break
    for photo_1 in range(start,len(photos)):
        if "." not in photos[photo_1]:
            continue
        print(f'Image: {photos[photo_1]}')
        with Image.open(dire+"\\"+photos[photo_1]) as image_1:
            image_1 = image_1.resize((16,16))
            for photo_2 in range(photo_1+1, len(photos)):
                current = photos[photo_2]
                try:
                    if photos[photo_2][-4] != "." and photos[photo_2][-5] != ".":
                        continue
                except:
                    continue
                with Image.open(dire+"\\"+photos[photo_2]) as image_2:
                    image_2 = image_2.resize((16,16))
                    try:
                        value = rmsdiff(image_1, image_2)
                        if value < 12:
                            print(f'Similar Image: {photos[photo_1]}')
                            continue
                    except:
                        pass
except KeyboardInterrupt:
    print()
    print(current)


Source: https://stackoverflow.com/questions/74974064/more-efficient-way-of-comparing-images-in-python

2 Answers


xkrw2x1b #1

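As per my comment, I suspect that loading and resizing take the most time, so that is where I would aim to optimise.
I am not near a Python interpreter at the moment to test this properly, but something along these lines: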

from functools import lru_cache
from PIL import Image

# Cache the resized image so each file is opened and resized only once
@lru_cache(maxsize=None)
def loadImage(filename):
    im = Image.open(filename)
    im = im.resize((16,16))
    return im

That alone should already make a massive difference. Then adjust it to use "draft" mode, something like:

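@lru_cache(maxsize=None)
def loadImage(filename):
    im = Image.open(filename)
    # draft() only takes effect for JPEG and MPO files; other formats ignore it
    im.draft('RGB',(32,32))
    im = im.resize((16,16))
    return im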

If your laptop has a decent CPU, you could also multi-thread the loading.
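A minimal sketch of what multi-threaded loading could look like, assuming Pillow is installed. The names `load_thumbnail` and `load_all` and the default of 8 workers are illustrative choices, not part of the answer. Threads only help to the extent that file I/O and Pillow's decoding release the GIL, so the actual gain should be measured; a process pool is an alternative if threads do not scale.

```python
from concurrent.futures import ThreadPoolExecutor

from PIL import Image


def load_thumbnail(path, size=(16, 16)):
    """Open one image file and return a small RGB copy of it (hypothetical helper)."""
    with Image.open(path) as im:
        # Ask the decoder for a reduced-size version; only JPEG/MPO honour this.
        im.draft("RGB", (size[0] * 2, size[1] * 2))
        # convert() forces the pixel data to load while the file is still open.
        return im.convert("RGB").resize(size)


def load_all(paths, workers=8):
    """Load every thumbnail exactly once, in parallel, keyed by file path."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of `paths`.
        return dict(zip(paths, pool.map(load_thumbnail, paths)))
```

With the thumbnails held in a dictionary, the comparison loop can look images up by path instead of reopening and resizing the second image on every inner iteration.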

Answered 2023-01-02

sh7euo9m #2

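Your problem is unusual in that you have to read the image data itself in order to compare, which should not be necessary in most cases. It would make the most sense if you had some metadata to compare by.
That said, here are some very different approaches to speed up the process.
1. You can parallelize your code to get a speedup of up to the number of cores.
2. You can replace RMSE with a simple abs(diff) or some other cheaper distance function, which saves a lot of computation time.
3. You can write your own diff method that stops calculating once it exceeds a certain diff threshold. This requires the function to be compiled, or at least JIT-compiled.
4. If you can precompute some dimensionality reduction for each image, you can perform the comparison in the lower dimension. For example, sum the rows to get one column of sums per image, and compare those columns instead of the full images. Then run the full-image comparison only for images whose lower-dimensional representations are similar.
5. If many of your images are identical, you can group them. Once you have compared against any one image in a group, you do not have to repeat the comparison for the other images in that group.
6. You may get a quick speedup by properly using cdist to compute all-to-all distances.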
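As a rough illustration of the all-to-all idea in the last point, here is a sketch that uses plain NumPy and assumes each thumbnail has already been flattened into one row of a 2-D array. The function name `similar_pairs` and the `block` parameter are hypothetical. `scipy.spatial.distance.cdist(x, x)` would give the same Euclidean distances; the block-wise form is used here only to keep memory bounded. The distance is the ordinary per-pixel RMS difference, which is not numerically the same as the histogram-based `rmsdiff` in the question, so a threshold such as 12 would need re-tuning.

```python
import numpy as np


def similar_pairs(features, threshold, block=1024):
    """Return index pairs (i, j), i < j, whose RMS difference is below threshold.

    features: array of shape (n_images, n_values), one flattened thumbnail per row.
    """
    x = np.asarray(features, dtype=np.float64)
    n, d = x.shape
    sq_norms = np.einsum("ij,ij->i", x, x)  # squared length of every row
    limit = threshold ** 2 * d  # RMS < t  <=>  squared distance < t^2 * d
    pairs = []
    for start in range(0, n, block):
        stop = min(start + block, n)
        # Squared Euclidean distances from rows [start, stop) to all rows,
        # via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
        d2 = (sq_norms[start:stop, None] + sq_norms[None, :]
              - 2.0 * (x[start:stop] @ x.T))
        rows, cols = np.nonzero(d2 < limit)
        rows += start
        keep = rows < cols  # report each unordered pair once, skip self-matches
        pairs.extend(zip(rows[keep].tolist(), cols[keep].tolist()))
    return pairs
```

For 10,000 thumbnails of 16x16 RGB pixels the feature array has 768 columns, and each block costs one matrix multiplication rather than thousands of Python-level comparisons.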

Answered 2023-01-02
