TensorFlow: does tf.image.resize still fail to align corners?

k5hmc34c  posted on 2023-03-16  in  Python
Follows (0) | Answers (4) | Views (156)

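I read a blog post on Hackernoon arguing that TensorFlow's tf.image.resize_area() function is not reflection equivariant, so resizing images in a data augmentation step could seriously disrupt model training.
The author goes on to say that users should not use any of the tf.image.resize functions, because their behavior can be unpredictable. The article was published in January 2018, so it is not that old. I checked the article's comment section, and nobody there mentions that the problems have been fixed.
I would like to know whether these problems still exist and what the workaround is. Has anything changed in later versions of TensorFlow? For example, can I use the tf.keras augmentation functions to avoid these problems?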
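To make the concern concrete, here is a minimal pure-Python sketch of what "reflection equivariant" means for a resize: flipping and then resizing should give the same result as resizing and then flipping. It is a simplified 1-D linear-interpolation model, not tf.image.resize_area or any actual TensorFlow kernel, and the names `resize_linear_1d` and `is_flip_equivariant` are hypothetical helpers introduced only for this illustration. The `half_pixel_centers=False` branch models the older convention of scaling the output index directly into the input grid.

```python
import math


def resize_linear_1d(row, out_len, half_pixel_centers):
    """Linearly resample a 1-D list of numbers to out_len samples (toy model)."""
    in_len = len(row)
    scale = in_len / out_len
    out = []
    for i in range(out_len):
        if half_pixel_centers:
            # Treat every sample as the centre of a pixel of width 1.
            src = (i + 0.5) * scale - 0.5
        else:
            # Older convention: scale the output index straight into the input grid.
            src = i * scale
        # Clamp so that border samples repeat the edge value.
        src = min(max(src, 0.0), in_len - 1.0)
        lo = int(math.floor(src))
        hi = min(lo + 1, in_len - 1)
        frac = src - lo
        out.append(row[lo] * (1.0 - frac) + row[hi] * frac)
    return out


def is_flip_equivariant(row, out_len, half_pixel_centers):
    """Check that resize(flip(row)) equals flip(resize(row))."""
    resized_then_flipped = resize_linear_1d(row, out_len, half_pixel_centers)[::-1]
    flipped_then_resized = resize_linear_1d(row[::-1], out_len, half_pixel_centers)
    return resized_then_flipped == flipped_then_resized


# A single bright pixel, downsampled from 8 samples to 4.
row = [0, 0, 1, 0, 0, 0, 0, 0]
legacy_ok = is_flip_equivariant(row, 4, half_pixel_centers=False)
half_pixel_ok = is_flip_equivariant(row, 4, half_pixel_centers=True)
print(legacy_ok, half_pixel_ok)
```

In this toy model the older mapping samples input indices 0, 2, 4 and 6, so the bright pixel is kept for the original row but missed entirely for the flipped one, while the half-pixel mapping treats both orientations symmetrically.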

ycggw6v2 1#

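After first reading the Hackernoon article you referenced, I also came across this article, which gives a good summary of the different bilinear interpolation implementations in OpenCV, TF 1.X, and some other DL frameworks.
I could not find anything about this in the TF 2.0 documentation, so I reproduced the example given in that article to test bilinear interpolation in 2.0. The test passes when I run the following code with TensorFlow 2.0. So it appears that moving to TF 2.0 gives you a bilinear interpolation implementation that matches OpenCV's (and therefore addresses the issue raised in the Hackernoon article):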

import numpy as np
import tensorflow as tf


def test_tf2_resample_upsample_matches_opencv_methodology():
    """
    According to the article below, the Tensorflow 1.x implementation of bilinear interpolation for resizing images did
    not reproduce the pixel-area-based approach adopted by OpenCV. The `align_corners` option was set to False by
    default due to some questionable legacy reasons but users were advised to set it to True in order to get a
    'reasonable' output: https://jricheimer.github.io/tensorflow/2019/02/11/resize-confusion/
    This appears to have been fixed in TF 2.0 and this test confirms that we get the results one would expect from a
    pixel-area-based technique.

    We start with an input array whose values are equivalent to their column indices:
    input_arr = np.array([
        [[0], [1], [2], [3], [4], [5]],
        [[0], [1], [2], [3], [4], [5]],
    ])

    And then resize this (holding the rows dimension constant in size, but increasing the column dimension to 12) to
    reproduce the OpenCV example from the article. We expect this to produce the following output:
    expected_output = np.array([
        [[0], [0.25], [0.75], [1.25], [1.75], [2.25], [2.75], [3.25], [3.75], [4.25], [4.75], [5]],
        [[0], [0.25], [0.75], [1.25], [1.75], [2.25], [2.75], [3.25], [3.75], [4.25], [4.75], [5]],
    ])

    """
    input_tensor = tf.convert_to_tensor(
        np.array([
            [[0], [1], [2], [3], [4], [5]],
            [[0], [1], [2], [3], [4], [5]],
        ]),
        dtype=tf.float32,
    )
    output_arr = tf.image.resize(
        images=input_tensor,
        size=(2,12),
        method=tf.image.ResizeMethod.BILINEAR).numpy()
    expected_output = np.array([
        [[0], [0.25], [0.75], [1.25], [1.75], [2.25], [2.75], [3.25], [3.75], [4.25], [4.75], [5]],
        [[0], [0.25], [0.75], [1.25], [1.75], [2.25], [2.75], [3.25], [3.75], [4.25], [4.75], [5]],
    ])
    np.testing.assert_almost_equal(output_arr, expected_output, decimal=2)
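
The expected values in the test above can be derived by hand from the half-pixel-center convention. The following pure-Python sketch is only a 1-D model of that arithmetic, not TensorFlow's or OpenCV's actual implementation, and `resize_linear_1d` is a hypothetical helper name. It also shows, for comparison, what the older mapping (output index scaled directly into the input grid) gives for the same row.

```python
import math


def resize_linear_1d(row, out_len, half_pixel_centers=True):
    """Linearly resample a 1-D list of numbers to out_len samples (toy model)."""
    in_len = len(row)
    scale = in_len / out_len
    out = []
    for i in range(out_len):
        if half_pixel_centers:
            # Map the centre of output pixel i back to input coordinates.
            src = (i + 0.5) * scale - 0.5
        else:
            # Older convention: no half-pixel offset.
            src = i * scale
        # Clamp so that samples outside the grid repeat the edge value.
        src = min(max(src, 0.0), in_len - 1.0)
        lo = int(math.floor(src))
        hi = min(lo + 1, in_len - 1)
        frac = src - lo
        out.append(row[lo] * (1.0 - frac) + row[hi] * frac)
    return out


row = [0, 1, 2, 3, 4, 5]
half_pixel = resize_linear_1d(row, 12, half_pixel_centers=True)
legacy = resize_linear_1d(row, 12, half_pixel_centers=False)
print(half_pixel)  # [0.0, 0.25, 0.75, 1.25, 1.75, 2.25, 2.75, 3.25, 3.75, 4.25, 4.75, 5.0]
print(legacy)      # [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.0]
```

The half-pixel result is symmetric about the middle of the row, whereas the older mapping repeats the last value only at the right edge, which is the kind of one-sided shift the articles describe.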

5ssjco0h 2#

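I tested tf.image.resize on a real image, but I could not get the same image as with other libraries. So be prepared to try different libraries depending on your training results. See here for details.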

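import cv2
import numpy as np
import tensorflow as tf

# image_path and IMAGE_SIZE (height, width) are assumed to be defined elsewhere

# we may get the same image
image_tf = tf.io.read_file(str(image_path))
image_tf = tf.image.decode_jpeg(image_tf, channels=3, dct_method='INTEGER_ACCURATE')

image_cv = cv2.imread(str(image_path))
image_cv = cv2.cvtColor(image_cv, cv2.COLOR_BGR2RGB)

# cast to a signed type first so the uint8 subtraction cannot wrap around
np.sum(np.abs(image_cv.astype(np.int32) - image_tf.numpy().astype(np.int32)))  # 0 in my test

# but not the same resized image
image_tf_res = tf.image.resize(image_tf, IMAGE_SIZE, method='bilinear').numpy()
# cv2.resize expects (width, height), so reverse the (height, width) pair
image_cv_res = cv2.resize(image_cv, tuple(IMAGE_SIZE)[::-1], interpolation=cv2.INTER_LINEAR)

# this is NOT 0
# (the original comparison also used a PIL result, image_pil_res, whose code is not shown here)
np.sum(np.abs(image_cv_res.astype(np.float32) - image_tf_res))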

scyqe7ek 3#

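I just ran into this problem and did some testing of my own, using an image similar to the one in the Hackernoon article. You can find a short notebook with my findings here. As of TF v2.3.1 the pixel shift appears to have been fixed, but the interpolation still differs noticeably (for example in coloring) from PIL and scikit-image.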

uyhoqukh 4#

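In TensorFlow, align_corners is a dedicated parameter of the TF1-compatible resize function (the TF 2.x tf.image.resize does not take it); you can check the documentation here.
Here is a demo: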

tf.compat.v1.image.resize(
    images,
    size,
    method='bilinear',
    align_corners=True,
)
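
As a rough illustration of what align_corners=True means, here is a pure-Python 1-D sketch of the coordinate mapping, assuming plain linear interpolation. It is not TensorFlow's implementation, and `resize_align_corners_1d` is a hypothetical helper name. With aligned corners the scale is (in_len - 1) / (out_len - 1), so the first and last output samples land exactly on the first and last input samples.

```python
def resize_align_corners_1d(row, out_len):
    """Linearly resample a 1-D list so that both end samples are preserved (toy model)."""
    in_len = len(row)
    # Corner-aligned scale; a single output sample simply takes the first input.
    scale = (in_len - 1) / (out_len - 1) if out_len > 1 else 0.0
    out = []
    for i in range(out_len):
        src = i * scale
        lo = min(int(src), in_len - 1)
        hi = min(lo + 1, in_len - 1)
        frac = src - lo
        out.append(row[lo] * (1.0 - frac) + row[hi] * frac)
    return out


aligned = resize_align_corners_1d([0, 1, 2, 3, 4, 5], 11)
print(aligned)  # [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
```

Note that this convention differs from the half-pixel-center one used by the TF 2.x tf.image.resize, so the two should not be expected to produce identical outputs for the same input.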
