TensorFlow: about using tf.image.crop_and_resize

Posted by xtfmy6hx on 2022-11-25 in Other

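I am building an ROI pooling layer for Fast R-CNN, and I am used to working with TensorFlow. I found that tf.image.crop_and_resize can act as the ROI pooling layer.
But I have tried many times and cannot get the result I expect. Or is the result I get actually the correct one?
Here is my code: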

import cv2
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt 

img_path = r'F:\IMG_0016.JPG'
img = cv2.imread(img_path)
img = img.reshape([1,580,580,3])
img = img.astype(np.float32)
#img = np.concatenate([img,img],axis=0)

img_ = tf.Variable(img) # img_ has shape [1,580,580,3] after the reshape above
boxes = tf.Variable([[100,100,300,300],[0.5,0.1,0.9,0.5]])
box_ind = tf.Variable([0,0])
crop_size = tf.Variable([100,100])

#b = tf.image.crop_and_resize(img,[[0.5,0.1,0.9,0.5]],[0],[50,50])
c = tf.image.crop_and_resize(img_,boxes,box_ind,crop_size)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
a = c.eval(session=sess)

plt.imshow(a[0])
plt.imshow(a[1])

Here are my original image and the results: a0, a1
If I am wrong, can anyone teach me how to use this function? Thanks.

tensorflow

Source: https://stackoverflow.com/questions/51843509/about-use-tf-image-crop-and-resize

5 Answers

#1 dpiehjr4

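Actually, there is no problem with TensorFlow here.
From the documentation of tf.image.crop_and_resize (emphasis mine):
boxes: A Tensor of type float32. A 2-D tensor of shape [num_boxes, 4]. The i-th row of the tensor specifies the coordinates of a box in the box_ind[i] image and is specified in normalized coordinates [y1, x1, y2, x2]. A normalized coordinate value of y is mapped to the image coordinate at y * (image_height - 1), so the [0, 1] interval of normalized image height is mapped to [0, image_height - 1] in image height coordinates. We do allow y1 > y2, in which case the sampled crop is an up-down flipped version of the original image. The width dimension is treated similarly. Normalized coordinates outside the [0, 1] range are allowed, in which case we use extrapolation_value to extrapolate the input image values.
The boxes argument needs normalized coordinates. That is why you get a black box with your first set of coordinates, [100,100,300,300] (not normalized, and no extrapolation value provided), but not with your second set, [0.5,0.1,0.9,0.5].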
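To make the normalization concrete, here is a minimal sketch that converts a pixel box into the normalized form the documentation describes. The helper name normalize_box is hypothetical (it is not part of TensorFlow), and the sketch assumes the pixel box is ordered [y1, x1, y2, x2] and that the image is 580x580, as in the question.

```python
def normalize_box(box_px, image_height, image_width):
    """Convert a pixel box [y1, x1, y2, x2] to normalized coordinates.

    Hypothetical helper, not a TensorFlow function. It inverts the documented
    mapping, in which a normalized y lands on pixel y * (image_height - 1).
    """
    y1, x1, y2, x2 = box_px
    return [
        y1 / (image_height - 1),
        x1 / (image_width - 1),
        y2 / (image_height - 1),
        x2 / (image_width - 1),
    ]


# The pixel box from the question, on the 580x580 image.
normalized = normalize_box([100, 100, 300, 300], 580, 580)
print(normalized)
```

The resulting list could then be used in boxes in place of [100,100,300,300].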
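As for why matplotlib shows gibberish on your second attempt, it is simply because you are using the wrong data type. Quoting the matplotlib documentation of plt.imshow (emphasis mine):
All values should be in the range [0 .. 1] for floats or [0 .. 255] for integers. Out-of-range values will be clipped to these bounds.
Since you are passing floats outside the [0,1] range, matplotlib clips every value above 1 down to 1. That is why you get those colored pixels (solid red, solid green, solid blue, or a mix of them). Cast your array to uint8 to get an image that makes sense.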

plt.imshow( a[1].astype(np.uint8))
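The clipping described above can be illustrated without matplotlib. The following is a small pure-Python sketch; the pixel value is made up for illustration, and the two helper functions are hypothetical, not matplotlib APIs.

```python
def clip_float_pixel(pixel):
    """Clip each float channel to [0, 1], which is how the imshow
    documentation says out-of-range float values are treated."""
    return tuple(min(max(c, 0.0), 1.0) for c in pixel)


def scale_pixel(pixel):
    """Rescale a 0..255 float pixel into the [0, 1] range expected for floats."""
    return tuple(c / 255.0 for c in pixel)


# A made-up float pixel in the 0..255 range, like the float32 image in the question.
pixel = (200.0, 0.0, 35.0)

clipped = clip_float_pixel(pixel)  # every non-zero channel saturates at 1.0
scaled = scale_pixel(pixel)        # channel proportions are preserved

print(clipped)
print(scaled)
```

After clipping, any channel above 1 becomes fully saturated, so only pure or mixed primary colors remain. Rescaling by 255 or casting to uint8 keeps the original proportions.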

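**Edit:** As requested, I will dig a little deeper into tf.image.crop_and_resize.

[When providing non-normalized coordinates and no extrapolation value], why do I just get a blank result?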

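Quoting the documentation:
Normalized coordinates outside the [0, 1] range are allowed, in which case we use extrapolation_value to extrapolate the input image values.
So normalized coordinates outside [0, 1] are allowed. But they still need to be normalized! In your example, [100,100,300,300], the coordinates you give form a huge square (drawn in red in the original illustration), and your original image is only the tiny green dot in its upper-left corner. The default value of the extrapolation_value argument is 0, so values outside the frame of the original image are filled in as [0,0,0], hence the black.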

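But if your use case needs another value, you can provide it. Pixels will take an RGB value of extrapolation_value%256 on each channel. This option is useful when the region you need to crop is not fully contained in your original image (a possible use case is sliding windows).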
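To see why the first box comes out entirely black, the sketch below computes where each output row would be sampled in the image. It is a simplified illustration, not the TensorFlow implementation: it assumes the output rows are spaced evenly between the two box edges after the documented mapping y * (image_height - 1), and it ignores interpolation. The function names are hypothetical.

```python
def sample_rows(y1, y2, image_height, crop_height):
    """Image-space row position of each output row, for crop_height > 1.

    Simplified sketch: assumes output rows are spaced evenly between the box
    edges after the documented mapping y -> y * (image_height - 1).
    """
    top = y1 * (image_height - 1)
    step = (y2 - y1) * (image_height - 1) / (crop_height - 1)
    return [top + i * step for i in range(crop_height)]


def count_outside(rows, image_height):
    """Number of sampled rows that fall outside [0, image_height - 1]."""
    return sum(1 for r in rows if r < 0 or r > image_height - 1)


H, CROP = 580, 100  # image height and crop size from the question

bad_rows = sample_rows(100, 300, H, CROP)    # unnormalized box from the question
good_rows = sample_rows(0.5, 0.9, H, CROP)   # normalized box from the question

print(count_outside(bad_rows, H), "of", CROP, "rows outside the image")
print(count_outside(good_rows, H), "of", CROP, "rows outside the image")
```

With the unnormalized box, every sampled row lies far beyond the last image row, so the whole crop is filled with extrapolation_value. With the normalized box, every row falls inside the image.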

2022-11-25

#2 xhv8bpkk

tf.image.crop_and_resize seems to expect pixel values in the range [0, 1].
Changing the code to

test = tf.image.crop_and_resize(image=image_np_expanded/255., ...)

solved the problem for me.

2022-11-25

#3 cpjpxq1n

Yet another variant is to use the tf.image.central_crop function.

2022-11-25

#4 a1o7rhls

Below is a concrete use of the tf.image.crop_and_resize API, with TensorFlow version 1.14:

import tensorflow as tf
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np

tf.enable_eager_execution()

def single_data_2(img_path):
    img = tf.read_file(img_path)
    img = tf.image.decode_bmp(img,channels=1)
    img_4d = tf.expand_dims(img, axis=0)
    processed_img = tf.image.crop_and_resize(
        img_4d,
        boxes=[[0.4529, 0.72, 0.4664, 0.7358]],  # normalized [y1, x1, y2, x2]
        box_ind=[0],
        crop_size=[64, 64])
    processed_img_2 = tf.squeeze(processed_img,0)
    raw_img_3 = tf.squeeze(img_4d,0)
    return raw_img_3, processed_img_2

def plot_two_image(raw,processed):
    fig=plt.figure(figsize=(35,35))
    raw_ = fig.add_subplot(1,2,1)
    raw_.set_title('Raw Image')
    raw_.imshow(raw,cmap='gray')
    processed_ = fig.add_subplot(1,2,2)
    processed_.set_title('Processed Image')
    processed_.imshow(processed,cmap='gray')

img_path = 'D:/samples/your_bmp_image.bmp'

raw_img, process_img  = single_data_2(img_path)
print(raw_img.dtype,process_img.dtype)
print(raw_img.shape,process_img.shape)
raw_img=tf.squeeze(raw_img,-1)
process_img=tf.squeeze(process_img,-1)
print(raw_img.dtype,process_img.dtype)
print(raw_img.shape,process_img.shape)
plot_two_image(raw_img,process_img)

2022-11-25

#5 dfuffjeb

Below is my working code. The output image is not black, so this may help someone:

for idx in range(len(bboxes)):
    if bscores[idx] >= Threshold:
      #Region of Interest
      y_min = int(bboxes[idx][0] * im_height)
      x_min = int(bboxes[idx][1] * im_width)
      y_max = int(bboxes[idx][2] * im_height)
      x_max = int(bboxes[idx][3] * im_width)

      class_label = category_index[int(bclasses[idx])]['name']
      class_labels.append(class_label)
      bbox.append([x_min, y_min, x_max, y_max, class_label, float(bscores[idx])])

      #Crop Image - Working Code
      # Cast to uint8, the dtype that tf.image.encode_jpeg expects
      cropped_image = tf.image.crop_to_bounding_box(image, y_min, x_min, y_max - y_min, x_max - x_min).numpy().astype(np.uint8)

      # encode_jpeg encodes a tensor of type uint8 to string
      output_image = tf.image.encode_jpeg(cropped_image)
      # decode_jpeg decodes the string tensor to a tensor of type uint8
      #output_image = tf.image.decode_jpeg(output_image)

      score = bscores[idx] * 100

      file_name = tf.constant(OUTPUT_PATH+image_name[:-4]+'_'+str(idx)+'_'+class_label+'_'+str(round(score))+'%'+'_'+os.path.splitext(image_name)[1])

      writefile = tf.io.write_file(file_name, output_image)

2022-11-25
