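I am trying to warp a frame from view 1 to view 2 using a ground-truth depth map, pose information, and the camera matrix. I have been able to remove most of the for loops and vectorize the code, except for one. When warping, multiple pixels in view 1 may map to a single location in view 2 because of occlusions. In that case, I need to pick the pixel with the lowest depth value (the foreground object). I have not been able to vectorize this part of the code. Any help vectorizing this for loop is appreciated.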

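Context:

I am trying to warp an image to a new view, given the ground-truth pose, depth, and camera matrix. After computing the warped locations, I round them. Any suggestions for implementing inverse bilinear interpolation are also welcome. My images are full HD resolution, so warping a frame to the new view takes a lot of time. If I can vectorize the code, I plan to port it to TensorFlow or PyTorch and run it on a GPU. Any other suggestions for speeding up the warping, or pointers to existing implementations, are also welcome.

Code: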

import numpy
from tqdm import tqdm


def warp_frame_04(frame1: numpy.ndarray, depth: numpy.ndarray, intrinsic: numpy.ndarray, transformation1: numpy.ndarray,
                  transformation2: numpy.ndarray, convert_to_uint: bool = True, verbose_log: bool = True):
    """
    Vectorized Forward warping. Nearest Neighbor.
    Offset requirement of warp_frame_03() overcome.
    mask: 1 if pixel found, 0 if no pixel found
    Drawback: Nearest neighbor, collision resolving not vectorized
    """
    height, width, _ = frame1.shape
    assert depth.shape == (height, width)
    transformation = numpy.matmul(transformation2, numpy.linalg.inv(transformation1))

    y1d = numpy.array(range(height))
    x1d = numpy.array(range(width))
    x2d, y2d = numpy.meshgrid(x1d, y1d)
    ones_2d = numpy.ones(shape=(height, width))
    ones_4d = ones_2d[:, :, None, None]
    pos_vectors_homo = numpy.stack([x2d, y2d, ones_2d], axis=2)[:, :, :, None]

    intrinsic_inv = numpy.linalg.inv(intrinsic)
    intrinsic_4d = intrinsic[None, None]
    intrinsic_inv_4d = intrinsic_inv[None, None]
    depth_4d = depth[:, :, None, None]
    trans_4d = transformation[None, None]

    unnormalized_pos = numpy.matmul(intrinsic_inv_4d, pos_vectors_homo)
    world_points = depth_4d * unnormalized_pos
    world_points_homo = numpy.concatenate([world_points, ones_4d], axis=2)
    trans_world_homo = numpy.matmul(trans_4d, world_points_homo)
    trans_world = trans_world_homo[:, :, :3]
    trans_norm_points = numpy.matmul(intrinsic_4d, trans_world)
    trans_pos = trans_norm_points[:, :, :2, 0] / trans_norm_points[:, :, 2:3, 0]
    trans_pos_int = numpy.round(trans_pos).astype('int')

    # Solve occlusions
    a = trans_pos_int.reshape(-1, 2)
    d = depth.ravel()
    b = numpy.unique(a, axis=0, return_index=True, return_counts=True)
    collision_indices = b[1][b[2] >= 2]  # Unique indices which are involved in collision
    for c1 in tqdm(collision_indices, disable=not verbose_log):
        cl = a[c1].copy()  # Collision Location
        ci = numpy.where((a[:, 0] == cl[0]) & (a[:, 1] == cl[1]))[0]  # Colliding Indices: Indices colliding for cl
        cci = ci[numpy.argmin(d[ci])]  # Closest Collision Index: Index of the nearest point among ci
        a[ci] = [-1, -1]
        a[cci] = cl
    trans_pos_solved = a.reshape(height, width, 2)

    # Offset both axes by 1 and set any out of frame motion to edge. Then crop 1-pixel thick edge
    trans_pos_offset = trans_pos_solved + 1
    trans_pos_offset[:, :, 0] = numpy.clip(trans_pos_offset[:, :, 0], a_min=0, a_max=width + 1)
    trans_pos_offset[:, :, 1] = numpy.clip(trans_pos_offset[:, :, 1], a_min=0, a_max=height + 1)

    warped_image = numpy.ones(shape=(height + 2, width + 2, 3)) * numpy.nan
    warped_image[trans_pos_offset[:, :, 1], trans_pos_offset[:, :, 0]] = frame1
    cropped_warped_image = warped_image[1:-1, 1:-1]
    mask = numpy.isfinite(cropped_warped_image)
    cropped_warped_image[~mask] = 0
    if convert_to_uint:
        final_warped_image = cropped_warped_image.astype('uint8')
    else:
        final_warped_image = cropped_warped_image
    mask = mask[:, :, 0]
    return final_warped_image, mask
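
Read off the code above, the per-pixel mapping appears to be the following. Here $p_1 = (x_1, y_1, 1)^\top$ is a homogeneous pixel position in view 1, $d$ is its depth, $K$ is the `intrinsic` matrix, and $T_1$, $T_2$ are the 4×4 `transformation1` and `transformation2` matrices, assumed to have last row $(0, 0, 0, 1)$. These symbols are introduced only for illustration and are not the asker's notation, so the equations referenced as [1, 2] may be written differently.

```latex
\begin{aligned}
P_1 &= d \, K^{-1} p_1, \\
\begin{bmatrix} P_2 \\ 1 \end{bmatrix} &= T_2 \, T_1^{-1} \begin{bmatrix} P_1 \\ 1 \end{bmatrix}, \\
\tilde{p}_2 &= K P_2, \qquad
(x_2, y_2) = \left( \frac{\tilde{p}_{2,x}}{\tilde{p}_{2,z}}, \; \frac{\tilde{p}_{2,y}}{\tilde{p}_{2,z}} \right).
\end{aligned}
```

The function then rounds $(x_2, y_2)$ to the nearest integer pixel, which is where several source pixels can end up on the same target location.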

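Code explanation

I use equations [1, 2] to get the pixel locations in view 2.
Once I have the pixel locations, I need to figure out whether there are any occlusions and, if so, pick the foreground pixel.
`b = numpy.unique(a, axis=0, return_index=True, return_counts=True)` gives me the unique locations.
If multiple pixels from view 1 map to a single pixel in view 2 (a collision), `return_counts` gives a value greater than 1.
`collision_indices = b[1][b[2] >= 2]` gives the indices involved in collisions. Note that this gives only one index per collision.
For each such collision point, `ci = numpy.where((a[:, 0] == cl[0]) & (a[:, 1] == cl[1]))[0]` gives the indices of all pixels from view 1 that map to the same point in view 2.
`cci = ci[numpy.argmin(d[ci])]` gives the index of the pixel with the lowest depth value.
`a[ci] = [-1, -1]` followed by `a[cci] = cl` maps all the other (background) pixels to location (-1, -1), which is outside the frame and is therefore ignored.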
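
One possible way to do the collision step described above without a Python loop is to sort the pixels by target location and depth, then keep only the first pixel of each group. This is a minimal sketch under that assumption, not the method from either answer. The function name `resolve_collisions` and its argument names are hypothetical.

```python
import numpy as np


def resolve_collisions(targets: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Keep only the nearest source pixel per target location.

    targets: (N, 2) integer array of (x, y) target locations.
    depth:   (N,) array of source depths.
    Returns a copy of targets in which every pixel that loses a
    collision is moved to (-1, -1).
    """
    # lexsort treats the LAST key as primary: sort by x, then y, then depth,
    # so within each target location the nearest pixel comes first.
    order = np.lexsort((depth, targets[:, 1], targets[:, 0]))
    sorted_targets = targets[order]

    # A sorted row starts a new group when it differs from the previous row.
    is_first = np.ones(len(order), dtype=bool)
    is_first[1:] = np.any(sorted_targets[1:] != sorted_targets[:-1], axis=1)

    # Winners keep their location, everything else goes out of frame.
    resolved = np.full_like(targets, -1)
    winners = order[is_first]
    resolved[winners] = targets[winners]
    return resolved
```

In the function from the question this would stand in for the loop, roughly as `a = resolve_collisions(trans_pos_int.reshape(-1, 2), depth.ravel())`. The sort costs O(N log N) over all pixels, in contrast to the loop, which scans the full array once per collision.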

[1] https://i.stack.imgur.com/s1D9t.png
[2] https://dsp.stackexchange.com/q/69890/32876

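99.9% of the time, if you are doing image processing (which you are), you will run into edge cases that the default NumPy functions do not cover. I do not know how to vectorize your code with NumPy, but you do not have to. Take a look at Cython. It lets you write custom compiled C/C++ extensions (which is essentially what NumPy is). You can start from plain Python code and gradually add type information and disable Python-specific checks (for example, the `wraparound` and `boundscheck` directives). Disabling these checks can lead to crashes, so apply one optimization at a time and test at every step. If your code is parallelizable (it looks that way to me) and you are comfortable with multithreading, you can release the GIL (using `with nogil:`) and pass the raw arrays, offsets, and counts to your Cython function so that different threads operate on shared memory (the built-in thread pool usually works well). If you want to go down this path, let me know and I will add more details and code snippets to this answer, or tell me if you would rather stick with NumPy.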
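
As a rough idea of the "plain Python code" starting point mentioned above, the following is a minimal single-pass z-buffer loop that could later be typed and compiled with Cython. It is a sketch, not the answerer's code. The function name `zbuffer_warp` and its arguments are hypothetical, and no Cython-specific syntax is shown.

```python
import numpy as np


def zbuffer_warp(frame: np.ndarray, target_xy: np.ndarray, depth: np.ndarray):
    """Forward-warp frame using a depth buffer.

    frame:     (H, W, C) source image.
    target_xy: (H, W, 2) integer (x, y) target location of each source pixel.
    depth:     (H, W) source depths.
    Returns the warped image and a boolean mask of filled pixels.
    """
    height, width = depth.shape
    warped = np.zeros_like(frame)
    zbuf = np.full((height, width), np.inf)  # nearest depth seen so far per target
    mask = np.zeros((height, width), dtype=bool)

    for y in range(height):
        for x in range(width):
            tx, ty = target_xy[y, x]
            # Skip out-of-frame targets; otherwise write only if this pixel is nearer.
            if 0 <= tx < width and 0 <= ty < height and depth[y, x] < zbuf[ty, tx]:
                zbuf[ty, tx] = depth[y, x]
                warped[ty, tx] = frame[y, x]
                mask[ty, tx] = True
    return warped, mask
```

In pure Python this loop is slow on full HD frames. The point is only that the logic is a single pass over the pixels, which is the kind of loop that compiled code handles well.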

2 Answers


z2acfund1#

2023-10-14

oxf4rvwz2#

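I implemented this as follows. Instead of selecting the nearest point (min), I used a soft-min: I took a weighted average of all colliding points, making sure that a small difference in depth leads to a large difference in weight and that the nearest depth gets the highest weight. I implemented the sum (in the soft-min) with `np.add.at`, as suggested here.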
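
A minimal sketch of what such a soft-min accumulation with `np.add.at` might look like is given below. The answer does not state its weighting function, so the exponential weight and the `beta` parameter are assumptions made for illustration, as are the function and argument names.

```python
import numpy as np


def softmin_splat(colors, targets, depth, height, width, beta=50.0):
    """Depth-weighted average of all source pixels landing on each target.

    colors:  (N, C) source colors.
    targets: (N, 2) integer (x, y) target locations.
    depth:   (N,) source depths.
    beta:    assumed sharpness; larger values approach a hard minimum.
    """
    inside = ((targets[:, 0] >= 0) & (targets[:, 0] < width) &
              (targets[:, 1] >= 0) & (targets[:, 1] < height))
    xs, ys = targets[inside, 0], targets[inside, 1]
    d = depth[inside]

    # Assumed weighting: exponential in negative depth, shifted by the global
    # minimum so the largest weight is 1. Nearer pixels get larger weights.
    d_min = d.min() if d.size else 0.0
    weights = np.exp(-beta * (d - d_min))

    accum = np.zeros((height, width, colors.shape[1]))
    weight_sum = np.zeros((height, width))
    # np.add.at accumulates correctly when the same index appears repeatedly.
    np.add.at(accum, (ys, xs), weights[:, None] * colors[inside])
    np.add.at(weight_sum, (ys, xs), weights)

    mask = weight_sum > 0
    warped = np.zeros_like(accum)
    warped[mask] = accum[mask] / weight_sum[mask][:, None]
    return warped, mask
```

With this particular weighting, a very large `beta` combined with a wide depth range can make the weights of distant pixels underflow to zero, in which case those targets are reported as unfilled. A practical version would need to guard against that.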
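I was then able to port it to PyTorch using `torch.Tensor.index_put_`, as suggested here. Finally, I replaced rounding (nearest-neighbor interpolation) with bilinear splatting (inverse bilinear interpolation). Both the NumPy and PyTorch implementations are available here.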
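
To illustrate the bilinear splatting idea in NumPy, here is a minimal sketch that distributes each source pixel over the four integer pixels surrounding its fractional target position. It is not the answerer's implementation, and it leaves out the depth weighting to stay short. The name `bilinear_splat` and its arguments are hypothetical.

```python
import numpy as np


def bilinear_splat(colors, positions, height, width):
    """Splat colors onto a grid with bilinear weights.

    colors:    (N, C) source colors.
    positions: (N, 2) floating-point (x, y) target positions.
    Returns the normalized image and a boolean mask of pixels that
    received a non-zero weight.
    """
    x, y = positions[:, 0], positions[:, 1]
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)

    accum = np.zeros((height, width, colors.shape[1]))
    weight_sum = np.zeros((height, width))

    # Visit the four neighboring integer pixels of every position.
    for dx in (0, 1):
        for dy in (0, 1):
            xi, yi = x0 + dx, y0 + dy
            # Bilinear weight: one minus the distance along each axis.
            w = (1 - np.abs(x - xi)) * (1 - np.abs(y - yi))
            ok = (xi >= 0) & (xi < width) & (yi >= 0) & (yi < height)
            np.add.at(accum, (yi[ok], xi[ok]), w[ok][:, None] * colors[ok])
            np.add.at(weight_sum, (yi[ok], xi[ok]), w[ok])

    mask = weight_sum > 0
    warped = np.zeros_like(accum)
    warped[mask] = accum[mask] / weight_sum[mask][:, None]
    return warped, mask
```

To handle occlusions as well, the bilinear weight could be multiplied by a depth-dependent weight before accumulation, which is presumably how the two ideas in this answer combine.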

Python pose warping: how to vectorize a for loop (z-buffer)
