How to ignore implicit zeros with scipy.sparse.csr_matrix.min?

Posted by tjrkku2a on 2023-11-19 in Other


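Goal
I have a list of roughly 500,000 points in 3D space. I want to find the two coordinates with the largest first-nearest-neighbor distance.

Approach

I use scipy to compute a sparse distance matrix: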

from scipy.spatial import cKDTree

tree = cKDTree(points, 40)
spd = tree.sparse_distance_matrix(tree, 0.01)
spo = spd.tocsr()
spo.eliminate_zeros()

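I eliminated the explicit zeros to account for the diagonal elements, where the distance between each point and itself is computed.
I now want to find the coordinates of the minimum distance in each row/column, which should correspond to the first nearest neighbor of each point, something like: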

spo.argmin(axis=0)

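By finding the maximum distance among the elements of this array, I should be able to find the two elements with the largest first-nearest-neighbor distance.
Problem
The problem is that the min and argmin functions of scipy.sparse.csr_matrix also take the implicit zeros into account, which is undesirable for this application. How can I work around this? For a matrix this large, both performance and memory are concerns. Or is there a completely different approach to achieve what I want?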
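
One possible workaround for the sparse-matrix route, sketched here under assumptions rather than taken from the thread: reduce over the stored entries of each CSR row directly through `indptr`, so the implicit zeros never enter the comparison. The helper name `row_min_ignoring_implicit_zeros` and the conventions for rows without stored entries (`inf` for the minimum, `-1` for the column) are hypothetical choices made for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix


def row_min_ignoring_implicit_zeros(mat):
    """Per-row minimum and its column index over stored nonzero entries only.

    Rows without stored entries get min = inf and column = -1
    (an illustrative convention, not scipy behaviour).
    """
    mat = csr_matrix(mat, copy=True)  # copy so the caller's matrix is untouched
    mat.eliminate_zeros()             # drop explicit zeros such as the diagonal
    n_rows = mat.shape[0]
    mins = np.full(n_rows, np.inf)
    cols = np.full(n_rows, -1, dtype=np.int64)
    if mat.nnz == 0:
        return mins, cols

    counts = np.diff(mat.indptr)      # number of stored entries per row
    nonempty = counts > 0
    # Start offsets of the non-empty rows; empty rows contribute no data,
    # so consecutive offsets delimit exactly one row's stored values.
    starts = mat.indptr[:-1][nonempty]
    mins[nonempty] = np.minimum.reduceat(mat.data, starts)

    # Recover the column of the first stored entry that attains each minimum.
    rows = np.repeat(np.arange(n_rows), counts)
    positions = np.flatnonzero(mat.data == mins[rows])
    hit_rows, first = np.unique(rows[positions], return_index=True)
    cols[hit_rows] = mat.indices[positions[first]]
    return mins, cols


# Small usage example with one empty row.
example = csr_matrix(np.array([[0.0, 2.0, 0.5],
                               [0.0, 0.0, 0.0],
                               [3.0, 0.0, 1.0]]))
example_mins, example_cols = row_min_ignoring_implicit_zeros(example)
```

Taking `np.argmax` over the finite entries of the returned minima would then give the point with the largest first-nearest-neighbor distance. Note that this only sees neighbors within the radius passed to `sparse_distance_matrix`; points with no neighbor inside that radius end up as empty rows.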

scipy

Source: https://stackoverflow.com/questions/64198981/how-to-ignore-implicit-zeros-with-scipy-sparse-csr-matrix-min

2 Answers


7d7tgy0s1#

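I did not find a solution based on the distance matrix, but it seems I overlooked the most obvious one: using the tree's query method.
So, to find the largest distance between first nearest neighbors, I did the following (vectors is a numpy array of shape (N, 3)):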

import numpy as np
from scipy.spatial import cKDTree

# "vectors" (shape (N, 3)) and "leaf_size" are assumed to be defined by the caller
tree = cKDTree(vectors, leaf_size)
# get the indexes of the first nearest neighbor of each vertex
# we use k=2 because k=1 are the points themselves with distance 0
nn1 = tree.query(vectors, k=2)[1][:, 1]
# get the vectors corresponding to those indexes. Basically this is "vectors"
# reordered by the first nearest neighbor of each point in "vectors".
nn1_vec = vectors[nn1]
# the distance between each point and its first nearest neighbor
nn_dist = np.sqrt(np.sum((vectors - nn1_vec)**2, axis=1))
# maximum distance (the original snippet returned this from an enclosing function)
max_nn_dist = np.max(nn_dist)


mrphzbgm2#

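In case anyone finds this later (like me), here is a slightly simpler version. Thanks to DIN14970 for doing all the research.
It turns out that .query also returns the distances (this may not have been the case when the question was originally asked), so there is no need to compute them.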

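import numpy as np
from scipy.spatial import cKDTree

# "vectors" (shape (N, 3)) and "leaf_size" are assumed to be defined by the caller
tree = cKDTree(vectors, leaf_size)
# query returns (distances, indexes); take the distances, element [0]
# we use k=2 because k=1 are the points themselves with distance 0
nn1_distance = tree.query(vectors, k=2)[0][:, 1]
# maximum distance (the original snippet returned this from an enclosing function)
max_nn_dist = np.max(nn1_distance)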
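
Since the question asks for the two points themselves and not only the distance, the same query can also report which pair attains the maximum. The following is a minimal sketch, assuming there are no duplicate points (otherwise the second column of the result may not be a distinct neighbor); the function name `largest_nearest_neighbor_gap` and the default `leafsize=16` are illustrative choices, not part of either answer.

```python
import numpy as np
from scipy.spatial import cKDTree


def largest_nearest_neighbor_gap(points, leafsize=16):
    """Return (i, j, d): point i, its nearest neighbor j, and their distance d,
    chosen so that d is the largest first-nearest-neighbor distance."""
    points = np.asarray(points, dtype=float)
    tree = cKDTree(points, leafsize=leafsize)
    # k=2: column 0 is each point itself (distance 0), column 1 its neighbor
    dist, idx = tree.query(points, k=2)
    nn_dist = dist[:, 1]
    nn_idx = idx[:, 1]
    i = int(np.argmax(nn_dist))
    return i, int(nn_idx[i]), float(nn_dist[i])


# Usage on random data; the real input would be the ~500,000 x 3 array.
rng = np.random.default_rng(0)
sample = rng.random((1000, 3))
i, j, d = largest_nearest_neighbor_gap(sample)
pair = sample[[i, j]]  # coordinates of the two points
```

This avoids building the sparse distance matrix altogether, so memory use stays proportional to the number of points.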
