import numpy as np
from scipy import spatial
# test points
pts = np.random.rand(100_000, 2)
# the two points which are furthest apart will occur as vertices of the convex hull
candidates = pts[spatial.ConvexHull(pts).vertices]
# get distances between each pair of candidate points
dist_mat = spatial.distance_matrix(candidates, candidates)
# get indices of candidates that are furthest apart
i, j = np.unravel_index(dist_mat.argmax(), dist_mat.shape)
print(candidates[i], candidates[j])
# e.g. [ 1.11251218e-03 5.49583204e-05] [ 0.99989971 0.99924638]
# Standalone basic example with random data, simplified example
import numpy as np
from scipy.spatial import distance
# Generate a set of random points
pts = np.random.rand(100, 2)
distances = distance.cdist(pts, pts, 'euclidean')
maxarg = np.unravel_index(distances.argmax(), distances.shape)
print('Matrix indices of the two farthest points: %s' % (maxarg,))
print('Farthest point #1 (coords): %s' % pts[maxarg[0]])
print('Farthest point #2 (coords): %s' % pts[maxarg[1]])
Example output:
Matrix indices of the two farthest points: (11, 20)
Farthest point #1 (coords): [0.06505425 0.00118619]
Farthest point #2 (coords): [0.96760093 0.97164817]
Full example, including visualization
Code:
# Standalone basic example with random data, including visualization
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
from scipy.spatial import distance
# Generate a set of random points
pts = np.random.rand(100, 2)
distances = distance.cdist(pts, pts, 'euclidean')
maxarg = np.unravel_index(distances.argmax(), distances.shape)
print('Matrix indices of the two farthest points: %s' % (maxarg,))
print('Farthest point #1 (coords): %s' % pts[maxarg[0]])
print('Farthest point #2 (coords): %s' % pts[maxarg[1]])
# Check that the farthest distance is the same
print(distances.max())
print(distances[(maxarg)])
# Fixed size of the visualization canvas (a square)
plt.rcParams["figure.figsize"] = (10, 10)
fig = plt.figure()
ax = fig.add_subplot(111)
plt.scatter(pts.T[0], pts.T[1])
line = Line2D([pts[maxarg[0]][0], pts[maxarg[1]][0]],
              [pts[maxarg[0]][1], pts[maxarg[1]][1]],
              color='r')
ax.add_line(line)
plt.show()
Example output:
Matrix indices of the two farthest points: (11, 20)
Farthest point #1 (coords): [0.06505425 0.00118619]
Farthest point #2 (coords): [0.96760093 0.97164817]
1.3252875045947154
1.3252875045947154
2 Answers
bqf10yzr #1
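You can avoid computing all pairwise distances by observing that the two points which are furthest apart will occur as vertices of the convex hull. You can then compute pairwise distances among far fewer points.
For example, with 100,000 points distributed uniformly in the unit square, only 22 points are in the convex hull in my instance.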
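As a sanity check on the claim above, here is a minimal sketch that compares the hull-based result with a brute-force search over all pairs. The point count (2,000) and the random seed are assumptions chosen only to keep the brute-force step cheap and the run reproducible; they are not from the original answer.

```python
import numpy as np
from scipy import spatial
from scipy.spatial import distance

# Assumed test data: 2,000 uniform points, fixed seed for reproducibility
rng = np.random.default_rng(0)
pts = rng.random((2_000, 2))

# Brute force: largest of all N*(N-1)/2 pairwise distances
brute_max = distance.pdist(pts).max()

# Hull-based: only the hull vertices are candidates
hull = spatial.ConvexHull(pts)
candidates = pts[hull.vertices]
hull_max = distance.pdist(candidates).max()

print(len(candidates), "hull vertices out of", len(pts), "points")
print("Same maximum distance:", np.isclose(brute_max, hull_max))
```

The number of hull vertices depends on the data, so it will differ from the 22 reported above.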
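If your data is 2-dimensional, you can compute the convex hull in O(N*log(N)) time, where N is the number of points. By concentration of measure, this method deteriorates in performance for many common distributions as the number of dimensions grows.
3b6akqbq #2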
Compute the pairwise distances between all points and select the two that are farthest apart.
tl;dr: a simplified example and a full example with visualization are given in the code listings above, each followed by its example output.
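Why I am giving this answer:
1. @hilberts_drinking_problem mentioned that a simple pairwise distance measure can be used, but the code he posted uses the more complicated convex hull approach. For simple problems (up to a few hundred points), scipy's distance matrix is sufficient.
2. The previous answer did not include code for visualization, which is very important for some use cases (verifying the result), at least in my case.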
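A possible variation on the pairwise approach, not part of either answer: `scipy.spatial.distance.pdist` returns each pair's distance only once (the condensed form), which roughly halves the memory of the full `cdist` matrix. The sketch below assumes that `np.triu_indices(n, k=1)` enumerates pairs in the same order as `pdist`, which holds for the row-major upper triangle; the seed and point count are illustrative.

```python
import numpy as np
from scipy.spatial import distance

# Assumed test data: 100 uniform points, fixed seed for reproducibility
rng = np.random.default_rng(0)
pts = rng.random((100, 2))

# Condensed distances: one entry per unordered pair (i < j)
condensed = distance.pdist(pts, 'euclidean')

# Map the position in the condensed array back to point indices (i, j)
rows, cols = np.triu_indices(len(pts), k=1)
k = condensed.argmax()
i, j = rows[k], cols[k]

print('Indices of the two farthest points: (%d, %d)' % (i, j))
print('Farthest distance: %s' % condensed[k])
```

For a few hundred points the difference is negligible, so the plain `cdist` version above remains the simpler choice.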