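I am trying to use the code below in PySpark to compute, for each row of one NumPy array, the mean Euclidean distance to all rows of a second NumPy array: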
import numpy as np
from scipy.spatial import distance

def distance_matrix(array_t, array_s):
    # Number of rows in each input array
    r_count = array_t.shape[0]
    c_count = array_s.shape[0]
    score_mat = np.empty([r_count, 1])
    dist_mat = np.empty([1, c_count])
    for r in range(r_count):
        for c in range(c_count):
            dist_mat[0, c] = distance.euclidean(array_t[r, :], array_s[c, :])
        # Mean distance from row r of array_t to every row of array_s
        score_mat[r, 0] = np.mean(dist_mat)
    return score_mat

dist_mat = distance_matrix(df_T_train_test_nparray, df_S_train_test_nparray)
How can I distribute this in PySpark?
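One possible direction, offered only as a rough sketch and not as a tested PySpark solution: each output row depends on a single row of array_t plus the whole of array_s, so array_t can be split into row chunks that are processed independently while array_s is broadcast to the workers. The helper names `mean_distances_for_chunk` and `distributed_mean_distances` and the `num_chunks` parameter are hypothetical. The sketch assumes `sc` is an existing SparkContext, that array_s fits in memory on every executor, and that both arrays are 2-D with the same number of columns. The double loop is replaced by NumPy broadcasting; `scipy.spatial.distance.cdist` would be an alternative for the per-chunk step.

```python
import numpy as np


def mean_distances_for_chunk(chunk_t, array_s):
    """Mean Euclidean distance from each row of chunk_t to all rows of array_s."""
    # Pairwise differences via broadcasting:
    # shape (rows_in_chunk, rows_in_s, n_features).
    diff = chunk_t[:, None, :] - array_s[None, :, :]
    dists = np.sqrt((diff ** 2).sum(axis=2))
    # Average over the rows of array_s, giving one value per row of chunk_t.
    return dists.mean(axis=1)


def distributed_mean_distances(sc, array_t, array_s, num_chunks=8):
    """Hypothetical driver; sc is assumed to be an existing SparkContext."""
    # Ship array_s to each executor once instead of with every task.
    s_bc = sc.broadcast(array_s)
    # Split array_t by rows; each chunk becomes one element of the RDD.
    chunks = np.array_split(array_t, num_chunks)
    rdd = sc.parallelize(chunks, numSlices=num_chunks)
    parts = rdd.map(
        lambda chunk: mean_distances_for_chunk(chunk, s_bc.value)
    ).collect()
    # Reassemble into the (r_count, 1) shape of the original score_mat.
    return np.concatenate(parts).reshape(-1, 1)
```

With the arrays from the question, the call would look like `distributed_mean_distances(sc, df_T_train_test_nparray, df_S_train_test_nparray)`. The broadcasting step materializes a chunk-by-array_s-by-features array, so `num_chunks` may need to be raised for large inputs.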