Multiply a scipy sparse matrix with a 3D numpy array

Posted by tv6aics1 on 2023-03-08 in Other


I have the following matrices:

import numpy as np
import scipy.sparse as sp
a = sp.random(150, 150)
x = np.random.normal(0, 1, size=(150, 20))

I essentially want to compute the following formula:

$\sum_{ij} A_{ij} (x_i - x_j)^2$

I can compute the inner differences like this:

diff = (x[:, None, :] - x[None, :, :]) ** 2
diff.shape  # -> (150, 150, 20)

a.shape  # -> (150, 150)

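I essentially want to broadcast an element-wise multiplication between the scipy sparse matrix and each inner numpy array.
If A were dense, I could simply do: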

np.einsum("ij,ijk->k", a.toarray(), (x[:, None, :] - x[None, :, :]) ** 2)

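But A is sparse and potentially very large, so that is not an option. Of course, I could reorder the axes and iterate over the diff array with a for loop, but is there a faster way using numpy?
As @hpaulj pointed out, the current solution also builds an array of shape (150, 150, 20), which would immediately cause memory problems as well, so that solution is no good either.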
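
To put a number on the memory concern, here is a small back-of-the-envelope sketch. The helper name `dense_diff_bytes` is made up for illustration, and float64 storage is an assumption; it only counts the bytes needed to hold the full (N, N, K) difference array.

```python
import numpy as np

def dense_diff_bytes(n_rows, n_features, dtype=np.float64):
    """Bytes needed to hold the full (n_rows, n_rows, n_features) diff array."""
    return n_rows * n_rows * n_features * np.dtype(dtype).itemsize

# The toy problem from the question: about 3.6 MB, harmless.
print(dense_diff_bytes(150, 20))      # 3600000

# A hypothetical larger problem with 100,000 rows: about 1.6 TB.
print(dense_diff_bytes(100_000, 20))  # 1600000000000
```

The cost grows quadratically in the number of rows regardless of how sparse A is, which is why the dense intermediate is the real obstacle here.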

scipy

Source: https://stackoverflow.com/questions/75660016/multiply-scipy-sparse-matrix-with-a-3d-numpy-array

2 Answers


nwsw7zdq #1

import numpy as np
import scipy.sparse
from numpy.random import default_rng

rand = default_rng(seed=0)

# \sigma_k = \sum_i^N \sum_j^N A_{i,j} (x_{i,k} - x_{j,k})^2

# Dense method
N = 100
x = rand.integers(0, 10, (N, 2))
A = np.clip(rand.integers(0, 100, (N, N)) - 80, a_min=0, a_max=None)
diff = (x[:, None, :] - x[None, :, :])**2
product = np.einsum("ij,ijk->k", A, diff)

# Loop method
s_loop = [0, 0]
for i in range(N):
    for j in range(N):
        for k in range(2):
            s_loop[k] += A[i, j]*(x[i, k] - x[j, k])**2
assert np.allclose(product, s_loop)

# For any i,j, we trivially know whether A_{i,j} is zero, and highly sparse matrices have more zeros
# than nonzeros. Crucially, do not calculate (x_{i,k} - x_{j,k})^2 at all if A_{i,j} is zero.
A_i_nz, A_j_nz = A.nonzero()
diff = (x[A_i_nz, :] - x[A_j_nz, :])**2
s_semidense = A[A_i_nz, A_j_nz].dot(diff)
assert np.allclose(product, s_semidense)

# You can see where this is going:
A_sparse = scipy.sparse.coo_array(A)
diff = (x[A_sparse.row, :] - x[A_sparse.col, :])**2
s_sparse = A_sparse.data.dot(diff)
assert np.allclose(product, s_sparse)

This seems reasonably fast; the following completes in about one second:

N = 100_000_000
print(f'Big test: initialising a {N}x{N} array')
n_values = 10_000_000
A = scipy.sparse.coo_array(
    (
        rand.integers(0, 100, n_values),
        rand.integers(0, N, (2, n_values)),
    ),
    shape=(N, N),
)
x = rand.integers(0, 100, (N, 2))

print('Big test: calculating')
s = A.data.dot((x[A.row, :] - x[A.col, :])**2)

print(s)
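
As a possible variation (not part of the approach above), the square can be expanded so that only sparse row sums, column sums, and one sparse-dense product are needed, without an intermediate of shape (nnz, K): sum_ij A_ij (x_ik - x_jk)^2 = sum_i r_i x_ik^2 + sum_j c_j x_jk^2 - 2 sum_i x_ik (A x)_ik, where r and c are the row and column sums of A. The sketch below assumes a small random test problem and checks the expansion against the nonzero-indexing method; the variable names are illustrative.

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)
n, k = 150, 20
# Small random test problem (density and seed are arbitrary choices).
a = sp.random(n, n, density=0.05, format="csr", random_state=0)
x = rng.normal(0, 1, size=(n, k))

# Row sums r_i and column sums c_j of A as flat 1-D arrays.
row_sums = np.asarray(a.sum(axis=1)).ravel()
col_sums = np.asarray(a.sum(axis=0)).ravel()

x2 = x ** 2
# Cross term: sum_i x_ik * (A x)_ik, one value per feature k.
cross = np.einsum("ik,ik->k", x, a @ x)
s_expanded = row_sums @ x2 + col_sums @ x2 - 2.0 * cross

# Reference: index x by the coordinates of the nonzeros, as above.
coo = a.tocoo()
s_reference = coo.data @ (x[coo.row] - x[coo.col]) ** 2
assert np.allclose(s_expanded, s_reference)
```

One caveat: the expanded form subtracts quantities of similar magnitude, so it may lose some floating-point accuracy when the rows of x are nearly identical. The direct difference form is the safer choice in that case.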

Posted 2023-03-08

vjhs03f7 #2

Try the following code:

a = np.random.rand(150, 150)
x = np.random.normal(0, 1, size=(150, 20))
diff = (x[:, None, :] - x[None, :, :]) ** 2
diff.shape  # -> (150, 150, 20)

a.shape  # -> (150, 150)
b = np.einsum("ij,ijk->k", a, (x[:, None, :] - x[None, :, :]) ** 2)
result = b.sum()

For a sparse matrix, you can use the following code:

import scipy.sparse as sp
import numpy as np
a = sp.random(150, 150)
x = np.random.normal(0, 1, size=(150, 20))
diff = (x[:, None, :] - x[None, :, :]) ** 2
diff.shape  # -> (150, 150, 20)

a.shape  # -> (150, 150)
b = np.einsum("ij,ijk->k", a.toarray(), (x[:, None, :] - x[None, :, :]) ** 2)
result = b.sum()

where a is a COO (coordinate format) sparse matrix of shape [150, 150].
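
Note that `a.toarray()` converts the matrix to a dense array, and the broadcasted difference still has shape (150, 150, 20), so the snippet above does not address the memory concern raised in the question. A minimal sketch of one way to keep memory bounded is to walk over the stored nonzeros in chunks. The function name `weighted_sq_diff_sum` and the `chunk_size` parameter are hypothetical, chosen here for illustration.

```python
import numpy as np
import scipy.sparse as sp

def weighted_sq_diff_sum(a, x, chunk_size=1_000_000):
    """Compute sum_ij A_ij * (x_ik - x_jk)**2 for each k, using only nonzeros of A.

    At most chunk_size nonzeros are processed at a time, so the largest
    temporary array has shape (chunk_size, K).
    """
    coo = a.tocoo()
    total = np.zeros(x.shape[1])
    for start in range(0, coo.nnz, chunk_size):
        stop = start + chunk_size
        rows = coo.row[start:stop]
        cols = coo.col[start:stop]
        # Squared differences only for the (i, j) pairs stored in this chunk.
        total += coo.data[start:stop] @ (x[rows] - x[cols]) ** 2
    return total

a = sp.random(150, 150, random_state=0)
x = np.random.default_rng(0).normal(0, 1, size=(150, 20))

b = weighted_sq_diff_sum(a, x, chunk_size=50)
result = b.sum()

# Check against the dense einsum on this small example.
dense = np.einsum("ij,ijk->k", a.toarray(), (x[:, None, :] - x[None, :, :]) ** 2)
assert np.allclose(b, dense)
```

The chunk size trades peak memory for the number of Python-level loop iterations; for the small matrices in the question a single chunk is enough.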

Posted 2023-03-08
