My understanding is that numpy.correlate and numpy.corrcoef should yield the same result for aligned, normalized vectors. Two straightforward cases show otherwise:
from math import isclose as near

import numpy as np


def normalizedCrossCorrelation(a, b):
    assert len(a) == len(b)
    normalized_a = [aa / np.linalg.norm(a) for aa in a]
    normalized_b = [bb / np.linalg.norm(b) for bb in b]
    return np.correlate(normalized_a, normalized_b)[0]


def test_normalizedCrossCorrelationOfSimilarVectorsRegression0():
    v0 = [1, 2, 3, 2, 1, 0, -2, -1, 0]
    v1 = [1, 1.9, 2.8, 2, 1.1, 0, -2.2, -0.9, 0.2]
    assert near(normalizedCrossCorrelation(v0, v1), 0.9969260391224474)
    print(f"{np.corrcoef(v0, v1)=}")
    assert near(normalizedCrossCorrelation(v0, v1), np.corrcoef(v0, v1)[0, 1])


def test_normalizedCrossCorrelationOfSimilarVectorsRegression1():
    v0 = [1, 2, 3, 2, 1, 0, -2, -1, 0]
    v1 = [0.8, 1.9, 2.5, 2.1, 1.2, -0.3, -2.4, -1.4, 0.4]
    assert near(normalizedCrossCorrelation(v0, v1), 0.9809817769512982)
    print(f"{np.corrcoef(v0, v1)=}")
    assert near(normalizedCrossCorrelation(v0, v1), np.corrcoef(v0, v1)[0, 1])
Pytest output:
E assert False
E + where False = near(0.9969260391224474, 0.9963146417122921)
E + where 0.9969260391224474 = normalizedCrossCorrelation([1, 2, 3, 2, 1, 0, ...], [1, 1.9, 2.8, 2, 1.1, 0, ...])
E assert False
E + where False = near(0.9809817769512982, 0.9826738919606931)
E + where 0.9809817769512982 = normalizedCrossCorrelation([1, 2, 3, 2, 1, 0, ...], [0.8, 1.9, 2.5, 2.1, 1.2, -0.3, ...])
1 Answer
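I think your formula using np.correlate is wrong: it does not produce the correlation coefficient. Consider the first example (v0 and v1 from your first test). Computed exactly, without floating-point arithmetic, the correct answer is 59 Sqrt[5/17534], approximately 0.99631464171229218403, which agrees with the value np.corrcoef returns (0.9963146417122921 in your pytest output).

Note that np.correlate(a, b), when a and b are 1-D arrays of the same size, returns the scalar product (i.e. np.dot(a, b)) as a one-element array. The covariance can be calculated (even though this is not recommended) as E[v0 v1] - E[v0]E[v1], which equals np.cov(v0, v1, ddof=0)[0][1]. Dividing that by the two standard deviations gives the correlation, but it is simpler to use np.corrcoef or np.cov directly.

Math explanation

The formula using np.correlate is equivalent to

    E[v0 v1] / sqrt(E[v0^2] E[v1^2])

where E is the sample mean. But the correlation coefficient is calculated as

    (E[v0 v1] - E[v0]E[v1]) / sqrt((E[v0^2] - E[v0]^2) (E[v1^2] - E[v1]^2))

The two coincide when both vectors have zero mean. Here neither does (the means are 6/9 and 5.9/9), so the results differ.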
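
As a rough numerical check of the difference between the normalized dot product and the correlation coefficient, the sketch below recomputes both for the first example. The helper names `uncentered` and `pearson` are illustrative and not part of the original answer; only `np.dot`, `np.mean`, `np.cov`, `np.correlate` and `np.corrcoef` are assumed from NumPy.

```python
import numpy as np

v0 = np.array([1, 2, 3, 2, 1, 0, -2, -1, 0], dtype=float)
v1 = np.array([1, 1.9, 2.8, 2, 1.1, 0, -2.2, -0.9, 0.2])


def uncentered(a, b):
    """Normalized dot product: E[ab] / sqrt(E[a^2] E[b^2])."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


def pearson(a, b):
    """Pearson correlation built from sample means (population form, ddof=0)."""
    cov = np.mean(a * b) - np.mean(a) * np.mean(b)
    var_a = np.mean(a * a) - np.mean(a) ** 2
    var_b = np.mean(b * b) - np.mean(b) ** 2
    return cov / np.sqrt(var_a * var_b)


# For equal-length 1-D inputs, np.correlate reduces to the dot product.
assert np.isclose(np.correlate(v0, v1)[0], np.dot(v0, v1))

# E[v0 v1] - E[v0]E[v1] matches the population covariance from np.cov.
cov = np.mean(v0 * v1) - np.mean(v0) * np.mean(v1)
assert np.isclose(cov, np.cov(v0, v1, ddof=0)[0][1])

print(uncentered(v0, v1))          # the question's value, about 0.99693
print(pearson(v0, v1))             # about 0.99631
print(np.corrcoef(v0, v1)[0, 1])   # same as pearson(v0, v1)

# Centering first makes the normalized dot product equal Pearson's r.
print(uncentered(v0 - v0.mean(), v1 - v1.mean()))
```

Under these assumptions, centering is the only difference: subtracting each vector's mean before normalizing should make the question's function reproduce np.corrcoef.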