numpy: How do I compute the probability that a sample belongs to a given component of a Gaussian mixture?

wsewodh2  posted on 2022-11-10 in Other

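Given a Gaussian mixture that we generated ourselves, how can we work out which component a new sample most likely belongs to?
I understand that MATLAB seems to have functions that compute this directly. Is there an equivalent in Python? I have not found an answer so far.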

import matplotlib.pyplot as plt
import numpy as np
import random

# Bivariate example

dim = 2

# Settings

n = 500
NumberOfMixtures = 3

# Mixture weights (non-negative, sum to 1)

w = [0.5, 0.25, 0.25]

# Mean vectors and covariance matrices

MeanVectors = [ [0,0], [-5,5], [5,5] ]
CovarianceMatrices = [ [[1, 0], [0, 1]], [[1, .8], [.8, 1]], [[1, -.8], [-.8, 1]] ]

# Initialize arrays

samples = np.full((n, dim), np.nan)  # np.NaN was removed in NumPy 2.0; use np.nan
componentlist = np.full((n, 1), np.nan)

# Generate samples

for i in range(n):  # renamed from `iter` to avoid shadowing the built-in
    # Get random number to select the mixture component with probability according to mixture weights
    DrawComponent = random.choices(range(NumberOfMixtures), weights=w, k=1)[0]
    # Draw sample from selected mixture component
    DrawSample = np.random.multivariate_normal(MeanVectors[DrawComponent], CovarianceMatrices[DrawComponent], 1)
    # Store results
    componentlist[i] = DrawComponent
    samples[i, :] = DrawSample

# Report fractions

print('Fraction of mixture component 0:', np.sum(componentlist==0)/n)
print('Fraction of mixture component 1:',np.sum(componentlist==1)/n)
print('Fraction of mixture component 2:',np.sum(componentlist==2)/n)

# Visualize result

plt.plot(samples[:, 0], samples[:, 1], '.', alpha=0.5)
plt.grid()
plt.show()

h43kikqp  #1

The problem has been solved; the answer can be found at this link:

https://stackoverflow.com/questions/42971126/multivariate-gaussian-distribution-scipy
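
For reference, here is a minimal sketch of how this could look for the mixture in the question, assuming SciPy is available. It applies Bayes' rule: the probability of component k given x is proportional to w[k] times the Gaussian density of x under component k, evaluated with scipy.stats.multivariate_normal.pdf. The helper name component_posteriors and the three test points are made up for illustration and do not come from the linked answer.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Mixture parameters copied from the question
w = np.array([0.5, 0.25, 0.25])
MeanVectors = [[0, 0], [-5, 5], [5, 5]]
CovarianceMatrices = [[[1, 0], [0, 1]], [[1, .8], [.8, 1]], [[1, -.8], [-.8, 1]]]


def component_posteriors(x, weights, means, covs):
    """Hypothetical helper: P(component k | x) for each row of x, via Bayes' rule."""
    x = np.atleast_2d(x)
    # Weighted density of every sample under every component,
    # shape (n_samples, n_components)
    weighted = np.column_stack([
        wk * multivariate_normal.pdf(x, mean=m, cov=c)
        for wk, m, c in zip(weights, means, covs)
    ])
    # Normalize each row so the probabilities sum to 1
    return weighted / weighted.sum(axis=1, keepdims=True)


# Illustrative new samples (assumed values, one near each component mean)
new_samples = np.array([[0.0, 0.0], [-4.0, 4.5], [4.0, 6.0]])
posteriors = component_posteriors(new_samples, w, MeanVectors, CovarianceMatrices)
most_likely = posteriors.argmax(axis=1)  # index of the most probable component

print(posteriors)
print(most_likely)
```

Each row of posteriors holds the membership probabilities of one sample, and argmax picks the most likely component. For points very far from every component the densities can underflow to zero, in which case working with multivariate_normal.logpdf and normalizing in log space is probably safer.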
