python 使用scipy optimize curve_fit拟合阶跃位置变化的阶跃函数

nnsrf1az 于 2022-12-17 发布在 Python

关注(0)|答案(3)|浏览(329)

我尝试拟合xy数据，看起来像

x = np.linspace(-2, 2, 1000)
a = 0.5
yl = np.ones_like(x[x < a]) * -0.4 + np.random.normal(0, 0.05, x[x < a].shape[0])
yr = np.ones_like(x[x >= a]) * 0.4 + np.random.normal(0, 0.05, x[x >= a].shape[0])
y = np.concatenate((yl, yr))
plt.scatter(x, y, s=2, color='k')

我用的是赫维赛德阶跃函数的一个变体

def f(x, a, b): return 0.5 * b * (np.sign(x - a))

并配有

popt, pcov = curve_fit(f, x, y, p0=p)

其中p是某个初始猜测。对于任何p，curve_fit仅拟合B而不拟合a，例如：
popt, pcov = curve_fit(f, x, y, p0=[-1.0, 0])我们得到的popt是[-1.，0.20117665]
popt, pcov = curve_fit(f, x, y, p0=[.5, 2])我们得到pop为[.5，0.79902]
popt, pcov = curve_fit(f, x, y, p0=[1.5, -2])我们得到的概率为[1.5，0.40128229]
为什么curve_fit不拟合a？

python

来源：https://stackoverflow.com/questions/59479443/fitting-step-function-with-variation-in-the-step-location-with-scipy-optimize-cu

3条答案

按热度按时间

yacmzcpb1#

正如其他人提到的，curve_fit（以及scipy.optimize中的所有其他解算器）都能很好地优化连续变量，但不能优化离散变量。（例如，在1.e-7级别）更改参数值并查看（如果有的话）对结果的改变，并使用该变化来细化这些值，直到找到最小残差。使用np.sign的模型函数：

def f(x, a, b): return 0.5 * b * (np.sign(x - a))

a值的这种小变化根本不会改变模型或拟合结果，即，首先拟合将尝试例如a=-1.0或a=0.5的起始值，然后会尝试a=-0.999999995或a=0.500000005。这两个都会给予np.sign(x-a)相同的结果。拟合不知道需要将a修改1才能对结果产生影响。它不知道这一点。np.sign()和np.sin()只差一个字母，但在这方面的行为非常不同。
真实的数据采取一个步骤，但采样足够精细，使得该步骤不会完全在一个步骤中发生，这是非常常见的。在这种情况下，您将能够使用各种函数形式对该步骤进行建模（线性斜坡、误差函数、反正切、逻辑、JamesPhilipps给出了一个完整的答案，我可能会使用lmfit（是它的主要作者之一），并愿意猜测参数的起始值从看数据，也许：

import numpy as np

x = np.linspace(-2, 2, 1000)
a = 0.5
yl = np.ones_like(x[x < a]) * -0.4 + np.random.normal(0, 0.05, x[x < a].shape[0])
yr = np.ones_like(x[x >= a]) * 0.4 + np.random.normal(0, 0.05, x[x >= a].shape[0])
y = np.concatenate((yl, yr))

from lmfit.models import StepModel, ConstantModel

model = StepModel() + ConstantModel()
params = model.make_params(center=0, sigma=1, amplitude=1., c=-0.5)

result = model.fit(y, params, x=x)

print(result.fit_report())

import matplotlib.pyplot as plt
plt.scatter(x, y, label='data')
plt.plot(x, result.best_fit, marker='o', color='r', label='fit')
plt.show()

这将给予良好的拟合并打印出以下结果

[[Model]]
    (Model(step, form='linear') + Model(constant))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 50
    # data points      = 1000
    # variables        = 4
    chi-square         = 2.32729556
    reduced chi-square = 0.00233664
    Akaike info crit   = -6055.04839
    Bayesian info crit = -6035.41737
##  Warning: uncertainties could not be estimated:
[[Variables]]
    amplitude:  0.80013762 (init = 1)
    center:     0.50083312 (init = 0)
    sigma:      4.6009e-04 (init = 1)
    c:         -0.40006255 (init = -0.5)

请注意，它将找到阶跃的center，因为它假设存在某个有限宽度（sigma）到步长，但随后发现宽度小于x中的步长。但还要注意，它不能计算参数中的不确定性，因为如上所述，解附近center（您的a）的微小变化不会改变结果拟合。FWIW StepModel可以使用线性、误差函数、反正切或逻辑作为阶跃函数。
如果您构建的测试数据与步长之间的宽度较小，请使用类似于

from scipy.special import erf    
y = 0.638  * erf((x-0.574)/0.005)  + np.random.normal(0, 0.05, len(x))

则该拟合将能够找到最佳解并评估不确定性。
我希望这能解释为什么与模型函数的拟合不能优化a的值，以及可以对此做些什么。

赞(0）回复(0）举报 2022-12-17

pu82cl6c2#

这是一个使用数据和函数的图形化Python fitter，使用scipy的differential_evolution遗传算法模块提供curve_fit的初始参数估计值。该模块使用拉丁超立方体算法来确保对参数空间的彻底搜索，这需要在边界内进行搜索。在本例中，这些边界取自数据的最大值和最小值。

import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings

# generate data for testing
x = numpy.linspace(-2, 2, 1000)
a = 0.5
yl = numpy.ones_like(x[x < a]) * -0.4 + numpy.random.normal(0, 0.05, x[x < a].shape[0])
yr = numpy.ones_like(x[x >= a]) * 0.4 + numpy.random.normal(0, 0.05, x[x >= a].shape[0])
y = numpy.concatenate((yl, yr))

# alias data to match pervious example
xData = x
yData = y

def func(x, a, b): # variation of the Heaviside step function
    return 0.5 * b * (numpy.sign(x - a))

# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
    warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
    val = func(xData, *parameterTuple)
    return numpy.sum((yData - val) ** 2.0)

def generate_Initial_Parameters():
    # min and max used for bounds
    maxX = max(xData)
    minX = min(xData)

    parameterBounds = []
    parameterBounds.append([minX, maxX]) # search bounds for a
    parameterBounds.append([minX, maxX]) # search bounds for b

    # "seed" the numpy random number generator for repeatable results
    result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
    return result.x

# by default, differential_evolution completes by calling curve_fit() using parameter bounds
geneticParameters = generate_Initial_Parameters()

# now call curve_fit without passing bounds from the genetic algorithm,
# just in case the best fit parameters are aoutside those bounds
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Fitted parameters:', fittedParameters)
print()

modelPredictions = func(xData, *fittedParameters) 

absError = modelPredictions - yData

SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))

print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)

print()

##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)

    # first the raw data as a scatter plot
    axes.plot(xData, yData,  'D')

    # create data for the fitted equation plot
    xModel = numpy.linspace(min(xData), max(xData))
    yModel = func(xModel, *fittedParameters)

    # now the model as a line plot
    axes.plot(xModel, yModel)

    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label

    plt.show()
    plt.close('all') # clean up after using pyplot

graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)

赞(0）回复(0）举报 2022-12-17

0ejtzxu13#

或者你可以说重边可以用sigmoïd函数来近似：

$H(x) = 1 / (1 + exp(-2*k*x))$

或者你的情况

$b / (1 + exp(-2 * k * (x - a))) - b / 2$

您添加了一个参数k，但希望最后它足够大，然后删除它以找到其他两个参数。

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

x = np.linspace(-2, 2, 1000)
a = 0.5
yl = np.ones_like(x[x < a]) * -0.4 + np.random.normal(0, 0.05, x[x < a].shape[0])
yr = np.ones_like(x[x >= a]) * 0.4 + np.random.normal(0, 0.05, x[x >= a].shape[0])
y = np.concatenate((yl, yr))
plt.scatter(x, y, s=2, color='k')

# def f(x, a, b): return 0.5 * b * (np.sign(x - a))
def g(x, a, b, k): return b / (1 + np.exp(-2 * k * (x - a))) - b / 2
y_sigmoid = g(x, a, 0.8, 10)
plt.scatter(x, y_sigmoid, s=2, color='g')

popt, pcov = curve_fit(g, x, y, p0=[-1.0, 0, 1])
# popt, pcov = curve_fit(f, x, y, p0=[-1.0, 0])
print(popt)
plt.scatter(x, g(x, *popt), s=2, color='r')

如预期的那样，它给出：[5.02081214e-01****8.03257583e-013.33970547e+03]
（绿色：随机软乙状结肠，红色：曲线拟合结果）

赞(0）回复(0）举报 2022-12-17

我来回答

python 使用scipy optimize curve_fit拟合阶跃位置变化的阶跃函数

3条答案

相关问题

热门标签

最新问答