numpy: How can I do a linear regression with a fixed x-intercept in Python?

bis0qfac, posted on 2023-04-06 in Python

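I have found plenty of examples of fitting a linear regression with a zero intercept.
However, I would like to fit a linear regression with a fixed x-intercept. In other words, the regression line should start at a specific x.
I have the following code for plotting.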

import numpy as np
import matplotlib.pyplot as plt

xs = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0,
              20.0, 40.0, 60.0, 80.0])

ys = np.array([0.50505332505407008, 1.1207373784533172, 2.1981844719020001,
              3.1746209003398689, 4.2905482471260044, 6.2816226678076958,
              11.073788414382639, 23.248479770546009, 32.120462301367183,
              44.036117671229206, 54.009003143831116, 102.7077685684846,
              185.72880217806673, 256.12183145545811, 301.97120103079675])

def best_fit_slope_and_intercept(xs, ys):
    # m = xs.dot(ys)/xs.dot(xs)
    m = (((np.average(xs)*np.average(ys)) - np.average(xs*ys)) /
         ((np.average(xs)*np.average(xs)) - np.average(xs*xs)))
    b = np.average(ys) - m*np.average(xs)
    return m, b

def rSquaredValue(ys_orig, ys_line):
    def sqrdError(ys_orig, ys_line):
        return np.sum((ys_line - ys_orig) * (ys_line - ys_orig))
    yMeanLine = np.average(ys_orig)
    sqrtErrorRegr = sqrdError(ys_orig, ys_line)
    sqrtErrorYMean = sqrdError(ys_orig, yMeanLine)
    return 1 - (sqrtErrorRegr/sqrtErrorYMean)

m, b = best_fit_slope_and_intercept(xs, ys)
regression_line = m*xs+b

r_squared = rSquaredValue(ys, regression_line)
print(r_squared)

plt.plot(xs, ys, 'bo')
# Normal best fit
plt.plot(xs, m*xs+b, 'r-')
# Zero intercept
plt.plot(xs, m*xs, 'g-')
plt.show()

I want something *like* the following, where the regression line starts at (5, 0).

Thank you. Any and all help is appreciated.

laawzig2 #1

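I have been thinking about this for a while and have found a possible solution.
If I understand correctly, you want to find the slope and intercept of a linear regression model whose x-axis intercept is fixed.
If that is the case (say you want the x-axis intercept to take the value forced_intercept), it is as if you shifted every point by -forced_intercept along the x-axis and then forced scikit-learn to use a y-axis intercept of 0. That gives you the slope. To find the intercept, isolate b in y = ax + b and force the line through the point (forced_intercept, 0). Doing so gives b = -a * forced_intercept (where a is the slope). In code (note the reshaping of xs):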

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

xs = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0,
              20.0, 40.0, 60.0, 80.0]).reshape((-1,1)) # scikit-learn expects a 2D feature array; passing a 1D array raises a ValueError

ys = np.array([0.50505332505407008, 1.1207373784533172, 2.1981844719020001,
              3.1746209003398689, 4.2905482471260044, 6.2816226678076958,
              11.073788414382639, 23.248479770546009, 32.120462301367183,
              44.036117671229206, 54.009003143831116, 102.7077685684846,
              185.72880217806673, 256.12183145545811, 301.97120103079675])

forced_intercept = 5 # x-intercept, as in your example of (5,0)

new_xs = xs - forced_intercept # here we "move" all the points
model = LinearRegression(fit_intercept=False).fit(new_xs, ys) # force an intercept of 0
r = model.score(new_xs,ys) # coefficient of determination (R^2) of the constrained fit
a = model.coef_[0] # slope as a scalar; coef_ holds one entry per feature

b = -1 * a * forced_intercept # here we find the y-intercept so that the line contains (forced_intercept, 0)

print(r,a,b)
plt.plot(xs,ys,'o')
plt.plot(xs,a*xs+b)
plt.show()
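
The same constrained fit can also be written with NumPy alone. As a minimal sketch, assuming the line has the form y = a(x - x0) with x0 the fixed x-intercept, minimizing the squared error gives a = Σ(x - x0)·y / Σ(x - x0)², which is the zero-intercept formula applied to the shifted x values. The function name fit_fixed_x_intercept is made up for this illustration.

```python
import numpy as np

xs = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0,
               20.0, 40.0, 60.0, 80.0])
ys = np.array([0.50505332505407008, 1.1207373784533172, 2.1981844719020001,
               3.1746209003398689, 4.2905482471260044, 6.2816226678076958,
               11.073788414382639, 23.248479770546009, 32.120462301367183,
               44.036117671229206, 54.009003143831116, 102.7077685684846,
               185.72880217806673, 256.12183145545811, 301.97120103079675])


def fit_fixed_x_intercept(xs, ys, x0):
    """Least-squares line through (x0, 0); returns slope a and y-intercept b.

    Hypothetical helper for illustration, not part of NumPy or scikit-learn.
    """
    shifted = xs - x0                            # move the forced x-intercept to the origin
    a = shifted.dot(ys) / shifted.dot(shifted)   # zero-intercept slope on the shifted data
    b = -a * x0                                  # chosen so that a*x0 + b == 0
    return a, b


a, b = fit_fixed_x_intercept(xs, ys, 5.0)
print(a, b)
```

Here xs stays one-dimensional, so no reshaping is needed.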

I hope this is what you are looking for.

bpsygsoo #2

Maybe this approach will be useful.

import numpy as np
import matplotlib.pyplot as plt

xs = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0,
              20.0, 40.0, 60.0, 80.0])

ys = np.array([0.50505332505407008, 1.1207373784533172, 2.1981844719020001,
              3.1746209003398689, 4.2905482471260044, 6.2816226678076958,
              11.073788414382639, 23.248479770546009, 32.120462301367183,
              44.036117671229206, 54.009003143831116, 102.7077685684846,
              185.72880217806673, 256.12183145545811, 301.97120103079675])

# First we add the anchor point to the set of points.
xs = np.append(xs, [5.])
ys = np.append(ys, [0.])

# Then we prepare the coefficient matrix according to the docs
# https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.lstsq.html
A = np.vstack([xs, np.ones(len(xs))]).T

# Then we prepare weights for these points. All weights are equal
# except the last one (for the added anchor point).
# In this example its weight is 1000 times larger than the others.
W = np.diag(np.ones([len(xs)]))
W[-1,-1] = 1000.

# And we find least-squares solution.
m, c = np.linalg.lstsq(np.dot(W, A), np.dot(W, ys), rcond=None)[0]

plt.plot(xs, ys, 'o', label='Original data', markersize=10)
plt.plot(xs, m * xs + c, 'r', label='Fitted line')
plt.legend()  # display the labels set above
plt.show()
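
One way to see how the anchor weight behaves is to compare the weighted fit with the exact constrained one. The sketch below assumes the same data and the anchor point (5, 0). The names weighted_anchor_fit and exact_slope are illustrative only. Because lstsq minimizes the sum of squared residuals, scaling the anchor row by w multiplies its squared residual by w², so the fitted x-intercept, -c/m, should approach 5 as w grows.

```python
import numpy as np

xs = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0,
               20.0, 40.0, 60.0, 80.0])
ys = np.array([0.50505332505407008, 1.1207373784533172, 2.1981844719020001,
               3.1746209003398689, 4.2905482471260044, 6.2816226678076958,
               11.073788414382639, 23.248479770546009, 32.120462301367183,
               44.036117671229206, 54.009003143831116, 102.7077685684846,
               185.72880217806673, 256.12183145545811, 301.97120103079675])
X0 = 5.0  # forced x-intercept


def weighted_anchor_fit(xs, ys, x0, w):
    """Fit y = m*x + c with an extra point (x0, 0) whose row is scaled by w.

    Hypothetical helper for illustration only.
    """
    x_aug = np.append(xs, x0)
    y_aug = np.append(ys, 0.0)
    A = np.vstack([x_aug, np.ones(len(x_aug))]).T
    weights = np.ones(len(x_aug))
    weights[-1] = w
    # Scaling each row by its weight is the same as multiplying by diag(weights).
    m, c = np.linalg.lstsq(A * weights[:, None], y_aug * weights, rcond=None)[0]
    return m, c


# Exact constrained slope (line forced through (X0, 0)) for comparison.
exact_slope = (xs - X0).dot(ys) / (xs - X0).dot(xs - X0)

for w in (1.0, 10.0, 1000.0):
    m, c = weighted_anchor_fit(xs, ys, X0, w)
    print(w, m, -c / m, exact_slope)
```

The weighted approach only approximates the constraint, so if the line must pass through the anchor exactly, the shifted zero-intercept fit is the safer choice.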

z2acfund #3

If you are using scikit-learn for the linear regression task, you can define the intercept using the intercept_ attribute.

lnxxn5zx #4

from matplotlib import pyplot as plt
import numpy as np
from scipy.optimize import curve_fit

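X = np.linspace(0,10, 100)
Y = X + np.random.randn(100) + 3.5  # noisy line with a true y-intercept of 3.5
lin = lambda x, a: a * x + 3.5  # y-intercept fixed at 3.5; only the slope a is fitted
slope = curve_fit(lin, X, Y)[0][0]

plt.plot(X, Y, X, [slope * x + 3.5 for x in X])
plt.show()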
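
The model above fixes the y-intercept. As a hedged sketch of how the same curve_fit idea could be adapted to the question, the model can instead be written as a * (x - x0), which is zero at x = x0 by construction. The data is taken from the question. The value x0 = 5 and the name line_through_x0 are assumptions made for this illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

xs = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0,
               20.0, 40.0, 60.0, 80.0])
ys = np.array([0.50505332505407008, 1.1207373784533172, 2.1981844719020001,
               3.1746209003398689, 4.2905482471260044, 6.2816226678076958,
               11.073788414382639, 23.248479770546009, 32.120462301367183,
               44.036117671229206, 54.009003143831116, 102.7077685684846,
               185.72880217806673, 256.12183145545811, 301.97120103079675])
x0 = 5.0  # forced x-intercept (assumed, from the question's example point (5, 0))


def line_through_x0(x, a):
    """Line with slope a that crosses the x-axis at x0 (illustrative model)."""
    return a * (x - x0)


# Only the slope is a free parameter; curve_fit infers that from the signature.
popt, pcov = curve_fit(line_through_x0, xs, ys)
slope = popt[0]
intercept = -slope * x0  # y-intercept implied by the fixed x-intercept
print(slope, intercept)
```

Since this model is linear in a, the result should agree with the closed-form slope Σ(x - x0)·y / Σ(x - x0)²; curve_fit mainly becomes worthwhile once the model is no longer linear.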
