I'm trying to implement multivariate linear regression with gradient descent, but when I try this:
import numpy as np

# Starting values
w = np.ones(3)  # The number of features is 3
b = float(0)

def gradient_descent():
    global w
    global b
    learning_rate = 0.0001
    for i in range(x_train.shape[0]):
        prediction = np.dot(x_train[i], w) + b
        error = y_train[i] - prediction
        for j in range(w.shape[0]):
            w[j] = w[j] - (error * x_train[i][j] * learning_rate)
        b = b - (error * learning_rate)

def train():
    for i in range(10_000):
        gradient_descent()
        print(i, ':', w, b)

train()
The output is:
0 : [inf inf inf] inf
1 : [inf inf inf] inf
2 : [inf inf inf] inf
3 : [inf inf inf] inf
4 : [inf inf inf] inf
5 : [inf inf inf] inf
6 : [inf inf inf] inf
....
So what am I doing wrong? I tried lowering the learning rate, but nothing changed.
Data sample:
total_rooms,population,households,bedrooms(target)
5612.0,1015.0,472.0,1283.0
7650.0,1129.0,463.0,1901.0
720.0,333.0,117.0,174.0
1501.0,515.0,226.0,337.0
1454.0,624.0,262.0,326.0
Here total_rooms, population, and households make up x_train with shape (17000, 3), and bedrooms is y_train with shape (17000, 1).
When I tried scaling the data with sklearn.preprocessing.StandardScaler before splitting it:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
train_data = scaler.fit_transform(train_data)
x_train = train_data[:, :3]
y_train = train_data[:, -1]
I got nan instead of inf!
Note: the data works fine with sklearn.linear_model.LinearRegression.
2 Answers
dhxwm5r41#
As suggested in the comments: feature scaling is a good idea (scikit-learn includes StandardScaler, but subtracting each column's mean and dividing by its standard deviation is also quite simple to do yourself). Also: the error term looks backwards; the residual is usually prediction - true.
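A minimal sketch of that by-hand scaling, using the rows from the question's data sample as a stand-in for the full array:

import numpy as np

# Stand-in for the question's x_train (rows from its data sample)
x_train = np.array([[5612.0, 1015.0, 472.0],
                    [7650.0, 1129.0, 463.0],
                    [720.0, 333.0, 117.0]])

# Scale by hand: subtract each column's mean, divide by its std
x_scaled = (x_train - x_train.mean(axis=0)) / x_train.std(axis=0)
print(x_scaled.mean(axis=0))  # ~0 per column
print(x_scaled.std(axis=0))   # 1 per column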
zbsbpyhn2#
Without optimization or guarantees of any kind: normalizing the data and applying the gradient-descent formula correctly leads to something like the following.
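A minimal sketch of such a corrected version, assuming the question's x_train/y_train layout (the rows below are the question's data sample standing in for the full 17,000-row arrays, and the learning rate is an arbitrary choice):

import numpy as np

# Hypothetical stand-in for the (17000, 3) / (17000,) training arrays
x_train = np.array([[5612.0, 1015.0, 472.0],
                    [7650.0, 1129.0, 463.0],
                    [720.0, 333.0, 117.0],
                    [1501.0, 515.0, 226.0],
                    [1454.0, 624.0, 262.0]])
y_train = np.array([1283.0, 1901.0, 174.0, 337.0, 326.0])

# Normalize features and target to zero mean and unit variance
x_train = (x_train - x_train.mean(axis=0)) / x_train.std(axis=0)
y_train = (y_train - y_train.mean()) / y_train.std()

w = np.ones(3)
b = 0.0
learning_rate = 0.01

def gradient_descent():
    global w, b
    for i in range(x_train.shape[0]):
        prediction = np.dot(x_train[i], w) + b
        error = prediction - y_train[i]             # residual: prediction - true
        w = w - learning_rate * error * x_train[i]  # step against the gradient
        b = b - learning_rate * error

def train():
    for epoch in range(1_000):
        gradient_descent()
    print(w, b)

train()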