scipy LinAlgError: SVD did not converge in Linear Least Squares when trying polyfit

Posted by lsmepo6l on 2022-12-18 in Other


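If I try to run the script below, I get the error: LinAlgError: SVD did not converge in Linear Least Squares. I used exactly the same script on a similar dataset, and it works there. I searched my dataset for values that Python could interpret as NaN, but I could not find any.
My dataset is quite large and impossible to check by hand (but I think it is fine). I also checked the lengths of stageheight_masked and discharge_masked, and they are the same. Does anyone know why my script fails and what I can do about it?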

import numpy as np
import datetime
import matplotlib.dates
import matplotlib.pyplot as plt
from scipy import polyfit, polyval

kwargs = dict(delimiter = '\t',\
     skip_header = 0,\
     missing_values = 'NaN',\
     converters = {0:matplotlib.dates.strpdate2num('%d-%m-%Y %H:%M')},\
     dtype = float,\
     names = True,\
     )

rating_curve_Gillisstraat = np.genfromtxt('G:\Discharge_and_stageheight_Gillisstraat.txt',**kwargs)

discharge = rating_curve_Gillisstraat['discharge']   #change names of columns
stageheight = rating_curve_Gillisstraat['stage'] - 131.258

#mask NaN
discharge_masked = np.ma.masked_array(discharge,mask=np.isnan(discharge)).compressed()
stageheight_masked = np.ma.masked_array(stageheight,mask=np.isnan(discharge)).compressed()

#sort
sort_ind = np.argsort(stageheight_masked)
stageheight_masked = stageheight_masked[sort_ind]
discharge_masked = discharge_masked[sort_ind]

#regression
a1,b1,c1 = polyfit(stageheight_masked, discharge_masked, 2)
discharge_predicted = polyval([a1,b1,c1],stageheight_masked)

print 'regression coefficients'
print (a1,b1,c1)

#create upper and lower uncertainty
upper = discharge_predicted*1.15
lower = discharge_predicted*0.85

#create scatterplot

plt.scatter(stageheight,discharge,color='b',label='Rating curve')
plt.plot(stageheight_masked,discharge_predicted,'r-',label='regression line')
plt.plot(stageheight_masked,upper,'r--',label='15% error')
plt.plot(stageheight_masked,lower,'r--')
plt.axhline(y=1.6,xmin=0,xmax=1,color='black',label='measuring range')
plt.title('Rating curve Catsop')
plt.ylabel('discharge')
plt.ylim(0,2)
plt.xlabel('stageheight[m]')
plt.legend(loc='upper left', title='Legend')
plt.grid(True)
plt.show()


Source: https://stackoverflow.com/questions/35581644/linalgerror-svd-did-not-converge-in-linear-least-squares-when-trying-polyfit

5 Answers

uinbv5nw1#

I don't have your data file, but when you get this error, your data almost always contains NaN or infinity. Look for both with pd.notnull or np.isfinite.
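
A minimal sketch of that check, assuming two equal-length float arrays. The sample values and the name `finite_mask` are made up for illustration and are not from the question's dataset. It counts the non-finite entries in each array and keeps only the pairs where both values are finite before fitting.

```python
import numpy as np

# Hypothetical sample data: a NaN in one array, an inf in the other.
stageheight = np.array([0.1, 0.2, np.nan, 0.4, 0.5, 0.6])
discharge = np.array([0.11, 0.24, 0.39, np.inf, 0.75, 0.96])

# Count the entries that are NaN or +/-inf in each array.
n_bad_stage = int((~np.isfinite(stageheight)).sum())
n_bad_discharge = int((~np.isfinite(discharge)).sum())
print("non-finite values:", n_bad_stage, n_bad_discharge)

# Keep only the pairs where BOTH values are finite.
finite_mask = np.isfinite(stageheight) & np.isfinite(discharge)
x = stageheight[finite_mask]
y = discharge[finite_mask]

# With clean inputs the quadratic fit no longer sees NaN or inf.
coeffs = np.polyfit(x, y, 2)
print(coeffs)
```

Note that the question's script builds both masks from np.isnan(discharge) only, so a NaN in stageheight, or an infinity in either column, would still reach polyfit.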

2022-12-18

hi3rlvi22#

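As others have pointed out, the problem is most likely that some rows contain no numeric values for the algorithm to work with. This is an issue for most regressions.
The solution is to do something about those rows, and what to do depends on the data. Often you can replace the NaNs with 0, for example with pandas .fillna(0). Sometimes you need to interpolate the missing values, and pandas .interpolate() is probably the simplest way to do that. When the data is not a time series, you may be able to simply drop the rows that contain NaNs, for example with the pandas .dropna() method. And sometimes the problem is not NaNs but infs or other values, for which there are other solutions: https://stackoverflow.com/a/55293137/12213843
Exactly which way to go depends on the data and on how you interpret it, and domain knowledge goes a long way toward interpreting the data well.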
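
The three options can be compared side by side in a small sketch. The series below is hypothetical, and converting infinities to NaN first is an added assumption so that each method also covers the inf case mentioned above.

```python
import numpy as np
import pandas as pd

# Hypothetical series with one missing value and one infinity.
s = pd.Series([1.0, np.nan, 3.0, np.inf, 5.0])

# Treat infinities as missing so every option below handles them too.
s = s.replace([np.inf, -np.inf], np.nan)

filled = s.fillna(0)             # replace missing values with 0
interpolated = s.interpolate()   # linear interpolation between neighbours
dropped = s.dropna()             # discard the missing entries

print(filled.tolist())
print(interpolated.tolist())
print(dropped.tolist())
```

Filling with 0 pulls a fitted curve toward zero at those points, so for a regression like the one in the question, dropping or interpolating is usually the less distorting choice.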

2022-12-18

nc1teljy3#

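As ski_squaw mentioned, this error is usually caused by NaNs, but for me it appeared after a Windows update. I was using numpy version 1.16. Moving to numpy 1.19.3 solved the problem (run pip install numpy==1.19.3 --user in cmd).
This GitHub issue explains more: https://github.com/numpy/numpy/issues/16744
Numpy 1.19.3 does not work on Linux, and 1.19.4 does not work on Windows.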
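
Before pinning a version, it can help to confirm which NumPy build and operating system are actually in use. A minimal sketch using only the standard library and NumPy's own version attribute:

```python
import platform

import numpy as np

# Report the installed NumPy version and the operating system,
# to compare against the versions discussed in the GitHub issue.
print("NumPy:", np.__version__)
print("OS:", platform.system(), platform.release())
```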

2022-12-18

cpjpxq1n4#

I developed my code on Windows 8. Now that I use Windows 10, the problem appeared! It was solved as @Joris said:
pip install numpy==1.19.3

2022-12-18

yacmzcpb5#

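Fixed example:

import numpy as np

def calculating_slope(x):
    # x is assumed to be a pandas Series (it uses .replace and .dropna);
    # infinities are turned into NaN and then dropped
    x = x.replace(np.inf, np.nan).replace(-np.inf, np.nan).dropna()
    if len(x) > 1:
        # slope of a straight line fitted against the sample index
        slope = np.polyfit(range(len(x)), x, 1)[0]
    else:
        # too few points left to fit a line
        slope = 0
    return slope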

2022-12-18
