Python custom sklearn regressor: Cannot clone object ... as the constructor either does not set or modifies parameter

Posted by iqxoj9l9 on 2023-04-04 in Python

I am trying to implement my own kernel regression that is compatible with the sklearn library. My implementation is as follows:

import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, TransformerMixin, RegressorMixin
from sklearn.utils.validation import check_X_y, check_array, check_is_fitted
from sklearn.utils.multiclass import unique_labels
from sklearn.metrics import euclidean_distances
import models.kernel as ker
        
        
class MyKerReg(BaseEstimator, RegressorMixin):
    
    def __init__ (self, kernel = "gaussian", bandwidth = 1.0):
        self.kernel = ker.kernel(kernel)
        self.bandwidth = bandwidth
  
        
    def fit(self, X, y):
        
        X, y = check_X_y(X, y, accept_sparse=True, ensure_2d=False)
        self.is_fitted_ = True
        self.X_ = X
        self.y_ = y
        
        return self
        
    def predict(self, X):
        
        X = check_array(X, accept_sparse=True, ensure_2d=False)
        check_is_fitted(self, 'is_fitted_')
        
        pred = []
        for x in X:
            tmp = [x - v for v in self.X_]
            ker_values = [(1/self.bandwidth)*self.kernel(v/self.bandwidth) for v in tmp]
            
            ker_values = np.array(ker_values)
            values = np.array(self.y_)
            
            num = np.dot(ker_values.T, values)
            denom = np.sum(ker_values)
            
            pred.append(num/denom)
        return pred

When I call predict on its own, everything works fine. But when I use the object in cross_val_score, like this...

y, x = misc.data_generating_process(1000)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.2, random_state = 44)

kr = ker_reg.MyKerReg(kernel = "gaussian", bandwidth = 0.5)

print(cross_val_score(kr, x_train, y_train, scoring="neg_mean_squared_error", cv=5))

...I get the following error:

Exception has occurred: RuntimeError
Cannot clone object MyKerReg(bandwidth=0.5, kernel=<models.kernel.kernel object at 0x7fab359bc940>), as the constructor either does not set or modifies parameter kernel

During handling of the above exception, another exception occurred:

  File "/home/dragos/Projects/ML_Homework/kernel_regression/main.py", line 24, in main
    print(cross_val_score(kr, x_train, y_train, scoring="neg_mean_squared_error", cv=5))
  File "/home/dragos/Projects/ML_Homework/kernel_regression/main.py", line 85, in <module>
    main()

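Does anyone know how to solve this? I know there are similar threads on this topic, but I still can't figure it out. Thanks, everyone.
I have read the documentation and articles on this topic, and it looks like I am doing everything correctly.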


Answer #1 (hpxqektj)

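The __init__ method should store its parameters as attributes under the same names, without renaming, converting, or validating them. In your example, self.kernel = ker.kernel(kernel) is the culprit. You can move that call to the beginning of fit: keep only self.kernel = kernel in __init__, and put self.kernel_ = ker.kernel(self.kernel) in fit. predict then needs to call self.kernel_ instead of self.kernel.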
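The fix described above might look roughly like the sketch below. The asker's models.kernel module is not shown, so gaussian_kernel and the _KERNELS lookup table are hypothetical stand-ins for ker.kernel("gaussian"). The sketch also assumes dense 2-D input and lists the mixin before BaseEstimator, which is the order the scikit-learn docs recommend. Treat it as an illustration, not the asker's actual code.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.model_selection import cross_val_score
from sklearn.utils.validation import check_X_y, check_array, check_is_fitted


def gaussian_kernel(u):
    # Hypothetical stand-in for ker.kernel("gaussian"); the real module is not shown.
    u = np.atleast_1d(u)
    return np.exp(-0.5 * np.sum(u ** 2))


# Hypothetical name-to-function table, resolved in fit() rather than __init__().
_KERNELS = {"gaussian": gaussian_kernel}


class MyKerReg(RegressorMixin, BaseEstimator):
    def __init__(self, kernel="gaussian", bandwidth=1.0):
        # Store the arguments untouched: no conversion, no validation.
        self.kernel = kernel
        self.bandwidth = bandwidth

    def fit(self, X, y):
        X, y = check_X_y(X, y)
        # Derived state gets a trailing underscore and is built here.
        self.kernel_ = _KERNELS[self.kernel]
        self.X_ = X
        self.y_ = y
        return self

    def predict(self, X):
        check_is_fitted(self, "kernel_")
        X = check_array(X)
        preds = []
        for x in X:
            # Kernel weight of every training point relative to x.
            weights = np.array(
                [self.kernel_((x - v) / self.bandwidth) for v in self.X_]
            ) / self.bandwidth
            # Weighted average of the training targets.
            preds.append(np.dot(weights, self.y_) / np.sum(weights))
        return np.array(preds)


# Small synthetic check (made-up data, only to show that cloning now works).
rng = np.random.RandomState(0)
X_demo = rng.uniform(-1, 1, size=(40, 1))
y_demo = np.sin(3 * X_demo[:, 0])

est = MyKerReg(kernel="gaussian", bandwidth=0.5)
cloned = clone(est)  # no RuntimeError: kernel is still the plain string
scores = cross_val_score(est, X_demo, y_demo, scoring="neg_mean_squared_error", cv=5)
```

Because kernel stays a plain string, it can also be listed in a parameter grid if bandwidth or kernel is tuned later.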
From the developer guide:

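Every keyword argument accepted by __init__ should correspond to an attribute on the instance. Scikit-learn relies on this to find the relevant attributes to set on an estimator when doing model selection.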

[...]
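There should be no logic, not even input validation, and the parameters should not be changed. The corresponding logic should be put where the parameters are used, typically in fit.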
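To see why this rule produces the error in the question: as far as I understand it, clone rebuilds the estimator by passing get_params() back into the constructor and then checks that each parameter comes back as the very same object. A constructor that wraps its argument fails that check. A minimal sketch, where Wrapped, BadEstimator, GoodEstimator and can_clone are all hypothetical names made up for the illustration:

```python
from sklearn.base import BaseEstimator, clone


class Wrapped:
    """Hypothetical stand-in for the object returned by ker.kernel(...)."""

    def __init__(self, name):
        self.name = name


class BadEstimator(BaseEstimator):
    def __init__(self, kernel="gaussian"):
        # Modifies the parameter: the stored value is not what was passed in.
        self.kernel = Wrapped(kernel)


class GoodEstimator(BaseEstimator):
    def __init__(self, kernel="gaussian"):
        # Stores the parameter exactly as received.
        self.kernel = kernel


def can_clone(estimator):
    # clone raises RuntimeError when the constructor does not
    # hand back the parameter it was given.
    try:
        clone(estimator)
        return True
    except RuntimeError:
        return False


good_ok = can_clone(GoodEstimator(kernel="gaussian"))
bad_ok = can_clone(BadEstimator(kernel="gaussian"))
```

cross_val_score clones the estimator once per fold, which is why the error only shows up there and not when fit and predict are called directly.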
