Deploy a custom pipeline using Sagemaker SDK

Posted by 1l5u6lss on 2023-04-11 in Docker


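I've been having a hard time deploying my locally trained SKlearn model (a pipeline of custom code plus a logistic model) to a Sagemaker Endpoint. My pipeline is as follows: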

All this custom code (RecodeCategorias) does is normalize some categorical columns and, for some features, recode certain values to an "other" value:

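import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class RecodeCategorias(BaseEstimator, TransformerMixin):
    """Normalize one categorical column and recode selected values to "outro"."""

    def __init__(self, feature, categs, exclude=True):
        self.feature = feature
        self.categs = categs
        self.exclude = exclude

    def fit(self, X, y=None):
        # Stateless transformer: nothing to learn.
        return self

    def transform(self, X, y=None):
        X = X.copy()  # avoid mutating the caller's DataFrame
        # Normalize: lower-case and strip surrounding whitespace.
        X[self.feature] = X[self.feature].str.lower().str.strip()
        if self.exclude is True:
            # Recode the listed categories to "outro"; keep everything else.
            X[self.feature] = np.where(
                (X[self.feature].isin(self.categs)) & (~X[self.feature].isna()),
                "outro",
                X[self.feature],
            )
        elif self.exclude is False:
            # Keep the listed categories (and missing values); recode the rest.
            X[self.feature] = np.where(
                (X[self.feature].isin(self.categs)) | (X[self.feature].isna()),
                X[self.feature],
                "outro",
            )
        else:
            raise ValueError(
                """Please set exclude to True (to change the categs to 'others')
                or False (to keep the categs and change the remaining to 'others')"""
            )
        return X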
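To make the two modes concrete, here is a plain-Python restatement of the recoding rule applied to a list of strings. The `recode` helper and the sample values are made up for illustration and are not part of the original pipeline; `None` stands in for a missing value.

```python
def recode(values, categs, exclude=True):
    """Illustrative restatement of the rule used by RecodeCategorias.

    exclude=True  -> values listed in categs become "outro", others are kept.
    exclude=False -> values listed in categs are kept, others become "outro".
    Missing values (None here) are kept in both modes.
    """
    if not isinstance(exclude, bool):
        raise ValueError("exclude must be True or False")
    result = []
    for value in values:
        if value is None:
            result.append(None)
            continue
        value = value.lower().strip()  # same normalization as the transformer
        listed = value in categs
        # A value is recoded when its membership matches the exclude flag.
        result.append("outro" if listed == exclude else value)
    return result


sample = [" SP", "rj ", None, "MG"]
print(recode(sample, ["sp", "rj"], exclude=True))
print(recode(sample, ["sp", "rj"], exclude=False))
```

The first call recodes the listed categories and keeps the rest; the second does the opposite.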

My model data is saved in an S3 bucket as a tar.gz file containing inference.py, model.joblib and pipeline.joblib. My deploy script is:

from sagemaker.serverless import ServerlessInferenceConfig
from sagemaker.sklearn.model import SKLearnModel

# s3_bucket, prefix, model_path, role and sagemaker_session are defined earlier (not shown).
modelo = SKLearnModel(
    model_data="s3://" + s3_bucket + "/" + prefix + "/" + model_path,
    role=role,
    entry_point="inference.py",
    framework_version="1.0-1",
    py_version="py3",
    sagemaker_session=sagemaker_session,
    name="testesdk3",
    source_dir="custom_transformers",
    dependencies=["custom_transformers/recodefeat.py"],
)
try:
    r = modelo.deploy(
        endpoint_name="testesdkendpoint3",
        serverless_inference_config=ServerlessInferenceConfig(
            memory_size_in_mb=4096, max_concurrency=100
        ),
    )
    print(f"Model deployed with name: {modelo.name} and endpoint {modelo.endpoint_name}")
except Exception as e:
    print(e)

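The point is that I've tried:

Adding the class definition to a file in the root of model.tar.gz and passing it to dependencies (it should pick up the same file from the local path as well, since it is the same folder)
Adding it to a "custom_transformers" folder in the same directory as inference.py and passing it to dependencies or source_dir.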
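For checking what actually ends up at the root of model.tar.gz, a minimal sketch using only the standard library is shown here. The helper names and the placeholder file contents are assumptions for illustration, and `recodefeat.py` is used as the file holding the class definition. Listing the archive does not by itself fix the loading error; it only confirms the layout that gets uploaded.

```python
import os
import tarfile
import tempfile


def build_model_archive(src_dir, filenames, out_path):
    """Pack the given files flat at the root of a .tar.gz archive."""
    with tarfile.open(out_path, "w:gz") as tar:
        for name in filenames:
            # arcname drops the local directory so files sit at the archive root.
            tar.add(os.path.join(src_dir, name), arcname=name)
    return out_path


def list_archive(path):
    """Return the sorted member names of a .tar.gz archive."""
    with tarfile.open(path, "r:gz") as tar:
        return sorted(tar.getnames())


def demo():
    names = ["inference.py", "model.joblib", "pipeline.joblib", "recodefeat.py"]
    with tempfile.TemporaryDirectory() as tmp:
        for name in names:
            # Placeholder bytes stand in for the real script and artifacts.
            with open(os.path.join(tmp, name), "wb") as fh:
                fh.write(b"placeholder")
        archive = build_model_archive(tmp, names, os.path.join(tmp, "model.tar.gz"))
        return list_archive(archive)


print(demo())
```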

I have tried the solutions from "AWS Sagemaker SKlearn entry point allow multiple script" and https://github.com/aws/amazon-sagemaker-examples/issues/725, but none of them seem to work; I always get:

sagemaker_containers._errors.ClientError: Can't get attribute 'RecodeCategorias' on <module '__main__' from '/miniconda3/bin/gunicorn'>

How exactly should I pass my class dependencies so that the class is loaded correctly?
Thanks


Source: https://stackoverflow.com/questions/75768789/deploy-a-custom-pipeline-using-sagemaker-sdk

2 Answers


Answer 1 (xhv8bpkk)

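It's best to use Boto3 (the Python SDK) for this, since it is lower level. In your model.tar.gz you want to capture any joblib artifacts. It seems your issue is that your inference script is not reading these artifacts properly. For SKLearn there are four default handler functions you need to adhere to (the MMS model server implements these handlers). An example inference script follows: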

import joblib
import os
import json

"""
Deserialize fitted model
"""
def model_fn(model_dir):
    model = joblib.load(os.path.join(model_dir, "model.joblib"))
    return model

"""
input_fn
    request_body: The body of the request sent to the model.
    request_content_type: (string) specifies the format/variable type of the request
"""
def input_fn(request_body, request_content_type):
    if request_content_type == 'application/json':
        request_body = json.loads(request_body)
        inpVar = request_body['Input']
        return inpVar
    else:
        raise ValueError("This model only supports application/json input")

"""
predict_fn
    input_data: returned array from input_fn above
    model (sklearn model) returned model loaded from model_fn above
"""
def predict_fn(input_data, model):
    return model.predict(input_data)

"""
output_fn
    prediction: the returned value from predict_fn above
    content_type: the content type the endpoint expects to be returned. Ex: JSON, string
"""

def output_fn(prediction, content_type):
    res = int(prediction[0])
    respJSON = {'Output': res}
    return respJSON

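In particular, model_fn is where you want to load your joblib file. model_fn loads your trained artifacts, which you can then use in predict_fn. Please restructure your inference script into this format and let me know if you run into the same issue.
Blog on deploying a pre-trained sklearn model on SageMaker: https://towardsdatascience.com/deploying-a-pre-trained-sklearn-model-on-amazon-sagemaker-826a2b5ac0b6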
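Before deploying, the handler chain can be smoke-tested locally. The sketch below is a simplification under stated assumptions: `StubModel` and `simulate_request` are hypothetical helpers, the stub replaces the object that `joblib.load` would return, and `output_fn` serializes to a JSON string here, which differs slightly from the script above. It only mirrors the order in which the handlers are called (model_fn once, then input_fn, predict_fn and output_fn per request), not the real server.

```python
import json


class StubModel:
    """Hypothetical stand-in for the fitted model returned by joblib.load."""

    def predict(self, rows):
        # Dummy rule: the "prediction" is the number of features in each row.
        return [len(row) for row in rows]


def model_fn(model_dir):
    # A real script would load os.path.join(model_dir, "model.joblib") here.
    return StubModel()


def input_fn(request_body, request_content_type):
    if request_content_type == "application/json":
        return json.loads(request_body)["Input"]
    raise ValueError("This model only supports application/json input")


def predict_fn(input_data, model):
    return model.predict(input_data)


def output_fn(prediction, content_type):
    return json.dumps({"Output": int(prediction[0])})


def simulate_request(body, content_type="application/json", model_dir="/opt/ml/model"):
    """Call the four handlers in the order a request would pass through them."""
    model = model_fn(model_dir)  # in a real endpoint this happens once at startup
    data = input_fn(body, content_type)
    prediction = predict_fn(data, model)
    return output_fn(prediction, content_type)


print(simulate_request('{"Input": [[1, 2, 3]]}'))
```

Swapping the stub for the real `joblib.load` call makes the same harness surface unpickling errors on a local machine before any endpoint is created.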

2023-04-11

Answer 2 (kninwzqo)

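It turns out the problem was simply that I had created my class inside the training script instead of importing it from somewhere else. After changing the training code to import the class from a module, following the same folder hierarchy in the inference script made it work fine.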
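The reason this works is that pickle (which joblib uses under the hood) stores only the import path of a class, not its code. The sketch below reproduces the effect with the standard library alone; the module name `recodefeat_demo` and the stripped-down class are hypothetical stand-ins, not the actual project files.

```python
import importlib
import os
import pickle
import sys
import tempfile

# Hypothetical module standing in for the file that defines the transformer.
MODULE_SOURCE = '''
class RecodeCategorias:
    def __init__(self, feature):
        self.feature = feature
'''


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "recodefeat_demo.py"), "w") as fh:
            fh.write(MODULE_SOURCE)

        # "Training": the class is imported from a module, then pickled.
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        module = importlib.import_module("recodefeat_demo")
        payload = pickle.dumps(module.RecodeCategorias("cidade"))

        # The payload records the module path of the class, not its source.
        stores_module_path = b"recodefeat_demo" in payload

        # "Inference" without the module on the path: unpickling fails.
        sys.path.remove(tmp)
        del sys.modules["recodefeat_demo"]
        importlib.invalidate_caches()
        try:
            pickle.loads(payload)
            fails_without_module = False
        except ModuleNotFoundError:
            fails_without_module = True

        # "Inference" with the same module importable again: unpickling works.
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        restored = pickle.loads(payload)
        sys.path.remove(tmp)
        del sys.modules["recodefeat_demo"]
        return stores_module_path, fails_without_module, restored.feature


print(demo())
```

When the class is defined in the training script itself, the recorded path is `__main__`, and inside the endpoint `__main__` is the gunicorn process, which is exactly what the error message in the question reports.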

2023-04-11