How to serialize a Spark pipeline with MLeap in Java

rur96b6h, posted on 2021-05-27 in Spark

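Hi everyone, I am writing a servlet that converts a pipeline saved by PySpark into a serialized MLeap model. That way I can run the serialized model in production without needing any Spark dependencies.
Here is my code: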

// input data
    log.error("LOAD CSV");

    Dataset<Row> dataset = this.spark.read().format("csv").schema(getSchema()).option("header", "true")
            .option("inferSchema", "true").load("/usr/local/tomcat/csv_data/data.csv");
    dataset.show(1, false);
    log.error("CSV print schema");
    dataset.printSchema();

    log.error("LOAD PIPELINE");
    PipelineModel pipeline = PipelineModel.load("/usr/local/tomcat/models/data_transformation_pipeline");
    Dataset<Row> transformedData = pipeline.transform(dataset);

    log.error("LOAD MODEL");

    PipelineModel model = PipelineModel.load("/usr/local/tomcat/models/regression_model");
    Dataset<Row> prediction = model.transform(transformedData);

    Dataset<Row> result = prediction.select(new Column("kpi_specific").alias("expected"), new Column("prediction"));
    result.show(1, false);

    MleapContext mleapContext = new ContextBuilder().createMleapContext();
    BundleBuilder bundleBuilder = new BundleBuilder();

    bundleBuilder.save(model, new File(
            "jar:file:/usr/local/tomcat/mleap/model.zip"),
            mleapContext);

    Row row = result.first();

    String output = "kpi_specific: " + row.get(0) + " - prediction: " + row.get(1);

I get the following error:

error: incompatible types: PipelineModel cannot be converted to Transformer

How can I obtain a Transformer object starting from the loaded PipelineModel?
Thanks.
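
Editor's note: the compiler message suggests that the `save` call expects a `Transformer` type different from the Spark one that `PipelineModel` extends, so the two types only share a simple name. The sketch below reproduces that situation with hypothetical stand-in classes (`SparkLike` and `RuntimeLike` are invented for illustration and are not the real Spark or MLeap classes); it is a minimal model of the type relationship, not of either library's API.

```java
class TransformerTypeMismatch {

    // Hypothetical stand-in for the Spark-side hierarchy (not the real classes).
    public static class SparkLike {
        public static abstract class Transformer { }
        public static class PipelineModel extends Transformer { }
    }

    // Hypothetical stand-in for a runtime library with its own, unrelated Transformer.
    public static class RuntimeLike {
        public interface Transformer { }

        public static class BundleBuilder {
            // Accepts only RuntimeLike.Transformer, mirroring the shape of the failing call.
            public static String save(Transformer transformer) {
                return "saved";
            }
        }
    }

    // True only if a value of argumentType could be passed to RuntimeLike.BundleBuilder.save.
    public static boolean isAccepted(Class<?> argumentType) {
        return RuntimeLike.Transformer.class.isAssignableFrom(argumentType);
    }

    public static void main(String[] args) {
        // The next line would not compile, for the same reason as in the question:
        // RuntimeLike.BundleBuilder.save(new SparkLike.PipelineModel());

        // The stand-in PipelineModel is a SparkLike.Transformer ...
        System.out.println(
            SparkLike.Transformer.class.isAssignableFrom(SparkLike.PipelineModel.class)); // true
        // ... but not a RuntimeLike.Transformer, despite the shared simple name.
        System.out.println(isAccepted(SparkLike.PipelineModel.class)); // false
    }
}
```

If this reading is right, no cast will turn the loaded `PipelineModel` into the type `save` expects; the model has to be exported through a Spark-aware serializer instead. The `mleap-spark` module ships a `ml.combust.mleap.spark.SimpleSparkSerializer` class that appears to expose `serializeToBundle(transformer, path, dataset)` for Spark transformers; treat that as an assumption and check it against the MLeap version in use.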
