Is there any reason Ludwig does not support an automl subcommand?

ercv8c1e  posted 5 months ago  in Other

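Is your feature request related to a problem? Please describe.

I could only find the init_config subcommand, which is equivalent to ludwig.automl.create_auto_config. I would like a command that launches an AutoML job directly, for example automl, which should be equivalent to ludwig.automl.auto_train.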

root@22b9afe42cc3:/data# ludwig --help
NumExpr defaulting to 4 threads.
usage: ludwig <command> [<args>]

Available sub-commands:
   train                 Trains a model
   predict               Predicts using a pretrained model
   evaluate              Evaluate a pretrained model's performance
   experiment            Runs a full experiment training a model and evaluating it
   hyperopt              Perform hyperparameter optimization
   serve                 Serves a pretrained model
   visualize             Visualizes experimental results
   collect_summary       Prints names of weights and layers activations to use with other collect commands
   collect_weights       Collects tensors containing a pretrained model weights
   collect_activations   Collects tensors for each datapoint using a pretrained model
   datasets              Downloads and lists Ludwig-ready datasets
   export_torchscript    Exports Ludwig models to Torchscript
   export_triton         Exports Ludwig models to Triton
   export_neuropod       Exports Ludwig models to Neuropod
   export_mlflow         Exports Ludwig models to MLflow
   preprocess            Preprocess data and saves it into HDF5 and JSON format
   synthesize_dataset    Creates synthetic data for testing purposes
   init_config           Initialize a user config from a dataset and targets
   render_config         Renders the fully populated config with all defaults set

ludwig cli runner

positional arguments:
  command     Subcommand to run

optional arguments:
  -h, --help  show this help message and exit
root@22b9afe42cc3:/data#

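Describe the use case

As a user, I would like the CLI to support AutoML natively so that I can trigger jobs quickly. Right now I have to load the dataset myself and write a small program to launch the job, as shown below.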

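import logging
import pprint

# load_util is a helper module that sits next to this script; it is not part of the ludwig package.
from load_util import load_mushroom_edibility
from ludwig.automl import auto_train

# Load the dataset as a DataFrame.
mushroom_edibility_df = load_mushroom_edibility()

# Launch the AutoML job: predict the 'class' column with a 2-hour time budget.
auto_train_results = auto_train(
    dataset=mushroom_edibility_df,
    target='class',
    time_limit_s=7200,
    tune_for_memory=False
)

pprint.pprint(auto_train_results)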

Describe the solution you'd like

ludwig automl --dataset xxx.csv --target "class" --time_limit_s=7200 --hyperopt=true --tune_for_memory=True
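
Until such a subcommand exists, the proposed command line can be approximated with a small wrapper script. The following is only a minimal sketch under stated assumptions: the script name automl_cli.py, the helper str_to_bool, and the flag handling are hypothetical and are not part of Ludwig's CLI. It assumes the dataset is a CSV file readable by pandas and passes through only the auto_train arguments used in the program above. The --hyperopt flag from the proposal is left out because the program above does not show a matching auto_train argument.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Parse 'true'/'false' style flag values, as used in the proposed command."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical wrapper: these flags mirror the proposal, not an existing Ludwig subcommand.
    parser = argparse.ArgumentParser(prog="automl_cli.py")
    parser.add_argument("--dataset", required=True, help="path to a CSV file")
    parser.add_argument("--target", required=True, help="name of the target column")
    parser.add_argument("--time_limit_s", type=int, default=7200)
    parser.add_argument("--tune_for_memory", type=str_to_bool, default=False)
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)

    # Imported lazily so that argument parsing and --help work without loading Ludwig.
    import pandas as pd
    from ludwig.automl import auto_train

    results = auto_train(
        dataset=pd.read_csv(args.dataset),
        target=args.target,
        time_limit_s=args.time_limit_s,
        tune_for_memory=args.tune_for_memory,
    )
    print(results)
```

With a call to main() added at the bottom of the file, this could be run as: python automl_cli.py --dataset xxx.csv --target "class" --time_limit_s=7200 --tune_for_memory=True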

Describe alternatives you've considered

Additional context

velaa5lx  #1

Thanks for raising this, @Jeffwan. This should be relatively easy to implement, so we'll see whether we can add it in v0.6.

col17t5w  #2

I think this is a great idea. I also filed #1934 requesting the same thing. Including it in 0.6 SGTM.
