Docker: TimeEval algorithm "subsequence_lof" raises an IndexError; the score length is always off by 99 on every dataset

7lrncoxx  posted on 2022-11-22  in Docker

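I am using the TimeEval evaluation tool to evaluate time series anomaly detection algorithms. I need to use the subsequence_lof algorithm, but it always fails with an IndexError.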

Evaluating:   0%|          | 0/1 [00:00<?, ?it/s]Exception occurred during the evaluation of subsequence_lof on the dataset Dataset(datasetId=('CalIt2', 'CalIt2-traffic'), dataset_type='real', training_type=<TrainingType.UNSUPERVISED: 'unsupervised'>, length=5040, dimensions=2, contamination=0.0408730158730158, min_anomaly_length=2, median_anomaly_length=7, max_anomaly_length=19, period_size=48.0, num_anomalies=29).
Traceback (most recent call last):
  File "/Users/.../PycharmProjects/AIS_project/venv/lib/python3.7/site-packages/timeeval/timeeval.py", line 326, in _run
    result = exp.evaluate()
  File "/Users/.../PycharmProjects/AIS_project/venv/lib/python3.7/site-packages/timeeval/core/experiments.py", line 129, in evaluate
    raise last_exception
  File "/Users/.../PycharmProjects/AIS_project/venv/lib/python3.7/site-packages/timeeval/core/experiments.py", line 113, in evaluate
    score = metric(y_true, y_scores)
  File "/Users/.../PycharmProjects/AIS_project/venv/lib/python3.7/site-packages/timeeval/metrics/metric.py", line 42, in __call__
    y_true, y_score = self._validate_scores(y_true, y_score, **kwargs)
  File "/Users/.../PycharmProjects/AIS_project/venv/lib/python3.7/site-packages/timeeval/metrics/metric.py", line 93, in _validate_scores
    y_score[penalize_mask] = (~np.array(y_true[penalize_mask], dtype=bool)).astype(np.int_)
IndexError: boolean index did not match indexed array along dimension 0; dimension is 5040 but corresponding boolean dimension is 4941
Evaluating: 100%|██████████| 1/1 [00:05<00:00,  5.26s/it]

The two lengths always differ by 99 (here 5040 vs. 4941), no matter which dataset I use.
My code:

from pathlib import Path

from timeeval import TimeEval, DefaultMetrics, Algorithm, TrainingType, InputDimensionality, DatasetManager, \
    MultiDatasetManager, ResourceConstraints
from timeeval.adapters import DockerAdapter
from timeeval.params import FixedParameters

from timeeval.resource_constraints import GB

dm = MultiDatasetManager([
    Path("timeeval-datasets")  # e.g. ./timeeval-datasets
    # you can add multiple folders with an index-File to the MultiDatasetManager
])
# A DatasetManager reads the index-File and allows you to access dataset metadata,
# the datasets itself, or provides utilities to filter datasets by their metadata.
# - select ALL available datasets
# datasets = dm.select()
# - select datasets from the CalIt2 collection
datasets = dm.select(collection="CalIt2")

algorithms = [
    Algorithm(
        name="subsequence_lof",
        main=DockerAdapter(
            image_name="registry.gitlab.hpi.de/akita/i/subsequence_lof",
            tag="latest",  # usually you can use the default here
            skip_pull=True
        ),
        # param_config=FixedParameters({"n_neighbors": 20, "random_state": 42}),
        data_as_file=True,
        training_type=TrainingType.UNSUPERVISED,
        input_dimensionality=InputDimensionality.MULTIVARIATE
    ),
]

rcs = ResourceConstraints(
    task_memory_limit=4 * GB,
    task_cpu_limit=1.0,
)

timeeval = TimeEval(dm, datasets, algorithms, resource_constraints=rcs, metrics=DefaultMetrics.default_list())
timeeval.run()
results = timeeval.get_results(aggregated=False)
print(results)

Thanks to everyone who can help me =)

2w2cym1i  #1

I just needed to add this post-processing function:

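import numpy as np
from timeeval.utils.window import ReverseWindowing

# Map the per-window scores back to one score per data point.
def post_sLOF(scores: np.ndarray, args: dict) -> np.ndarray:
    window_size = args.get("hyper_params", {}).get("window_size", 100)
    return ReverseWindowing(window_size=window_size).fit_transform(scores)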

and change the algorithm definition as follows:

algorithms = [
    Algorithm(
        name="subsequence_lof",
        main=DockerAdapter(
            image_name="registry.gitlab.hpi.de/akita/i/subsequence_lof",
            tag="latest",  # usually you can use the default here
            skip_pull=True
        ),
        postprocess=post_sLOF,  # <-- ADD THIS LINE
        data_as_file=True,
        training_type=TrainingType.UNSUPERVISED,
        input_dimensionality=InputDimensionality.MULTIVARIATE
    ),
]

It is documented in the README, my bad:
https://github.com/HPI-Information-Systems/TimeEval-algorithms/blob/main/subsequence_lof/README.md
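
For anyone wondering where the 99 comes from: the algorithm scores sliding windows rather than single points, so with the default window_size of 100 used in post_sLOF above, a series of 5040 points yields 5040 - 100 + 1 = 4941 scores, which is exactly the mismatch in the traceback. The sketch below illustrates the idea with plain NumPy. The function name reverse_windowing_mean and the choice of averaging over overlapping windows are my own assumptions for illustration; this is not TimeEval's ReverseWindowing implementation.

```python
import numpy as np


def reverse_windowing_mean(window_scores: np.ndarray, window_size: int) -> np.ndarray:
    """Map one score per sliding window back to one score per data point.

    Window i covers points i .. i + window_size - 1. Each point receives the
    mean score of all windows that cover it (hypothetical reduction choice).
    """
    n_windows = len(window_scores)
    n_points = n_windows + window_size - 1
    sums = np.zeros(n_points)
    counts = np.zeros(n_points)
    for i, score in enumerate(window_scores):
        sums[i:i + window_size] += score
        counts[i:i + window_size] += 1
    # Every point is covered by at least one window, so counts is never zero.
    return sums / counts


n_points, window_size = 5040, 100           # values from the traceback / post_sLOF default
n_windows = n_points - window_size + 1      # number of sliding windows
window_scores = np.random.default_rng(0).random(n_windows)  # stand-in scores
point_scores = reverse_windowing_mean(window_scores, window_size)
print(n_windows, len(point_scores))         # prints: 4941 5040
```

After this step the score array has the same length as the labels, so the boolean mask in the metric code matches again.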
