How do I create a timeseries sliding window tensorflow dataset where some features have different batch sizes than others?

Posted by ogq8wdun on 2022-11-25 in Other


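Currently, I can create a batched time-series sliding-window dataset containing ordered "feature sets" such as "inputs", "targets", "baselines", and so on. Originally I developed the model and dataset so that the targets had the same batch size as all the other inputs. That has proven detrimental to tuning the input batch size, and it is not helpful when it comes time to run this on live data, where I only care about producing a single sample output of shape (1, horizon, targets), or perhaps just (horizon, targets), given an input dataset of shape (samples, horizon, features).
As an overview, I want to take N historical samples of horizon-length features at time T, run them through the model, and output a single sample of horizon-length targets; repeat until the entire dataset has been run through.
Assuming a Pandas DataFrame of length Z, all resulting datasets should have a length of Z - horizon. The "targets" dataset should have a batch size of 1, and the "inputs" dataset should have a batch size of batch_size.
Below is a stripped-down snippet of what I currently use to generate a uniform batch size for all feature sets: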

import tensorflow as tf
import pandas as pd

horizon = 5
batch_size = 10
columns = {
    "inputs": ["input_1", "input_2"],
    "targets": ["target_1"],
}
batch_options = {
    "drop_remainder": True,
    "deterministic": True,
}

d = range(100)
df = pd.DataFrame(data={'input_1': d, 'input_2': d, 'target_1': d})

slices = tuple(df[x].astype("float32") for x in columns.values())
data = (
    tf.data.Dataset.from_tensor_slices(slices)
    .window(horizon, shift=1, drop_remainder=True)
    .flat_map(
        lambda *c: tf.data.Dataset.zip(
            tuple(
                col.batch(horizon, **batch_options)
                for col in c
            )
        )
    )
    .batch(
        batch_size,
        **batch_options,
    )
)
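
As a rough check on the lengths this pipeline produces, the window count can be worked out by hand. The sketch below is plain Python and does not call TensorFlow; the helper name count_full_windows is hypothetical, and it assumes only full windows are kept, which is what drop_remainder=True requests. Under that assumption, a frame of Z rows with shift=1 yields Z - horizon + 1 windows rather than Z - horizon.

```python
def count_full_windows(num_rows: int, window: int, shift: int = 1) -> int:
    """Count complete windows of `window` rows whose starts are `shift` rows apart.

    Hypothetical helper for illustration; partial windows at the end are not counted.
    """
    if num_rows < window:
        return 0
    return (num_rows - window) // shift + 1


Z = 100          # rows in the DataFrame from the snippet above
horizon = 5      # rows per window
batch_size = 10  # windows per batch

num_windows = count_full_windows(Z, horizon)
# With drop_remainder=True on the outer batch, only full batches survive.
num_full_batches = num_windows // batch_size

print(num_windows, num_full_batches)
```

With these numbers the outer batch keeps 90 of the 96 windows, so the last few rows never reach the model when every feature set shares one batch size.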

tensorflow

Source: https://stackoverflow.com/questions/74552302/how-do-i-create-a-timeseries-sliding-window-tensorflow-dataset-where-some-featur

1 Answer


#1 bvjveswy

We can create two sliding-window datasets and zip them.

# Select both input columns and the target column as NumPy arrays.
inputs = df[['input_1', 'input_2']].to_numpy()
labels = df['target_1'].to_numpy()

window_size = 10
stride = 1
# Input windows span window_size rows; label windows span a single row.
data1 = tf.data.Dataset.from_tensor_slices(inputs).window(window_size, shift=stride, drop_remainder=True).flat_map(lambda x: x.batch(window_size))
data2 = tf.data.Dataset.from_tensor_slices(labels).window(1, shift=stride, drop_remainder=True).flat_map(lambda x: x.batch(1))
# zip pairs elements by position and stops at the shorter dataset.
data = tf.data.Dataset.zip((data1, data2))
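
The same zip idea can be stretched toward the shapes the question asks for: inputs of shape (batch_size, horizon, features) paired with targets of shape (1, horizon, targets). The following is only a sketch under stated assumptions, not necessarily what either poster intended. The helper sliding_windows is hypothetical, the data is the toy range(100) frame rebuilt with NumPy, and the alignment (each target window matches the most recent input window in its group) is an assumption the question does not spell out.

```python
import numpy as np
import tensorflow as tf

horizon = 5      # rows per window
batch_size = 10  # input windows grouped into one element

values = np.arange(100, dtype="float32")
inputs = np.stack([values, values], axis=1)  # shape (100, 2)
targets = values[:, None]                    # shape (100, 1)


def sliding_windows(array, size):
    """Hypothetical helper: dense sliding windows of `size` rows, one row apart."""
    return (
        tf.data.Dataset.from_tensor_slices(array)
        .window(size, shift=1, drop_remainder=True)
        .flat_map(lambda w: w.batch(size, drop_remainder=True))
    )


# Inputs: a second sliding window over the windows themselves, so every
# element holds batch_size consecutive windows.
input_ds = (
    sliding_windows(inputs, horizon)
    .window(batch_size, shift=1, drop_remainder=True)
    .flat_map(lambda w: w.batch(batch_size, drop_remainder=True))
)

# Targets: skip the first batch_size - 1 windows so each target lines up with
# the last input window of its group (an assumed alignment), then batch by 1.
target_ds = sliding_windows(targets, horizon).skip(batch_size - 1).batch(1)

data = tf.data.Dataset.zip((input_ds, target_ds))

x, y = next(iter(data))
print(x.shape, y.shape)
```

Shifting the targets further ahead, for example to forecast the rows after the last input window, would only change the argument to skip; zip would then stop at the shorter of the two datasets.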

Answered 2022-11-25
