Problem using the Hugging Face imagenet-1k dataset in Keras / TensorFlow

Posted by yv5phkfx on 2023-05-29 in Other


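I'm having trouble using the Hugging Face imagenet-1k dataset with a Keras model. I'm only experimenting with simple models, but I can't get the dataset to work with the model's fit function.
Here is how I load the dataset: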

from datasets import load_dataset

ds = load_dataset('imagenet-1k')  # loads a DatasetDict
ds_train = ds['train']  # get a Dataset
ds_train.set_format(type='tensorflow', columns=['image'])  # convert to tf tensor
ds_val = ds['validation']  # get a Dataset
ds_val.set_format(type='tensorflow', columns=['image'])  # convert to tf tensor

Here is the fit call:

# train the autoencoder
autoencoder.fit(ds_train, ds_train,
                epochs=10,
                shuffle=True,
                validation_data=(ds_val, ds_val))

I get the following error:

ValueError: Failed to find data adapter that can handle input: <class 'datasets.arrow_dataset.Dataset'>, <class 'datasets.arrow_dataset.Dataset'>

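When I inspect a single element of the dataset, it looks like a tf.Tensor, so I don't understand why it can't be passed in directly. None of the examples or documentation I could find make it clear how to do this. The Hugging Face examples for images produce the same format I'm getting, but apparently I'm missing a step before it can be used with model.fit().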

keras

Source: https://stackoverflow.com/questions/76040030/problem-using-huggingface-imagenet-1k-dataset-in-keras-tensorflow

1 Answer


iezvtpos1#

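You need to convert the dataset into a format the model can read. One option is a TensorFlow dataset (tf.data.Dataset) in which each example is a tuple (image, label).
You can convert a Hugging Face dataset to a TensorFlow dataset with the to_tf_dataset() method, like this:
ds_train = ds["train"].to_tf_dataset(batch_size=BATCH_SIZE)
This returns a dataset in which each example is a dictionary {"image": tf.Tensor(image), "label": tf.Tensor(label)}. You can then map another function over the dataset: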

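import tensorflow as tf

def format_dataset(example):
    # Unpack the dictionary into the (input, target) tuple that Keras expects.
    image = example["image"]
    label = example["label"]
    return (image, label)

# Apply the conversion to every example, in parallel.
ds_train = ds_train.map(format_dataset, num_parallel_calls=tf.data.AUTOTUNE)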

This way, each example is a tuple (image, label), where image and label are both tf.Tensor objects.
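
As a rough sketch of the mapping step, the following uses small random tensors as a stand-in for the dictionaries produced by to_tf_dataset(), so it runs without downloading imagenet-1k. The shapes, the batch size, and the helper names to_supervised and to_autoencoder are illustrative assumptions, not part of the original answer. Because the question trains an autoencoder, the second helper returns (image, image), so that the input also serves as the reconstruction target.

```python
import tensorflow as tf

# Synthetic stand-in for the output of to_tf_dataset(): batches of dicts
# with "image" and "label" keys. Shapes and sizes are illustrative only.
images = tf.random.uniform((8, 32, 32, 3))   # 8 fake RGB images in [0, 1)
labels = tf.range(8, dtype=tf.int64)         # 8 fake class ids
dict_ds = tf.data.Dataset.from_tensor_slices(
    {"image": images, "label": labels}
).batch(4)


def to_supervised(example):
    """Dict -> (image, label) tuple, mirroring format_dataset in the answer."""
    return example["image"], example["label"]


def to_autoencoder(example):
    """Dict -> (image, image): the input doubles as the target."""
    return example["image"], example["image"]


supervised_ds = dict_ds.map(to_supervised, num_parallel_calls=tf.data.AUTOTUNE)
autoencoder_ds = dict_ds.map(to_autoencoder, num_parallel_calls=tf.data.AUTOTUNE)

# Hypothetical call, assuming `autoencoder` is the compiled model from the
# question and `val_ds` is a validation dataset built the same way:
# autoencoder.fit(autoencoder_ds, epochs=10, validation_data=val_ds)
```

When x is a tf.data.Dataset, Keras reads the targets from the dataset itself, so fit() is called with the dataset alone and no separate y argument. The shuffle argument of fit() is ignored for dataset inputs, so any shuffling has to be done on the dataset, for example with its shuffle() method.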

Answered 2023-05-29
