PyTorch ValueError: Invalid dataset identifier (invalid dataset identifier)

jhkqcmku  posted on 2023-08-05  in  Other

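I am trying to load a .hdf5 dataset of shape (1000,) with the code below, but I get the error ValueError: Invalid dataset identifier (invalid dataset identifier). The error appears when I try to load the dataset into a PyTorch DataLoader.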

import h5py

with h5py.File(dataset_path, 'r') as f:
    data = f['default']
    print(data.shape)

Output:
(1000,)

import h5py
from torch.utils.data import Dataset, DataLoader

# Define the dataset
class MyDataset(Dataset):
    def __init__(self, dataset_path):
        super().__init__()
        with h5py.File(dataset_path, 'r') as f:
            self.data = f['default']

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]

# Load the dataset
dataset_path = 'dataset.hdf5'

train_dataset = MyDataset(dataset_path)
train_loader = DataLoader(train_dataset, shuffle=True)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-80-c3c741b81eff> in <module>
     22 
     23 train_dataset = MyDataset(dataset_path)
---> 24 train_loader = DataLoader(train_dataset, shuffle=True)

6 frames
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

/usr/local/lib/python3.9/dist-packages/h5py/_hl/dataset.py in shape(self)
    472 
    473         with phil:
--> 474             shape = self.id.shape
    475 
    476         # If the file is read-only, cache the shape to speed-up future uses.

h5py/h5d.pyx in h5py.h5d.DatasetID.shape.__get__()

h5py/h5d.pyx in h5py.h5d.DatasetID.shape.__get__()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5d.pyx in h5py.h5d.DatasetID.get_space()

ValueError: Invalid dataset identifier (invalid dataset identifier)

I get the same error when I call f.get("default").

osh3o9ms #1

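The problem is that opening the HDF file with a with block causes it to be closed as soon as the constructor returns. self.data then refers to a dataset inside a closed file, so any later access, such as len(self.data) or self.data[idx], raises the error. The h5py module is designed around the idea that the file stays open, so that data can be read (or written) lazily as needed instead of up front.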
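One way to apply this is sketched below. It is a minimal illustration rather than a definitive fix: the dataset keeps its own file handle and opens it on first access instead of inside a with block in the constructor. The key parameter, the _dataset and close helpers, and the temporary stand-in file are assumptions added for the example and are not part of the original question.

```python
import os
import tempfile

import h5py
import numpy as np
from torch.utils.data import Dataset, DataLoader


class MyDataset(Dataset):
    """Map-style dataset that keeps the HDF5 file open for lazy reads."""

    def __init__(self, dataset_path, key='default'):
        super().__init__()
        self.dataset_path = dataset_path
        self.key = key  # hypothetical parameter; the question uses 'default'
        self._file = None  # opened on first access, not in the constructor
        # Only the length is read up front; this handle is closed right away.
        with h5py.File(dataset_path, 'r') as f:
            self._length = len(f[key])

    def _dataset(self):
        # Open lazily and keep the handle, so the dataset stays valid.
        if self._file is None:
            self._file = h5py.File(self.dataset_path, 'r')
        return self._file[self.key]

    def __len__(self):
        return self._length

    def __getitem__(self, idx):
        # Indexing an h5py dataset reads only the requested element.
        return self._dataset()[idx]

    def close(self):
        # Release the file handle once the dataset is no longer needed.
        if self._file is not None:
            self._file.close()
            self._file = None


# Build a small stand-in file so the sketch runs on its own.
tmp_dir = tempfile.mkdtemp()
dataset_path = os.path.join(tmp_dir, 'dataset.hdf5')
with h5py.File(dataset_path, 'w') as f:
    f.create_dataset('default', data=np.arange(1000, dtype=np.float32))

train_dataset = MyDataset(dataset_path)
train_loader = DataLoader(train_dataset, shuffle=True)
first_batch = next(iter(train_loader))
print(len(train_dataset), first_batch.shape)
train_dataset.close()
```

If the data fits in memory, a simpler alternative is to copy it out while the file is still open, for example self.data = f['default'][:] inside the with block, which yields a NumPy array that does not depend on the file afterwards.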
