I'm trying to load a dataset.hdf5 file of shape (1000,) with the code below, but I'm getting the error ValueError: Invalid location identifier (invalid location identifier). The error pops up when I try to load the dataset into a PyTorch DataLoader.
with h5py.File(dataset_path, 'r') as f:
    data = f['default']
    print(data.shape)

Output:
(1000,)
import h5py
from torch.utils.data import Dataset, DataLoader

# Define the dataset
class MyDataset(Dataset):
    def __init__(self, dataset_path):
        super().__init__()
        with h5py.File(dataset_path, 'r') as f:
            self.data = f['default']

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]

# Load the dataset
dataset_path = 'dataset.hdf5'
train_dataset = MyDataset(dataset_path)
train_loader = DataLoader(train_dataset, shuffle=True)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-80-c3c741b81eff> in <module>
22
23 train_dataset = MyDataset(dataset_path)
---> 24 train_loader = DataLoader(train_dataset, shuffle=True)
6 frames
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
/usr/local/lib/python3.9/dist-packages/h5py/_hl/dataset.py in shape(self)
472
473 with phil:
--> 474 shape = self.id.shape
475
476 # If the file is read-only, cache the shape to speed-up future uses.
h5py/h5d.pyx in h5py.h5d.DatasetID.shape.__get__()
h5py/h5d.pyx in h5py.h5d.DatasetID.shape.__get__()
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/h5d.pyx in h5py.h5d.DatasetID.get_space()
ValueError: Invalid dataset identifier (invalid dataset identifier)
I get the same error when I run f.get("default").
1 Answer
The problem is that reading the HDF5 file inside a with block causes it to be closed as soon as the constructor returns. The h5py module is designed around the file staying open, so that data can be read (or written) lazily as needed rather than being loaded up front.
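A minimal sketch of one way to apply this, assuming the dataset key is 'default' as in the question and that the default num_workers=0 is used: open the file in __init__ without a with block and keep the handle alive for the lifetime of the dataset.

import h5py
from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    def __init__(self, dataset_path):
        super().__init__()
        # Keep the file open instead of closing it when __init__ returns.
        self.file = h5py.File(dataset_path, 'r')
        self.data = self.file['default']  # 'default' is the key used in the question

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # h5py reads the requested element lazily from disk here.
        return self.data[idx]

    def close(self):
        # Optional: close the file explicitly once the dataset is no longer needed.
        self.file.close()

dataset_path = 'dataset.hdf5'
train_dataset = MyDataset(dataset_path)
train_loader = DataLoader(train_dataset, shuffle=True)

Note that if num_workers > 0 is used, it is usually safer for each worker process to open its own file handle (for example, lazily on the first __getitem__ call), since an h5py handle opened in the parent process is not reliably usable from forked workers.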