Input and output numpy arrays to h5py

I have a Python code whose output is a matrix (its size is given in an image: https://i.stack.imgur.com/keJsR.png) whose entries are all of type float. If I save it with the extension .dat, the file size is on the order of 500 MB. I read that using h5py reduces the file size considerably. So, let's say I have a 2D numpy array named A. How do I save it to an h5py file? Also, how do I read the same file and load it as a numpy array in a different script, since I need to manipulate the array?


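h5py provides a model of datasets and groups. The former are basically arrays and the latter can be thought of as directories. Each is named. You should look at the documentation for the API and examples: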

http://docs.h5py.org/en/latest/quick.html
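To make the dataset/group model concrete, here is a minimal sketch. The file name and the names "experiment" and "values" are made up for illustration, and the file is written to a temporary directory so nothing is left behind.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file location inside a throwaway temporary directory.
path = os.path.join(tempfile.mkdtemp(), "groups_demo.h5")

with h5py.File(path, "w") as f:
    # A group behaves like a directory; "experiment" is an illustrative name.
    grp = f.create_group("experiment")
    # A dataset behaves like a named array stored inside the group.
    grp.create_dataset("values", data=np.arange(6.0).reshape(2, 3))

with h5py.File(path, "r") as f:
    top_level = list(f.keys())          # names at the root of the file
    values = f["experiment/values"][:]  # path-style lookup, read into numpy

print(top_level, values.shape)
```

Indexing a dataset with `[:]` reads its contents into an ordinary numpy array, so `values` remains usable after the file is closed.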

A simple example, where you create all of the data upfront and just want to save it to an hdf5 file, would look something like:

In [1]: import numpy as np
In [2]: import h5py
In [3]: a = np.random.random(size=(100,20))
In [4]: h5f = h5py.File('data.h5', 'w')
In [5]: h5f.create_dataset('dataset_1', data=a)
Out[5]: <HDF5 dataset "dataset_1": shape (100, 20), type "<f8">


In [6]: h5f.close()

You can then load that data back in using:

In [10]: h5f = h5py.File('data.h5','r')
In [11]: b = h5f['dataset_1'][:]
In [12]: h5f.close()


In [13]: np.allclose(a,b)
Out[13]: True
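Since the question is about file size, the same round trip can also be sketched with compression turned on. This is only an illustration: the dataset name "A", the file name, and the gzip level of 4 are arbitrary choices, and how much space compression saves depends heavily on the data (random floats like these compress poorly).

```python
import os
import tempfile

import h5py
import numpy as np

A = np.random.random(size=(100, 20))

# Hypothetical file location inside a throwaway temporary directory.
path = os.path.join(tempfile.mkdtemp(), "data_compressed.h5")

# The with-block closes the file automatically, even if an error occurs.
with h5py.File(path, "w") as f:
    # gzip is lossless; compression_opts is the gzip level (0-9).
    f.create_dataset("A", data=A, compression="gzip", compression_opts=4)

# In a different script, only the path and the dataset name are needed.
with h5py.File(path, "r") as f:
    B = f["A"][:]  # read the whole dataset into a numpy array

print(np.array_equal(A, B))
```

Using `with` instead of explicit `close()` calls is a stylistic choice here; both approaches write the same file.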

Definitely check out the docs:

http://docs.h5py.org

Writing to an hdf5 file requires either h5py or pytables (each has a different Python API that sits on top of the hdf5 file specification). You should also take a look at the other simple binary formats that numpy provides natively, through functions such as np.save and np.savez:

http://docs.scipy.org/doc/numpy/reference/routines.io.html
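For comparison, a minimal sketch of the numpy-only route. The file names are made up, and np.savez_compressed is used here as the compressed variant of np.savez; as with any compression, the size benefit depends on the data.

```python
import os
import tempfile

import numpy as np

A = np.random.random(size=(100, 20))
tmpdir = tempfile.mkdtemp()  # throwaway directory for the example files

# Single array in numpy's binary .npy format.
npy_path = os.path.join(tmpdir, "A.npy")
np.save(npy_path, A)
B = np.load(npy_path)

# One or more named arrays in a compressed .npz archive.
npz_path = os.path.join(tmpdir, "arrays.npz")
np.savez_compressed(npz_path, A=A)
with np.load(npz_path) as archive:
    C = archive["A"]  # arrays are looked up by the keyword used when saving

print(np.array_equal(A, B), np.array_equal(A, C))
```

Both formats store the raw binary values, so the arrays come back exactly as they were saved.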