Why does Matlab transpose hdf5 data?
There is an apparent bug in Matlab's HDF5 read/write utilities that breaks interoperability with other code. Simple array datasets are read and written as the transpose of their actual shape. I imagine this is because Matlab uses column-major (Fortran-style) order, whereas the HDF5 standard uses row-major (C-style) order.
Minimal example that illustrates the problem:
h5create('test.h5', '/dataset', [2,3]);
h5write('test.h5', '/dataset', reshape(1:6,[2,3]))
Running the HDF5 utility h5ls on the output reveals the problem:
$ h5ls test.h5
dataset Dataset {3, 2}
This is not evident if only using the HDF5 tools from within Matlab, since reading the dataset in also transposes it back.
>> h5read('test.h5', '/dataset')
ans =
1 3 5
2 4 6
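The effect can be reproduced without any HDF5 library. The following is a minimal pure-Python sketch, assuming (as the h5ls output suggests) that Matlab writes its column-major buffer unchanged and only reverses the dimension list. The helper names `column_major_flatten` and `row_major_view` are made up for this illustration.

```python
# Pure-Python sketch (no HDF5 involved): the same six values in memory,
# laid out column-major and then read back with row-major indexing.

def column_major_flatten(matrix):
    """Flatten a list-of-rows matrix in column-major order (first index fastest)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(n_cols) for r in range(n_rows)]

def row_major_view(buffer, dims):
    """Interpret a flat buffer as a row-major (C-style) 2-D array of shape dims."""
    n_rows, n_cols = dims
    return [buffer[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]

# reshape(1:6, [2,3]) as Matlab displays it
matlab_matrix = [[1, 3, 5],
                 [2, 4, 6]]

buffer = column_major_flatten(matlab_matrix)   # [1, 2, 3, 4, 5, 6]
file_dims = (3, 2)                             # dims [2,3] reversed, as h5ls reports
c_view = row_major_view(buffer, file_dims)

print(c_view)   # [[1, 2], [3, 4], [5, 6]]
```

A row-major reader sees a 3-by-2 array that is the transpose of the 2-by-3 matrix, although the bytes on disk are the same.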
Matlab should either fix this in future versions or mention the convention in the documentation, since people mostly choose HDF5 for interoperability with other systems, and this can be a tricky bug to find.
In versions:
- h5ls: Version 1.8.14
- Matlab 8.6.0.267246 (R2015b) GLNXA64
1 Comment
Daniel Döhring
on 24 May 2019
Edited: Daniel Döhring
on 24 May 2019
Actually, this bug still seems to be around. In my case, a (pseudo) multiarray of dimensions […] is internally permuted by Matlab to […]. As a consequence, it is impossible to write back a multiarray in dimensions […], since Matlab does not represent matrices in […] manner.
Accepted Answer
More Answers (3)
Kameron Harris
on 20 Oct 2016
Edited: Kameron Harris
on 20 Oct 2016
1 vote
Kameron Harris
on 20 Oct 2016
Edited: Kameron Harris
on 20 Oct 2016
0 votes
1 Comment
James Tursa
on 20 Oct 2016
The HDF Group intent seems to be that applications should be able to write to the file in a native storage order. This seems reasonable to me, especially from a speed standpoint. Why cripple column-ordered languages (Fortran, MATLAB) with a hard requirement to permute the data each time you read/write?
Kameron Harris
on 20 Oct 2016
Edited: Kameron Harris
on 20 Oct 2016
0 votes
2 Comments
James Tursa
on 20 Oct 2016
Well, so this pretty much answers the question. The HDF Group intended the various applications (Fortran, MATLAB, C, C++, Python, etc) to be able to write to the file in a native storage order and simply list the dimensions of the data in the file in a specified order (slowest changing first ... fastest changing last). It is then incumbent on the user to know what storage order his/her applications use if they are to share data through this file format ... and permute the data accordingly if necessary.
So given this language in the HDF doc, I would say MATLAB is doing everything correctly (but maybe could help the user out with some documentation about interoperability with other languages/applications).
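The convention described here (dimensions listed slowest-changing first, data left in native order) can be checked for arrays of any rank. Below is a hedged pure-Python sketch: the function names and the example shape `(2, 3, 4)` are assumptions chosen for illustration, and indices are 0-based rather than Matlab's 1-based.

```python
from itertools import product

def column_major_offset(index, dims):
    """Linear offset of a 0-based index in a column-major array (first index fastest)."""
    offset, stride = 0, 1
    for i, n in zip(index, dims):
        offset += i * stride
        stride *= n
    return offset

def row_major_offset(index, dims):
    """Linear offset of a 0-based index in a row-major array (last index fastest)."""
    offset = 0
    for i, n in zip(index, dims):
        offset = offset * n + i
    return offset

matlab_dims = (2, 3, 4)          # hypothetical shape on the column-major side
file_dims = matlab_dims[::-1]    # (4, 3, 2): the same dims, slowest-changing first

# Every element lands at the same offset when both the dimension list
# and the index tuple are reversed.
same = all(
    column_major_offset(idx, matlab_dims) == row_major_offset(idx[::-1], file_dims)
    for idx in product(*(range(n) for n in matlab_dims))
)
print(same)   # True
```

In other words, a row-major reader that wants element `(i, j, k)` of the column-major array should ask for `(k, j, i)`, or permute the axes once after reading.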
Kameron Harris
on 20 Oct 2016