I have a binary data file that I am trying to read. The first record is 8 integers. They have precision "int", which means each number is 4 bytes long, so I use A = fread(fid,[1 8],'int'). It shows numbers A(2) through A(8) correctly, but A(1) is some strange number, and the last number is missing. If I do A = fread(fid,[1 9],'int') and take A(2) through A(9), I am able to read all my numbers correctly. So my question is: why is the very first number unwanted data that has to be skipped in order to read correctly?

Accepted Answer

Greg on 11 Sep 2018
Edited: Greg on 11 Sep 2018

0 votes

Nobody can possibly answer your question "why... unwanted stuff...?" Only the author of the binary file format can tell you that. More specifically, those 4 bytes actually exist in the file; MATLAB isn't misbehaving in any way.
I can tell you that you've already solved your problem: simply skip 4 bytes and proceed. I bet they are useful somehow, but not likely as an int32. Try other 4-byte formats like float32 or 4*char. Maybe it's a float32 datestamp?
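To make the "try other 4-byte formats" suggestion concrete, here is a minimal sketch in Python (in MATLAB the same comparison can be done by calling fread on those bytes with different precision strings). The four bytes in `raw` are placeholders chosen for illustration, not bytes taken from the actual file, and `interpret` is a hypothetical helper name.

```python
import struct

# Placeholder bytes for illustration only; substitute the first
# four bytes of the real file.
raw = bytes([0x00, 0x00, 0x00, 0x20])

def interpret(raw4):
    """Return several plausible readings of one 4-byte field."""
    return {
        "int32_be": struct.unpack(">i", raw4)[0],    # big-endian signed int
        "int32_le": struct.unpack("<i", raw4)[0],    # little-endian signed int
        "float32_be": struct.unpack(">f", raw4)[0],  # big-endian IEEE single
        "float32_le": struct.unpack("<f", raw4)[0],  # little-endian IEEE single
        "chars": raw4.decode("latin-1"),             # four single-byte characters
    }

views = interpret(raw)
for name, value in views.items():
    print(f"{name:>10}: {value!r}")
```

Whichever reading gives a small, sensible value (a count, a length, a short tag) is the most likely candidate; values that are astronomically large or nearly zero usually indicate the wrong type or byte order.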

9 Comments

Guillaume on 11 Sep 2018
Edited: Guillaume on 11 Sep 2018
To allow for future upgrades, a decent binary format would start with the size of the structure that follows. By reading that structure size, a reader automatically knows which version it's reading.
If the details of the binary format are publicly available, a link would be useful.
Pappu Murthy on 12 Sep 2018
The problem is that I have no control over the binary file and I really don't know how it was written. However, we have a Fortran program that opens the file with the same attributes as in the MATLAB version and reads A as elements 1 through 8, and all the numbers are correct. In the MATLAB version I have to read 1 through 9 and take numbers 2 to 9 as my answer. I do not understand why this difference exists. Also, the file is huge and I need to read a lot more data involving integers, floating-point numbers, etc. So I need to know why this difference exists so that I can program it properly in MATLAB.
Greg on 12 Sep 2018
It's not about having control over the format.
File I/O in FORTRAN is notoriously hideous. Are you absolutely certain you are deciphering the code correctly? There could be a call to fseek long before the actual fread which would be easy to miss.
Guillaume on 12 Sep 2018
It's normal not to have control over the format. However, what's needed is some sort of documentation. The source code of the writer could be used as well or, at a pinch, the source code of the decoder.
Otherwise, you're left with reverse engineering the format, which involves finding the differences between many files with known content. It's a long, painstaking process.
As I said, well designed formats usually start with a size field. So that may be what these first 4 bytes are. That should be obvious if you compare multiple files, the value should be the same.
You also have to be careful about endianness, although since you say that you're reading some int32 correctly you've probably got that right. For formats that support being written with different endianness, it's typical to use the initial part to mark the encoding.
I would recommend you attach your Fortran decoder. Isn't it commented with details of the format?
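As a rough illustration of the endianness point, the sketch below (in Python; `guess_byte_order` and the `max_plausible` threshold are hypothetical choices, not part of any library) decodes a 4-byte field both ways and reports which byte order gives a plausible small non-negative value. It is only a heuristic for a field assumed to hold a size or count.

```python
import struct

def guess_byte_order(raw4, max_plausible=1_000_000):
    """Guess the byte order of a 4-byte field assumed to hold a small count.

    Returns "big", "little", or "unknown" when both or neither
    interpretation looks plausible.
    """
    be = struct.unpack(">i", raw4)[0]
    le = struct.unpack("<i", raw4)[0]
    be_ok = 0 <= be <= max_plausible
    le_ok = 0 <= le <= max_plausible
    if be_ok and not le_ok:
        return "big"
    if le_ok and not be_ok:
        return "little"
    return "unknown"

print(guess_byte_order(b"\x00\x00\x00\x20"))  # value 32 when read big-endian
print(guess_byte_order(b"\x20\x00\x00\x00"))  # value 32 when read little-endian
```

A check like this is no substitute for documentation, but applied to the first field of several files it quickly shows whether the chosen byte order is consistent.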
Pappu Murthy on 12 Sep 2018
I will include the lines of Fortran code here in a couple of hours when I get a chance. In the Fortran code the file is opened as big-endian, so in MATLAB I also opened it with that as one of the fopen parameters. If I didn't use that, I was getting pure garbage. Now it seems to be reading properly, but for every separate read the first 4 bytes need to be skipped, at least as far as I can tell so far. The Fortran program didn't have any problem reading and, as I said, it does not skip any bytes at all. I will provide the lines here soon.
Pappu Murthy on 12 Sep 2018
Edited: Guillaume on 12 Sep 2018
Here is the Fortran code that reads the binary file:
outfilename = 'prhist.1.hs'
open(12, file=outfilename, form='unformatted', convert='big_endian')
write(6,*) 'opened for processing: ', outfilename
write(6,*) 'filename= ', outfilename
read(12) nbr, ibk, iek, jbk, jek, isuct(nbr), nprt, ntc
Note: all eight variables are declared as integers, e.g. integer nbr, ibk ....
The MATLAB code I used is as follows:
format long;
clear
clc
% program for Forced Response
% Reading Pressure history file
FileName = 'prhist.1.hs';
fid = fopen(FileName, 'r','b');
frewind(fid);
A = fread(fid,[1 9],'int')
Here is the output:
A =
32 1 1 213 1 95 2 162 100
I am not sure what the first number, 32, is, but numbers 2 through 9 are correct. Further down I have another read with floating-point numbers. Here too I had to include a dummy statement like
X0 = fread(fid,[1 1],'real*8');
and discard X0; after that the rest is read properly. Here is the remaining code:
ibk = A(3); iek = A(4); jbk = A(5); jek = A(6);
X0 = fread(fid,[1 1],'real*8');
X = fread(fid,[iek,jek],'real*8');
Y = fread(fid,[iek,jek],'real*8');
Z = fread(fid,[iek,jek],'real*8');
Akx(2:iek,2:jek) = fread(fid,[iek-1,jek-1],'real*8');
Aky(2:iek,2:jek) = fread(fid,[iek-1,jek-1],'real*8');
Akz(2:iek,2:jek) = fread(fid,[iek-1,jek-1],'real*8');
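One reading consistent with the output above: 32 equals 8 x 4, the byte count of the eight integers, and the single discarded real*8 is 8 bytes, which would match a 4-byte trailing marker followed by the 4-byte leading marker of the next record. Many Fortran compilers (gfortran and Intel Fortran by default) frame each record of a form='unformatted' sequential file with such length markers, and a Fortran read(12) consumes them silently. Nothing in this thread confirms that prhist.1.hs is laid out this way, so the sketch below is an assumption. It is written in Python against synthetic bytes, and `read_record` is a hypothetical helper name.

```python
import io
import struct

def read_record(f, byte_order=">"):
    """Read one length-framed record and return its payload bytes.

    Assumed layout (not confirmed for the file in this thread):
    4-byte length, payload, the same 4-byte length again.
    Returns None at end of file.
    """
    head = f.read(4)
    if len(head) == 0:
        return None
    if len(head) < 4:
        raise ValueError("truncated leading marker")
    (n,) = struct.unpack(byte_order + "i", head)
    payload = f.read(n)
    tail = f.read(4)
    if len(payload) != n or len(tail) != 4:
        raise ValueError("truncated record")
    if struct.unpack(byte_order + "i", tail)[0] != n:
        raise ValueError("leading and trailing markers differ")
    return payload

# Synthetic demo: the eight integers from the post, framed by markers.
values = (1, 1, 213, 1, 95, 2, 162, 100)
payload = struct.pack(">8i", *values)
marker = struct.pack(">i", len(payload))  # 8 integers * 4 bytes
demo = io.BytesIO(marker + payload + marker)

ints = struct.unpack(">8i", read_record(demo))
print(ints)
```

If the assumption holds, comparing the leading and trailing markers of each record is a cheap consistency check: a mismatch means the layout guess is wrong at that point in the file.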
Image Analyst on 12 Sep 2018
You say "I really don't know how it was written." Well, maybe not what program wrote it or the source code of that program, but you must at least know the format because you say "I need to read a lot more stuff involving integers, floating point numbers etc." and it will be impossible to get the right bytes into the right variables unless you know the format. Unless you're purely guessing and that usually only works for the very simplest of formats. So since you know the format and what bytes mean what, you probably should know what the first few bytes mean.
Pappu Murthy on 12 Sep 2018
I agree with you completely. At this point, since I do not have any more information, I am trying to guess and do some reverse engineering. I initially thought that if I translated the Fortran read statements to MATLAB read statements I would be able to read the file, but now I am finding it is a lot more involved than my simplistic assumption.
Greg on 14 Sep 2018
FYI, frewind is redundant immediately after fopen.
Further, fseek is better than reading and trashing unwanted bytes. For a single value it won't make much difference in speed, but the code is cleaner.
*Accidentally posted as answer. Moved to comment *
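The difference between reading and discarding versus seeking can be sketched as follows. This is Python on synthetic in-memory bytes that mimic the output posted earlier; in MATLAB the second option corresponds to fseek(fid, 4, 'cof') before the fread.

```python
import io
import struct

# Synthetic stand-in for the start of the file: one unwanted 4-byte
# value followed by the eight integers of interest, all big-endian.
data = struct.pack(">9i", 32, 1, 1, 213, 1, 95, 2, 162, 100)

# Option 1: read nine values and throw the first away.
f = io.BytesIO(data)
with_discard = struct.unpack(">9i", f.read(36))[1:]

# Option 2: move the file position past the 4 unwanted bytes,
# then read exactly the eight values that are wanted.
f = io.BytesIO(data)
f.seek(4, io.SEEK_CUR)
with_seek = struct.unpack(">8i", f.read(32))

print(with_discard == with_seek)
```

Both options leave the file position in the same place; the seek version states the intent (skip 4 bytes) directly and avoids a throwaway variable.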

Asked: 11 Sep 2018

Commented: 14 Sep 2018
