Import data from a bad format
Hello, I have a set of data that was saved in a bad format (basically, it was saved from Python as lists of NumPy arrays).
An example data file looks like this. Each file is supposed to be imported into MATLAB as a matrix, where the contents of each [...] go into one row, for as many rows as the number of [...] the file contains. I am having trouble importing these files, and it is too expensive to regenerate the data. Could anyone help me, please?
*Note: I attached a zip file of an example .dat data file.
*Note: I also converted an example of the source data from .dat to .txt to upload here.
[0.01643466 0.014102 0.00989389 0.00854453 0.00811339 0.00641578
0.00615053 0.00540413 0.00452342 0.00427268 0.0041174 0.00352849
0.00273467 0.00265508 0.00239323 0.00225965 0.00199268 0.00180934
0.00174052 0.00154865 0.00143824 0.00140056 0.00130063 0.00111959
0.00085831]
[0.01242517 0.00959429 0.00663475 0.00480379 0.0041159 0.00370299
0.00346792 0.00315736 0.00289833 0.00248943 0.00233303 0.00205719
0.00184254 0.0016187 0.00137933 0.00123405 0.00114122 0.00100773
0.00094038 0.00088898 0.00078643 0.00077108 0.0006717 0.00062967
0.00058109]
[ 2.71704623e-03 2.10584618e-03 8.72114136e-04 7.73112590e-04
5.71653378e-04 5.33790412e-04 3.39630885e-04 2.40184459e-04
1.30327127e-04 8.07570547e-05 4.93676189e-05 3.99133858e-05
-6.96552090e-05 -8.84689362e-05 -1.73745252e-04 -1.92295775e-04
-2.88978292e-04 -3.33804546e-04 -4.48600012e-04 -5.03108816e-04
-6.09854318e-04 -6.76489121e-04 -7.41927073e-04 -8.22272102e-04
-1.01214861e-03]
[ 2.48950496e-03 1.32848678e-03 7.77518243e-04 4.46048853e-04
1.82546718e-04 5.68524734e-05 -2.03947611e-05 -1.22789817e-04
-1.42331199e-04 -2.27905262e-04 -2.54901789e-04 -3.21797964e-04
-4.10908018e-04 -4.31102320e-04 -5.76116105e-04 -6.20647464e-04
-6.61513106e-04 -8.03798804e-04 -8.85422390e-04 -9.60254905e-04
-1.05730808e-03 -1.21679564e-03 -1.29680491e-03 -1.65221752e-03
-1.89191346e-03]
[0.01148437 0.00831067 0.00569898 0.00435051 0.00369133 0.00313336
0.00282179 0.00252201 0.00221526 0.0020089 0.00178797 0.00135555
0.00117 0.00106878 0.00099295 0.00081433 0.00073677 0.00068778
0.00068557 0.00063079 0.00057153 0.00053233 0.0004835 0.00046683
0.00042318]
[0.01074849 0.00739927 0.00473212 0.00377076 0.00318848 0.00255984
0.00228395 0.00197474 0.00166971 0.00144228 0.00128842 0.00088904
0.00081689 0.00072367 0.00064738 0.00060256 0.00053549 0.00049838
0.00046984 0.00042499 0.0003706 0.00034885 0.00028414 0.0002643
0.00023334]
........
Accepted Answer
Stephen23
on 11 May 2023
Edited: Stephen23
on 11 May 2023
TEXTSCAN is very efficient, and imports numeric data as numeric (i.e. no fiddling around with text):
fmt = repmat('%f',1,25);
fid = fopen('example.txt');
out = textscan(fid,fmt,'EndOfLine',']','Whitespace',' \b\t\r\n[', 'CollectOutput',true);
fclose(fid);
mat = out{1}
Passing an empty format makes TEXTSCAN detect the matrix size automagically. This also works, but is not documented:
fid = fopen('example.txt');
out = textscan(fid,'','EndOfLine',']','Whitespace',' \b\t\r\n[', 'CollectOutput',true);
fclose(fid);
mat = out{1}
Avoid unnecessary complexity in your code.
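A minimal, self-contained sketch of why the 'EndOfLine' and 'Whitespace' options do the job: the closing bracket ends each record, while the opening bracket and the physical line breaks are skipped as whitespace. The sample text and the block width of three values are made up for illustration (the file in the question has 25 values per block), and the text is passed to TEXTSCAN as a character vector instead of a file identifier.

```matlab
% Two bracketed blocks of three values each, wrapped across physical lines
% the way NumPy prints arrays (made-up numbers, not from the original file).
txt = sprintf('[0.5 0.25\n 0.125]\n[ 1.5e-03 -2.5e-04\n -3.0e-05]');

nCols = 3;                         % values per bracketed block (assumed fixed)
fmt   = repmat('%f',1,nCols);      % one %f per value in a block
out   = textscan(txt,fmt,'EndOfLine',']','Whitespace',' \b\t\r\n[','CollectOutput',true);
mat   = out{1};                    % one matrix row per bracketed block
disp(size(mat))
```

Because the line breaks inside a block are plain whitespace here, it does not matter how many values NumPy happened to print per physical line.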
2 Comments
dpb
on 11 May 2023
Good thinking to use the closing bracket as the newline, @Stephen23; that didn't occur to me in my initial response to Walter's counted attempt, which fails because the count changes; hence the text processing...
More Answers (3)
Walter Roberson
on 10 May 2023
If the data are stored in a file and there are always exactly 25 entries per logical row, then you could use textscan:
PerRow = 25;
fmt = "[" + repmat('%f', 1, PerRow) + "]";
FID = fopen(FILENAME, 'r');
output = cell2mat( textscan(FID, fmt) );
fclose(FID);
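The "always exactly 25 entries per logical row" premise can be checked before committing to a fixed format. The following is only a sketch, and it swaps in a different technique (regexp plus sscanf instead of textscan). The sample text is made up; for a real file, the text could be read with fileread.

```matlab
% Made-up sample with two blocks of three values each.
% For a real file: txt = fileread('example.txt');
txt = sprintf('[0.5 0.25\n 0.125]\n[ 1.5e-03 -2.5e-04\n -3.0e-05]\n');

% Capture the text between each pair of brackets (line breaks included).
blocks = regexp(txt,'\[([^\]]*)\]','tokens');

% Count how many numbers sscanf can read from each block.
counts = cellfun(@(b) numel(sscanf(b{1},'%f')), blocks);
disp(counts)                       % counts is [3 3] for this sample
```

If every entry of counts is the same, a fixed-width format is safe; otherwise the rows are ragged and cannot go straight into a matrix.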
3 Comments
dpb
on 10 May 2023
As requested, attach a section of the text file in a usable format, not as a zipped file..."help us help you!"
dpb
on 11 May 2023
Edited: dpb
on 11 May 2023
The '%g' format has struck again -- that's what killed @Walter Roberson's approach. While not the most efficient, a simple way in MATLAB would be
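f=readlines('example.txt'); % import as string array, one element per line
f=strrep(f,"[",""); % remove the opening brackets
f=strrep(f,"]",""); % remove the closing brackets
f=join(f); % turn into one long string
f=strtrim(split(f)); % split at whitespace into one token per value
f=f(strlength(f)>0); % drop the empty tokens left by repeated blanks
data=str2double(f); % convert to a numeric column vector
whos data
data=reshape(data,25,[]).'; % values arrive block by block: 25 per column, then transpose
data(1:3,:)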
Alternatively,
f=readlines('example.txt'); % import as string array
f=split(join(f),']'); % turn into array by section
f=f(strlength(f)>0);
f=strtrim(f);
f=extractAfter(f,"[");
f=f(strlength(f)>0);
data=cell2mat(arrayfun(@(l)str2double(split(strtrim(l))).',f,'uni',0));
[data(1:3,:);data(end-3:end,:)]
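For the reshape-based approach, the converted values arrive block after block in one long column, so the reshape has to fill columns of one block each and then transpose. A small sketch with sanity checks follows. The variable names perRow and mat are made up for illustration, and the block width is 3 here for brevity (it is 25 in the question).

```matlab
perRow = 3;                        % values per block (25 in the thread)
data   = [1 2 3 4 5 6].';          % stand-in for the str2double output: blocks [1 2 3] and [4 5 6]

% Fail early if a token did not convert or if a block is short or long.
assert(~any(isnan(data)),'Some tokens did not convert to numbers.');
assert(mod(numel(data),perRow)==0,'Value count is not a multiple of %d.',perRow);

mat = reshape(data,perRow,[]).';   % each column holds one block; transpose to get rows
disp(mat)                          % rows are [1 2 3] and [4 5 6]
```

Note that reshape(data,[],perRow).' gives the same result only when the number of blocks happens to equal perRow.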
2 Comments
dpb
on 11 May 2023
f=textread('example.txt','%s','delimiter','\n','whitespace','');
f=string(strtrim(f));
then. textread has been deprecated, but it's often still of real use/value where textscan is more trouble to deal with...