detect correct startRow in fopen before textscan
laurent jalabert
on 26 Nov 2018
Answered: Etsuo Maeda
on 30 Nov 2018
Hello,
I have a text file containing 17 columns of data, with a variable text header above the data. The header consists of several rows of strings, and the number of rows is not fixed, which is why I am posting this question.
The data I want to import start at a row defined by startRow, but the value of startRow depends on the number of header rows. That number is unknown after calling fopen, yet it must be known when calling textscan. So in between, I have to detect startRow automatically, whatever header sits above the data.
This is an example of the text file.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
program_20181015.vi
line2
line3
line4
t T V off F I1 V1 I2 V2 Li1 Li2 X1 Y2 X3 Y4 V5 c
6.357780E+2 2.999041E+2 3.500000E-3 0.000000E+0 1.100000E+0 5.000000E-8 1.999990E+101 -5.000000E-12 1.000000E-4 7.140000E-6 -9.620000E-6 2.395640E-1 -4.995750E-2 2.400520E-1 -5.032370E-2 -2.727684E-7 0.000000E+0 0.000000E+0 0.000000E+0 0.000000E+0
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
In this particular example, line 6 corresponds to the startRow that I want to detect.
The string 't T V off F I1 V1 I2 V2 Li1 Li2 X1 Y2 X3 Y4 V5 c' is always the same, whatever the content above it. So it would be nice to detect this string, for example with the find function, because the data start right after this line.
Of course I could simply set startRow = 6 and be done. But depending on the user, the number of header rows above the data differs, so I need to detect startRow automatically.
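The header-search idea described above can be sketched as follows. This is only a minimal sketch under stated assumptions: the sample file is a shortened, hypothetical three-column version of the real one, written to a temporary location so the code runs on its own, and the names target and lineNo are made up for illustration. Lines are read one by one with fgetl and compared, after collapsing tabs and repeated spaces, against the known column-name line.

```matlab
% Write a small hypothetical sample file so the sketch is self-contained.
filename = [tempname '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, 'program_20181015.vi\nline2\nline3\nline4\n');
fprintf(fid, 't\tT\tV\n');                       % shortened column-name line
fprintf(fid, '6.357780E+2\t2.999041E+2\t3.500000E-3\n');
fprintf(fid, '6.400000E+2\t3.000000E+2\t3.600000E-3\n');
fclose(fid);

target = 't T V';   % known column-name line, whitespace-normalized (assumption)
startRow = 0;       % stays 0 if the line is never found
lineNo = 0;
fid = fopen(filename, 'r');
tline = fgetl(fid);
while ischar(tline)
    lineNo = lineNo + 1;
    % Collapse tabs and repeated spaces before comparing.
    if strcmp(strjoin(strsplit(strtrim(tline)), ' '), target)
        startRow = lineNo + 1;   % data begin on the line after the names
        break
    end
    tline = fgetl(fid);
end
fclose(fid);
if startRow == 0
    error('Column-name line not found.');
end

% Read the numeric block now that the start row is known.
fid = fopen(filename, 'r');
dataArray = textscan(fid, '%f%f%f', 'Delimiter', '\t', 'HeaderLines', startRow - 1);
fclose(fid);
M = [dataArray{:}];
delete(filename);
```

For the real file, target would be the full 17-name line and the format would need one %f per data column.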
On the forum I found the try / catch construct, which might suit my purpose. With startRow = 1 (when it should be 6), reading the file fails with an error.
When that happens, I would like to set startRow = startRow + 1 and try again, repeating until the read succeeds without error.
How can I do that?
startRow = 1;
delimiter = '\t';
formatSpec = '%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%[^\n\r]';
fileID = fopen(filename, 'r');
try
    dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'TextType', 'string', 'EmptyValue', NaN, 'HeaderLines', startRow-1, 'ReturnOnError', false, 'EndOfLine', '\r\n');
catch me
    % this is where I would like to set startRow = startRow + 1 and try again
end
fclose(fileID);
RAW = importdata(filename, '\t', startRow);
M = RAW.data(:, 1:17);
Accepted Answer
Etsuo Maeda
on 28 Nov 2018
Edited: Etsuo Maeda
on 28 Nov 2018
A while loop will help you.
k = 0;
while exist('D') ~= 1
    try
        % succeeds only once row offset k points at numeric data
        D = dlmread('yourfile.txt', '', k, 0);
    catch
        k = k + 1;
    end
end
HTH
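One thing to keep in mind with the loop above is that it never ends if the file contains no numeric block at all. A bounded variant might look like the sketch below. The sample file, the tab delimiter, and the limit maxHeaderLines are assumptions made for illustration and are not part of the original answer.

```matlab
% Write a small hypothetical sample file so the sketch is self-contained.
filename = [tempname '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, 'program_20181015.vi\nline2\nt\tT\tV\n');
fprintf(fid, '1\t2\t3\n4\t5\t6\n');
fclose(fid);

maxHeaderLines = 100;   % hypothetical upper bound on the header length
D = [];
for k = 0:maxHeaderLines
    try
        % dlmread raises an error while row offset k still points at text
        D = dlmread(filename, '\t', k, 0);
        if ~isempty(D)
            break       % numeric data found; k is the number of header lines
        end
    catch
        % not numeric yet, move on to the next row offset
    end
end
delete(filename);
if isempty(D)
    error('No numeric data found within the first %d lines.', maxHeaderLines + 1);
end
```

After the loop, k holds the number of header lines, so the data start on line k + 1.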
More Answers (6)
Etsuo Maeda
on 29 Nov 2018
Edited: Etsuo Maeda
on 29 Nov 2018
Hi laurent jalabert - san,
textscan behaves a little differently from dlmread.
With its default settings, textscan always returns an output, possibly empty, on every loop iteration, even when it fails to read numeric data.
dlmread creates its output only when it succeeds in reading numeric data; on text data it raises an error instead.
So "exist('dataArray') ~= 1" cannot work well with your original code.
You can confirm how textscan and dlmread differ on unexpected input with the following code.
fid = fopen('yourfile.txt');
fspec = repmat('%f', [1, 16]);
D = textscan(fid, fspec, 'HeaderLines', 0);
fclose(fid);
and
D = dlmread('yourfile.txt', '', 0, 0);
with 'yourfile.txt' containing:
program_20181015.vi
line2
line3
line4
t T V off F I1 V1 I2 V2 Li1 Li2 X1 Y2 X3 Y4 V5 c
6.357780E+2 2.999041E+2 3.500000E-3 0.000000E+0 1.100000E+0 5.000000E-8 1.999990E+101 -5.000000E-12 1.000000E-4 7.140000E-6 -9.620000E-6 2.395640E-1 -4.995750E-2 2.400520E-1 -5.032370E-2 -2.727684E-7 0.000000E+0 0.000000E+0 0.000000E+0 0.000000E+0
I believe the dlmread code I suggested in my previous post works for your data without any modification.
If you need to use textscan, I can suggest another way using the isempty function.
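fid = fopen('yourfile.txt');
fspec = repmat('%f', [1, 16]);
k = -1;
D = {[]};
while isempty(D{1, 1})
    k = k + 1;
    frewind(fid); % restart from the top so that HeaderLines is counted from line 1
    D = textscan(fid, fspec, 'HeaderLines', k);
end
fclose(fid);
% k is now the number of header lines skipped before numeric data were read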
HTH
Etsuo Maeda
on 30 Nov 2018
Edited: Etsuo Maeda
on 30 Nov 2018
Hello Laurent - san,
I analyzed your code and finally found a bug in the textscan function when it is used inside a while loop!
The reproduction steps are as follows. Create 'test.txt' containing:
aaaaa
bbbbb
ccccc
dddd
eeee
1 2 3
4 5 6
and
clear; close all; fclose all;
fid = fopen('test.txt', 'r');
fspec = '%f%f%f';
k = 0;
while exist('D') ~= 1
    try
        D = textscan(fid, fspec, 'HeaderLines', k, 'ReturnOnError', false); % k = 3 NO error EMPTY D
    catch
        k = k + 1;
        disp(k)
    end
end
D = textscan(fid, fspec, 'HeaderLines', k, 'ReturnOnError', false); % k = 3 NO error EMPTY D
fclose(fid);
fid = fopen('test.txt', 'r');
D = textscan(fid, fspec, 'HeaderLines', k, 'ReturnOnError', false); % k = 3 error!!!
fclose(fid);
"k" should be 5 but the while loop stops at 3.
"D" exists but it is empty.
Calling textscan again before fclose also runs without error, but D is still empty.
After fclose and a second fopen, textscan raises an error with k = 3 and D is not created.
With your file, textscan returns strange numbers and stops at k = 4.
This is unexpected behavior of textscan.
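A possible explanation, not confirmed in this thread, is that textscan resumes reading from the current file position when it is called again with the same file identifier, so 'HeaderLines', k would be counted from wherever the previous attempt stopped rather than from line 1. That would also fit the observation that the same k raises an error after a fresh fopen. Under that assumption, rewinding the file before each attempt might be enough. The sketch below uses the same file content as the reproduction above, written to a temporary file; maxHeaderLines is a hypothetical safety limit.

```matlab
% Recreate the reproduction file in a temporary location.
filename = [tempname '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, 'aaaaa\nbbbbb\nccccc\ndddd\neeee\n1 2 3\n4 5 6\n');
fclose(fid);

fspec = '%f%f%f';
maxHeaderLines = 100;   % hypothetical upper bound on the header length
D = {};
fid = fopen(filename, 'r');
for k = 0:maxHeaderLines
    frewind(fid);       % count the header lines from the top on every attempt
    try
        D = textscan(fid, fspec, 'HeaderLines', k, 'ReturnOnError', false);
    catch
        D = {};         % this offset still points at text
    end
    if ~isempty(D) && ~isempty(D{1})
        break           % numeric data were read with k header lines skipped
    end
end
fclose(fid);
delete(filename);
M = [D{:}];             % numeric matrix, one column per %f
```

If the assumption holds, the loop should stop with k equal to the number of text lines above the data.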
I will report your case to the development team in the US.
As a workaround, could you please use my first code to find the row number?
k = 0;
while exist('D') ~= 1
    try
        D = dlmread('yourfile.txt', '', k, 0);
    catch
        k = k + 1;
    end
end
The numerical data start from the 6th line.
"D" will contain numerical data.
"k" will be 5.
Thank you very much for your question and patience.
HTH
Etsuo Maeda
on 30 Nov 2018
Dear Laurent - san,
"dlmread" is a function for reading numbers, not characters.
So it is impossible to read your header text using "dlmread".
As you mentioned, calling "textscan" or "importdata" again with the determined k is one workaround.
I think "readtable" is a good tool for you if you know the number of variables.
(The first question was a try-catch problem, so I used try-catch statements in my previous answers.)
clear; close all;
% R2016b and later
filename = 'yourfile.txt';
numOfVariables = 21;
opts = detectImportOptions(filename, 'Delimiter', '\t', 'NumVariables', numOfVariables);
T = readtable(filename, opts)
T.Properties.VariableNames
If you do not know the number of variables, the following code may work, but I cannot make any promises.
clear; close all;
% R2016b and later
filename = 'yourfile.txt';
opts = detectImportOptions(filename, 'Delimiter', '\t'); % remove NumVariables
T = readtable(filename, opts)
T.Properties.VariableNames
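To connect this back to the original startRow question: the import options object also records which line the detection took to be the first data line. The sketch below assumes a recent release in which that property is called DataLines, and it uses a shortened, hypothetical three-column sample file. The detection is heuristic, so the reported line should be checked against the actual file.

```matlab
% Write a small hypothetical sample file so the sketch is self-contained.
filename = [tempname '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, 'program_20181015.vi\nline2\nline3\nline4\n');
fprintf(fid, 't\tT\tV\n');
fprintf(fid, '6.357780E+2\t2.999041E+2\t3.500000E-3\n');
fprintf(fid, '6.400000E+2\t3.000000E+2\t3.600000E-3\n');
fclose(fid);

opts = detectImportOptions(filename, 'Delimiter', '\t');
startRow = opts.DataLines(1);   % first line detected as data (heuristic)
T = readtable(filename, opts);
disp(startRow)
disp(T.Properties.VariableNames)
delete(filename);
```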
HTH