How To Load Multiple Text Files (specific context)
Hello MATLAB community, I would like to load many text files (all with the same number of rows and columns) stored in the same folder and collect the 2nd column of each file into one matrix.
Here's an example: for 30 text files, the resulting matrix would have 30 columns and as many rows as the files contain (specifically, they all have 2048 rows).
But here's the catch: there's a multi-line header (about 8 lines) before the data, and the data is separated by semicolons (';').
One of the text files is attached as an example.
Also, the names of the text files do NOT follow any pattern; they are essentially random. I've already asked a very similar question here, but I wasn't considering the header. One helpful guy wrote the script below, and I'd like to tweak it a little to pass the right parameters to textscan().
% Set input folder
input_folder = 'C:\Users\Cotet\Downloads';
% Read all *.txt files from input folder
% NOTE: This creates a MATLAB struct with a bunch of info about each text file
% in the folder you specified.
files = dir(fullfile(input_folder, '*.txt'));
% Get full path names for each text file
file_paths = fullfile({files.folder}, {files.name});
% Read data from files, keep second column
for i = 1 : numel(file_paths)
% Read data from ith file.
% NOTE: If your file has a text header, missing data, or
% uses non white-space delimiters, you should check out the
% documentation for textread to determine which options to use.
data = textscan(file_paths{i}, '');
% Save second data column to matrix
% NOTE: Your data files all need to have the same number of rows for this to work
A(:, i) = data(:, 2);
end
The part I'm concerned with is this note:
% NOTE: If your file has a text header, missing data, or
% uses non white-space delimiters, you should check out the
% documentation for textread to determine which options to use.
I've tried many things, but was ultimately unsuccessful.
Thank you so much in advance.
8 Comments
Star Strider
on 7 Jun 2019
I’m not certain I understand what you’re doing. I also don’t understand the textread reference. You’re not calling textread in the code you posted. If you are using textread, see the param table in Description (link) and use the relevant name-value pair arguments to skip the header lines, define the delimiter, and anything else your files require.
Note that the textscan function with this line:
data = textscan(file_paths{i}, '');
will attempt to parse the file name itself as text, rather than the contents of the file, similar to the documentation section Read Floating-Point Numbers (link).
If you want to read the data in the file, you first need to open it with the fopen function and assign it a file ID number. (Remember to close it with fclose after you read it.)
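The open, read, close sequence described above might look like the following minimal sketch. It assumes the layout given in the question (8 header lines, then semicolon-separated numeric columns). The sample file contents, the name `sample_file`, and the format `'%f %f %*[^\n]'` (keep two numeric fields, skip the rest of each line) are illustrative choices, not part of the original script.

```matlab
% Write a tiny sample file: 8 header lines (two of them blank),
% then semicolon-separated numeric rows. Contents are illustrative.
sample_lines = { ...
    '', ...
    'Integration time [ms]: 0.030', ...
    'Averaging Nr. [scans]: 1', ...
    'Smoothing Nr. [pixels]: 0', ...
    'Data measured with spectrometer [name]: 1903395U1', ...
    'Wave ;Sample ;Dark ;Reference;Scope', ...
    '[nm] ;[counts] ;[counts] ;[counts]', ...
    '', ...
    '189.95; 383.425; 0.000; 0.000', ...
    '190.09; 416.425; 0.000; 0.000'};
sample_file = [tempname '.txt'];          % hypothetical temporary file
fid = fopen(sample_file, 'w');
fprintf(fid, '%s\n', strjoin(sample_lines, '\n'));
fclose(fid);

% Open the file, read it through its file ID, then close it.
fid = fopen(sample_file, 'r');
if fid == -1
    error('Could not open %s', sample_file);
end
% Keep the first two numeric fields and discard the rest of each line.
data = textscan(fid, '%f %f %*[^\n]', 'HeaderLines', 8, 'Delimiter', ';');
fclose(fid);

% textscan returns a cell array with one cell per retained field.
second_column = data{2};
delete(sample_file);                      % clean up the sample file
```

Note that the result is indexed with braces (`data{2}`), because `textscan` returns a cell array rather than a numeric matrix.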
Thomas Côté
on 7 Jun 2019
What's the issue in following the other poster's sage advice? Here's the beginning of the attached file (I inserted the line numbers)...
1
2 Integration time [ms]: 0.030
3 Averaging Nr. [scans]: 1
4 Smoothing Nr. [pixels]: 0
5 Data measured with spectrometer [name]: 1903395U1
6 Wave ;Sample ;Dark ;Reference;Scope
7 [nm] ;[counts] ;[counts] ;[counts]
8
9 189.95; 383.425; 0.000; 0.000
10 190.09; 416.425; 0.000; 0.000
11 190.24; 439.425; 0.000; 0.000
....
from which it's pretty easy to see there are 8 header lines. As noted, the delimiter is a semicolon, so what's the problem with
data = textscan(file_paths{i}, '','headerlines',8,'delimiter',';');
Seems pretty straightforward.
If the number of header lines varies between files, you may have a more difficult problem, but detectImportOptions will easily parse a file as regular as this one, and you can then use readtable instead. Alternatively, importdata will likely handle it without trouble through an even simpler interface.
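A rough sketch of the detectImportOptions plus readtable route, assuming 8 header lines and a semicolon delimiter as stated in the question. The placeholder header text, the sample rows, and the variable names are made up for illustration, and the option names used here should be checked against the documentation for your MATLAB release.

```matlab
% Write a tiny sample file: 8 placeholder header lines, then data rows.
sample_lines = [repmat({'header line'}, 1, 8), ...
    {'189.95; 383.425; 0.000; 0.000', '190.09; 416.425; 0.000; 0.000'}];
sample_file = [tempname '.txt'];          % hypothetical temporary file
fid = fopen(sample_file, 'w');
fprintf(fid, '%s\n', strjoin(sample_lines, '\n'));
fclose(fid);

% State the delimiter and header length explicitly, then let
% detectImportOptions work out the remaining import settings.
opts = detectImportOptions(sample_file, 'FileType', 'text', ...
    'Delimiter', ';', 'NumHeaderLines', 8);
T = readtable(sample_file, opts, 'ReadVariableNames', false);

% Brace indexing extracts the second variable as a numeric column.
second_column = T{:, 2};
delete(sample_file);                      % clean up the sample file
```

Inspecting `opts` before calling `readtable` is a quick way to confirm that the detected data lines and variable types match the file.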
The question of how the files are named is something else entirely: you'll have to either build a wildcard string that matches the subset you want, build a list manually, or select the files in some other way on a case-by-case basis. MATLAB is smart, but it's not prescient enough to discern automagically which files are wanted and which are not. As that other respondent noted, his solution works: move the wanted files into their own subdirectory.
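Following the suggestion to keep the wanted files in their own subdirectory, the whole job could be sketched as follows. This is an illustration under stated assumptions: the temporary folder, the file names, and the generated data are hypothetical, each file has 8 header lines, and every file has the same number of rows (4 here, 2048 in the question).

```matlab
% Create a dedicated folder holding three sample files with arbitrary names.
input_folder = tempname;                  % hypothetical temporary folder
mkdir(input_folder);
names  = {'run_a.txt', 'xk42.txt', 'final.txt'};   % hypothetical names
header = strjoin(repmat({'header line'}, 1, 8), '\n');
wave   = 189.95 + 0.14 * (0:3);           % illustrative wavelengths
for k = 1:numel(names)
    fid = fopen(fullfile(input_folder, names{k}), 'w');
    fprintf(fid, '%s\n', header);
    % Each column of the matrix becomes one row: wavelength; counts; 0; 0.
    fprintf(fid, '%.2f; %.3f; 0.000; 0.000\n', [wave; k * (1:4)]);
    fclose(fid);
end

% Select every .txt file in the folder, whatever its name.
files  = dir(fullfile(input_folder, '*.txt'));
n_rows = 4;                               % 2048 for the files in the question
A = zeros(n_rows, numel(files));          % preallocate the result matrix
for i = 1:numel(files)
    fid = fopen(fullfile(input_folder, files(i).name), 'r');
    data = textscan(fid, '%f %f %*[^\n]', 'HeaderLines', 8, 'Delimiter', ';');
    fclose(fid);
    A(:, i) = data{2};                    % second column of the i-th file
end

rmdir(input_folder, 's');                 % remove the sample folder
```

The columns of `A` follow the order in which `dir` lists the files, so `{files.name}` is worth keeping alongside `A` if you need to know which column came from which file.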
Thomas Côté
on 10 Jun 2019
Bob Thompson
on 10 Jun 2019
Not having a semicolon at the end shouldn't be a problem. I suspect the issue is that there is no semicolon after the row number, so it is trying to treat '9 189.95' as a single number. I think you would be better off using a formatSpec instead of specifying a delimiter. Something like:
format = '%2.0d %5.2f; %5.2f; %5.2f';
data = textscan(file_paths{i}, format, 'headerlines', 8);