How to skip lines that start with a certain character while reading a text file

I have a text file with two columns and a certain number of rows. The data blocks are separated by a text line that starts with #. How can I load only the data, leaving out the # line?
  4 Comments
christian_00 on 18 Jun 2024
I'm sorry, this is my first time here. I put it on the question page under the "Release" option, but maybe others can't see it.
dpb on 18 Jun 2024
Hmmmm... I don't see the release information in the question. I do use the compact format, but I'd think it should still show it if the user specified it. I'll have to open another window and see if the alternate... oh! I see; it's over there in the right-hand column with all that other stuff I never pay attention to, not part of the question itself. I'll have to try to remember to go look, but nobody else caught it either, including the MathWorks employee, so it clearly isn't in the most suitable location.
Anyway, did you see my follow-up Answer given the release? readtable should solve your problem as a one-liner.


Accepted Answer

dpb on 18 Jun 2024
Given the new information that you are on R2018a, which predates the functions used in the answers initially given (readmatrix, readlines), the easiest high-level tool will be readtable; it goes back to R2013b.
tData=readtable('yourfile.txt','CommentStyle','#');
Alternatively, as mentioned in earlier sidebar conversation, reverting to the venerable textread would probably be my second choice even though it is now deprecated.
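As a minimal sketch of the readtable route, the snippet below writes a small two-column sample file (its contents, the temporary file name, and the variable names are made up for illustration), reads it while skipping the # line, and converts the table to a plain numeric array. The explicit 'Delimiter' and 'ReadVariableNames' settings are assumptions about the file layout, not something stated in the question.

```matlab
% Create a small sample file: two numeric columns split by a '#' line.
% (Hypothetical contents, for illustration only.)
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '1 2\n3 4\n# second block\n5 6\n7 8\n');
fclose(fid);

% Read the file, treating everything from '#' to end of line as a comment.
% Assumes space-delimited data with no header row.
tData = readtable(fname, 'CommentStyle', '#', ...
                  'Delimiter', ' ', 'ReadVariableNames', false);

% Convert the table to an ordinary numeric matrix.
data = table2array(tData);

delete(fname);  % clean up the sample file
```

If a table is all that is needed, the table2array step can simply be dropped.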

More Answers (4)

dpb on 18 Jun 2024
@Taylor's solution will work, but leaves you with the need to convert the string data to numeric values to use it. For a direct solution, try
data=readmatrix('yourfile.txt','CommentStyle','#');
See the readmatrix documentation for details. readtable supports the same option if a table is desired instead of an array, which is particularly useful if the file has variable names as its first record.
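A short sketch of the readmatrix call on a comma-separated sample follows. The file contents and the temporary file name are invented for illustration, and the 'Delimiter' setting is an assumption about the layout. Note that readmatrix was introduced in R2019a, so this variant needs a newer release than R2018a.

```matlab
% Sample file with '#' lines before and between the data rows.
% (Hypothetical contents, for illustration only.)
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '# block A\n10,20\n30,40\n# block B\n50,60\n');
fclose(fid);

% Lines starting with '#' are treated as comments and skipped,
% so the result is a numeric array with no conversion step needed.
data = readmatrix(fname, 'CommentStyle', '#', 'Delimiter', ',');

delete(fname);  % clean up the sample file
```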

Taylor on 18 Jun 2024
I would just load the data as a string and use the erase function to remove the "#"
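Note that erase on its own only deletes the "#" characters and leaves the rest of the comment line in place. A possible string-based variant that drops whole comment lines is sketched below; it assumes every data line holds the same number of whitespace-separated numeric values, and the file contents and variable names are made up for illustration. It relies only on fileread, splitlines, startsWith, split, and str2double, so it should not need anything newer than R2018a.

```matlab
% Sample file (hypothetical contents, for illustration only).
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '1 2\n3 4\n# separator line\n5 6\n');
fclose(fid);

% Load the whole file as text and break it into trimmed lines.
txt   = fileread(fname);
lines = strtrim(splitlines(string(txt)));

% Drop comment lines and empty lines (e.g. after a trailing newline).
lines(startsWith(lines, '#') | strlength(lines) == 0) = [];

% Split each remaining line at whitespace and convert to numbers.
% Assumes at least two data lines with an equal number of values each.
data = str2double(split(lines));

delete(fname);  % clean up the sample file
```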
  6 Comments
Taylor on 18 Jun 2024
@dpb Update from development on the readlines function: "The other functions mentioned are "formatted" text functions that expect some structure in the data. readlines is meant simply to read the lines in the file. Its interface is kept minimal on purpose."
dpb on 18 Jun 2024
Edited: dpb on 20 Jun 2024
That makes no sense at all to me... "make things as simple as possible, but not too simple".
I suggest the choice should be the user's, rather than the developer deciding they shouldn't need it, and that the request for the additional option be retained.
While I'll agree that not all the options available in the other members of the family are applicable to the purpose of readlines, I will argue to the end that skipping whole lines based on comment style is line-reading functionality (as the subject question illustrates) and should be available.



Image Analyst on 18 Jun 2024
Try readlines to get each line as an element of a string array. Then loop over all lines, skipping the ones that start with #:
fprintf('Beginning to run %s.m at %s...\n', mfilename, datetime('now','TimeZone','local','Format','HH:mm:ss'));
allLines = readlines('Data3.txt'); % Read whole file into a string array, one element per line.
for k = 1 : numel(allLines)
    thisLine = strtrim(allLines{k}); % Strip leading and trailing white space, in case there is any.
    if startsWith(thisLine, '#')
        % Skip lines starting with #
        fprintf('Skipping %s\n', thisLine);
    else
        % Process lines NOT starting with #
        fprintf('Processing %s\n', thisLine);
    end
end
fprintf('Done running %s.m at %s...\n', mfilename, datetime('now','TimeZone','local','Format','HH:mm:ss'));
  2 Comments
dpb on 21 Jun 2024
Given the lack of the obvious feature to omit comment lines in readlines, the above could be somewhat abbreviated:
fprintf('Beginning to run %s.m at %s...\n', mfilename, datetime('now','TimeZone','local','Format','HH:mm:ss'));
allLines = strtrim(readlines('Data3.txt')); % read file, trim lines
allLines(startsWith(allLines,'#')) = []; % remove comment lines
for k = 1 : numel(allLines) % iterate over the remainder
    % Process lines NOT starting with #
    fprintf('Processing %s\n', allLines(k));
end
fprintf('Done running %s.m at %s...\n', mfilename, datetime('now','TimeZone','local','Format','HH:mm:ss'));



Image Analyst on 18 Jun 2024
Try this:
% File to read. The name is an assumption; change it to your own file.
fullFileName = 'Data3.txt';
% Open the file for reading in text mode.
fileID = fopen(fullFileName, 'rt');
% Read the first line of the file. fgetl returns -1 (not char) at end of file.
textLine = fgetl(fileID);
lineCounter = 0;
while ischar(textLine)
    lineCounter = lineCounter + 1;
    textLine = strtrim(textLine); % Strip off white space.
    %fprintf('Read %s\n', textLine);
    if startsWith(textLine, '#')
        % Skip lines starting with #
        fprintf('Skipping %s\n', textLine);
    else
        % Process lines NOT starting with #
        fprintf('Processing %s\n', textLine);
    end
    % Read the next line (once per iteration, so no lines are skipped by accident).
    textLine = fgetl(fileID);
end
% All done reading all lines, so close the file.
fclose(fileID);
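The loop above only prints each line. A minimal sketch of how the "process" branch could collect the numbers follows, assuming exactly two numeric values per data line; the sample file contents and the names fname and data are made up for illustration. It uses only fopen, fgetl, startsWith, and sscanf, so it should run on R2018a.

```matlab
% Sample file (hypothetical contents, for illustration only).
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '1.5 2\n# comment line\n  3 4.25\n');
fclose(fid);

fid = fopen(fname, 'rt');
data = zeros(0, 2);              % one row is appended per data line
textLine = fgetl(fid);           % fgetl returns -1 at end of file
while ischar(textLine)
    textLine = strtrim(textLine);
    if ~isempty(textLine) && ~startsWith(textLine, '#')
        % Parse two floating-point values from the line into a 1x2 row.
        data(end+1, :) = sscanf(textLine, '%f', [1 2]); %#ok<AGROW>
    end
    textLine = fgetl(fid);
end
fclose(fid);

delete(fname);  % clean up the sample file
```

Growing the array row by row is fine for small files; for large ones, readtable with 'CommentStyle' is likely the better choice.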

Release

R2018a
