Read file with non-uniform lines?

Question

bene1 on 25 Oct 2020

https://de.mathworks.com/matlabcentral/answers/625818-read-file-with-non-uniform-lines

Commented: bene1 on 27 Oct 2020

Hi. I'm a Matlab newbie. I would like to read in a file where the lines have different formats, as below.

% Coordinates
%   Code    ID      X         Y
    C       101     0.001     0.001
    C       102     1.002     0.002
    C       103     1.003     1.003
    C       104     0.004     1.004
% Distances
%   Code    ID      From      To      Dist
    D       201     101       103     1.417
    D       202     102       104     1.414

If the first character is C, use...

A = textscan(fid,'%c %d %f %f')

If the first character is D, use...

A = textscan(fid,'%c %d %d %d %f')

After, I'd like to assign the data to structs (c.id, c.x, c.y, d.id, d.from, d.to, d.dist), but first I think I just need to get it scanned in. Is it possible to apply some logic to reading the file? Thank you.

5 Comments

Walter Roberson on 26 Oct 2020

'^\s*C.*$', 'dotexceptnewline', 'lineanchors'

or

'(?<=(^|\n))\s*C[^\n]*'

with no additional options needed
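
As a rough illustration of the first pattern, the sketch below runs it on an in-memory copy of part of the sample data rather than on a file. The names `txt` and `Clines` are made up for this example, and `'match'` is added so that `regexp` returns the matched lines.

```matlab
% In-memory stand-in for the file contents (rows copied from the question).
txt = strjoin({ ...
    '% Coordinates', ...
    '%   Code    ID      X         Y', ...
    '    C       101     0.001     0.001', ...
    '    C       102     1.002     0.002', ...
    '% Distances', ...
    '%   Code    ID      From      To      Dist', ...
    '    D       201     101       103     1.417'}, newline);

% 'lineanchors' lets ^ and $ match at every line, not just at the ends of the text;
% 'dotexceptnewline' keeps .* from running past the end of the line.
Clines = regexp(txt, '^\s*C.*$', 'match', 'dotexceptnewline', 'lineanchors');
```

`Clines` comes back as a row cell array with one character vector per C line. If the file uses CR/LF line endings, the carriage return stays at the end of each match.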

bene1 on 26 Oct 2020

Great, thanks again. Now have...

C =
  4×1 cell array
    {'    C       101     0.001     0.001←'}
    {'    C       102     1.002     0.002←'}
    {'    C       103     1.003     1.003←'}
    {'    C       104     0.004     1.004←'}

With C as a 4x1, I believe my next step is to extract out the columns. My first thought was

A = textscan(C,'%c %d %f %f')

but I see I can't do that. Looking into cell2struct?
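
One possible way around this, sketched under the assumption that `C` is the cell array of character vectors shown above: `textscan` does accept a single character vector, so the cells can be joined into one first. The format here uses `%s` rather than `%c` for the code column, and `Ctrim` and `joined` are names made up for the example.

```matlab
% Stand-in for the 4x1 cell array above; \r mimics the trailing carriage return.
C = {sprintf('    C       101     0.001     0.001\r'); ...
     sprintf('    C       102     1.002     0.002\r'); ...
     sprintf('    C       103     1.003     1.003\r'); ...
     sprintf('    C       104     0.004     1.004\r')};

% Trim surrounding whitespace (including \r), then join the rows with newlines.
Ctrim  = strtrim(C);
joined = strjoin(Ctrim(:).', newline);

% textscan parses the character vector and returns one cell per format field.
A   = textscan(joined, '%s %d %f %f');
ids = A{2};   % int32 column vector
x   = A{3};   % double column vector
y   = A{4};   % double column vector
```

This works on lines that have already been extracted. The named-token approach in the accepted answer skips that intermediate step.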

Answer 1

Walter Roberson on 26 Oct 2020

https://de.mathworks.com/matlabcentral/answers/625818-read-file-with-non-uniform-lines#answer_524468

Named tokens, I said. Do not extract the lines ahead of time.

FileText = fileread(YourFileName);
Ctokens = regexp(FileText, '^\s*C\s+(?<ID>\d+)\s+(?<X>\S+)\s+(?<Y>\S+)', 'names', 'lineanchors');
%Ctokens will now be a struct array with field names ID, X, and Y, each of which are character vectors.
C.ID = str2double({Ctokens.ID});
C.X = str2double({Ctokens.X});
C.Y = str2double({Ctokens.Y});
Dtokens = regexp(FileText, '^\s*D\s+(?<ID>\d+)\s+(?<From>\d+)\s+(?<To>\d+)\s+(?<Dist>\S+)', 'names', 'lineanchors');
%Dtokens will now be a struct array with field names ID, From, To, Dist, each of which are character vectors.
D.ID = str2double({Dtokens.ID});
D.From = str2double({Dtokens.From});
D.To = str2double({Dtokens.To});
D.Dist = str2double({Dtokens.Dist});

The amount of processing work is pretty minimal. Pretty much all of the effort is in figuring out the proper regexp patterns to use, which can be tricky when there are variant lines.
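
Since every named token gets the same `str2double` treatment, the per-field assignments could also be written as a loop over the field names. This is only a sketch on an in-memory copy of the sample data: `FileText` stands in for the `fileread` result, and `names` and `k` are made up for the example.

```matlab
% In-memory stand-in for fileread(YourFileName); rows copied from the question.
FileText = strjoin({ ...
    '% Coordinates', ...
    '%   Code    ID      X         Y', ...
    '    C       101     0.001     0.001', ...
    '    C       102     1.002     0.002', ...
    '    C       103     1.003     1.003', ...
    '    C       104     0.004     1.004', ...
    '% Distances', ...
    '%   Code    ID      From      To      Dist', ...
    '    D       201     101       103     1.417', ...
    '    D       202     102       104     1.414'}, newline);

Dtokens = regexp(FileText, ...
    '^\s*D\s+(?<ID>\d+)\s+(?<From>\d+)\s+(?<To>\d+)\s+(?<Dist>\S+)', ...
    'names', 'lineanchors');

% Convert every named token to a numeric row vector, one field at a time.
D = struct();
names = fieldnames(Dtokens);
for k = 1:numel(names)
    D.(names{k}) = str2double({Dtokens.(names{k})});
end
```

The same loop would apply unchanged to `Ctokens`, because it only depends on the field names that the pattern defines.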

1 Comment

bene1 on 27 Oct 2020

Cool, thank you kindly!

Answer 2

per isakson on 26 Oct 2020

https://de.mathworks.com/matlabcentral/answers/625818-read-file-with-non-uniform-lines#answer_524413

>> S = cssm( 'd:\m\cssm\cssm.txt' )
S = 
  1×2 struct array with fields:
    header
    colhead
    Code
    data
>> S(1)
ans = 
  struct with fields:
     header: "Coordinates"
    colhead: ["Code"    "ID"    "X"    "Y"]
       Code: [4×1 string]
       data: [4×3 double]
>> S(2)
ans = 
  struct with fields:
     header: "Distances"
    colhead: ["Code"    "ID"    "From"    "To"    "Dist"]
       Code: [2×1 string]
       data: [2×4 double]

where

function    sas = cssm( ffs )
    
    chr = fileread( ffs );
    str = string( chr );
    str = replace( str, char([13,10]), newline );   % get rid of the carriage return
   
    % split the string into blocks. Use the block header as delimiter. 
    [blk,del] = strsplit( str, '(?m)^\x20*%\x20\w+\x20*\n'  ...      
                        , 'DelimiterType','RegularExpression' );
                    
    blk(1) = [];  % remove empty block before the first delimiter                    
    
    len = numel( del );
    sas(1,len) = struct( 'header',"", 'colhead',"", 'Code',"", 'data',nan );
    
    for jj = 1 : len    % loop over all blocks
        
        sas(jj).header = regexp( del(jj), '\w+', 'match','once' );  % match the name
        
        cac = textscan( blk(jj), "%[^\n]", 1 ); % read the first row
        tmp = strsplit( string(cac{1}) );       % split the row into column headers
        tmp(1) = [];                            % remove the comment character, "%"
        sas(jj).colhead = tmp;
        
        cac = textscan( blk(jj), ['%s',repmat('%f',1,numel(tmp)-1)] ...
                    ,   'Headerlines',1, 'CollectOutput',true );
        sas(jj).Code = string(cac{1});
        sas(jj).data = cac{2};
    end
end

and where cssm.txt contains the data given in your question.
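
If a table is more convenient than the raw `data` matrix, one option is to pair `data` with the column headers. The sketch below rebuilds `S(1)` by hand from the output printed above instead of calling `cssm`, so it runs on its own. `S1` and `T` are names made up here, and the `Code` values are assumed to be "C".

```matlab
% Hand-built stand-in for S(1) as displayed above (Code values are assumed).
S1.header  = "Coordinates";
S1.colhead = ["Code" "ID" "X" "Y"];
S1.Code    = ["C"; "C"; "C"; "C"];
S1.data    = [101 0.001 0.001
              102 1.002 0.002
              103 1.003 1.003
              104 0.004 1.004];

% The first column header belongs to Code; the rest label the columns of data.
T = array2table(S1.data, 'VariableNames', cellstr(S1.colhead(2:end)));
T.Code = S1.Code;
```

Columns can then be addressed by name, for example `T.X` or `T(T.ID == 103, :)`.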

1 Comment

bene1 on 27 Oct 2020

Thank you for the idea. :-)
