Extracting specific repeating lines of text after a heading using fgetl and textscan

Question

Vincent Scalfani on 19 Jul 2016

Direct link to this question

https://de.mathworks.com/matlabcentral/answers/296250-extracting-specific-repeating-lines-of-text-after-a-heading-using-fgetl-and-textscan

Commented: Vincent Scalfani on 21 Jul 2016

Here is an example of the data I am working with. I would like to extract the line directly following each KEY tag. The files have many thousands of these, so I need to create a loop with textscan or something similar.

> <NAME>
mary
> <AGE>
30
> <KEY>
RDHQFKQIGNG
> <NAME>
john
> <AGE>
56
> <KEY>
JFJNNFNFKFNN

Desired result:

RDHQFKQIGNG
JFJNNFNFKFNN

Here is where I am so far (adapted from a similar earlier question). The code does not seem to move the cursor through the file: it works for the first KEY line and then grabs all of the data after it, instead of just the line following each KEY line.

f = fopen('data.txt', 'rt'); 
tline = fgetl(f);
while isempty(strfind(tline, '> <KEY>'))
    if tline == -1 
        break;
    end
    tline = fgetl(f);
end
if tline ~= -1
    data = textscan(f,'%s','Delimiter','\r\n');
else
    disp('not found');
end
fclose(f);

Thanks!

Answer 1 (accepted)

Stephen23 on 19 Jul 2016

Direct link to this answer

https://de.mathworks.com/matlabcentral/answers/296250-extracting-specific-repeating-lines-of-text-after-a-heading-using-fgetl-and-textscan#answer_229053

temp1.txt

>> str = fileread('temp1.txt');
>> C = regexp(str,'(?<=> <KEY>\s+)\S+','match')
C = 
  'RDHQFKQIGNG'    'JFJNNFNFKFNN'

Tested on the attached file, temp1.txt.
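
For anyone reading without the attachment, here is a minimal sketch that rebuilds the sample data from the question in a temporary file and then runs the same regexp call. The temporary path from tempname and the variable name fname are assumptions made for this sketch only; they are not part of the original answer.

```matlab
% Write the sample records from the question to a throwaway file.
% (fname is a hypothetical temporary path, used only for this sketch.)
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '%s\n', '> <NAME>', 'mary', '> <AGE>', '30', '> <KEY>', 'RDHQFKQIGNG', ...
                     '> <NAME>', 'john', '> <AGE>', '56', '> <KEY>', 'JFJNNFNFKFNN');
fclose(fid);

% Read the whole file into one char vector, then match the first run of
% non-whitespace characters that comes after each '> <KEY>' tag.
% The lookbehind (?<=...) checks for the tag without including it in the match.
str = fileread(fname);
C = regexp(str, '(?<=> <KEY>\s+)\S+', 'match');

delete(fname);   % clean up the temporary file
disp(C)
```

Because the whole file is read into memory at once, this should suit files that fit comfortably in RAM; very large files may call for reading in chunks instead.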

3 Comments

Stephen23 on 20 Jul 2016

temp1.m

Try this:

  E = regexp(str,'^> <KEY>\s+\S+','match','lineanchors');
  E = strtrim(strrep(E,'> <KEY>',''));

And have a play with the attached script, temp1.m.
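
A self-contained sketch of the two-step approach above, run on the sample data from the question held in memory rather than read from a file. The variable rec and the sprintf-built string are assumptions for illustration; in the thread, str comes from fileread.

```matlab
% Sample records from the question, joined with newlines into one char vector.
% (rec is a hypothetical helper variable for this sketch.)
rec = {'> <NAME>', 'mary', '> <AGE>', '30', '> <KEY>', 'RDHQFKQIGNG', ...
       '> <NAME>', 'john', '> <AGE>', '56', '> <KEY>', 'JFJNNFNFKFNN'};
str = sprintf('%s\n', rec{:});

% Step 1: match each tag line together with the token on the following line.
% 'lineanchors' lets ^ match at the start of every line, not only at the
% start of the whole text, so only lines beginning with the tag qualify.
E = regexp(str, '^> <KEY>\s+\S+', 'match', 'lineanchors');

% Step 2: remove the tag text, then trim the newline left in front of the value.
E = strtrim(strrep(E, '> <KEY>', ''));
disp(E)
```

Unlike the lookbehind pattern in the answer, this version matches the tag itself and strips it afterwards, so the pattern needs no lookaround.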

Vincent Scalfani on 21 Jul 2016

Amazing!!! PERFECT. It took 1 second to process over 4 million lines of text. Thanks so much for your time.
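
For completeness, the fgetl approach from the question can also be made to work. The sketch below is one possible restructuring, not code from the thread: it keeps reading until end of file and, each time a KEY tag is found, takes only the next line instead of handing the rest of the file to textscan. The names fname and vals, and the temporary test file, are assumptions for illustration.

```matlab
% Build a small test file from the sample in the question (throwaway path).
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '%s\n', '> <NAME>', 'mary', '> <AGE>', '30', '> <KEY>', 'RDHQFKQIGNG', ...
                     '> <NAME>', 'john', '> <AGE>', '56', '> <KEY>', 'JFJNNFNFKFNN');
fclose(fid);

vals = {};                               % collected values, one per KEY tag
fid = fopen(fname, 'rt');
tline = fgetl(fid);
while ischar(tline)                      % fgetl returns numeric -1 at end of file
    if strncmp(tline, '> <KEY>', 7)      % line starts with the 7-character tag
        val = fgetl(fid);                % the value is the very next line
        if ischar(val)
            vals{end+1} = strtrim(val);  %#ok<AGROW>
        end
    end
    tline = fgetl(fid);                  % always advance to the next line
end
fclose(fid);
delete(fname);                           % clean up the temporary file
disp(vals)
```

Reading line by line and growing a cell array inside the loop will probably be much slower than the regexp solutions on files with millions of lines, so treat this mainly as an explanation of what the original loop was missing.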
