Extracting information from file

Question


nohup.txt

I am trying to extract information from the attached file and write it into a matrix with one column each for sample name, number of cells, and porosity. I have tried textscan and sscanf, but I am not sure how to search the structure of the text.


Julia on 8 Mar 2016

Edited: per isakson on 8 Mar 2016

Yes, cutTDM400_111_111_000_000_000 is a sample name (and I want to extract the Num of cells and Porosity).
The beginning of a cell is indicated by e.g. "Running sample cutTDM400_111_111_000_000_000"; after that it is solved three times, but the Num of cells and porosity value are the same.
So the next row would hold the values for cutTDM200_111_111_111_000_000.

per isakson on 8 Mar 2016

Edited: per isakson on 8 Mar 2016

Now, I think I understand. The word "cells" in "Num of cells" has nothing to do with the word "cell" in "beginning of a cell is indicated".

First, I thought you wanted to count some kind of "sections" of the file. I missed "Num of cells = 32000000" when I first browsed the file.


Answer 1

per isakson on 8 Mar 2016

Edited: per isakson on 8 Mar 2016

This is one way to read your file

>> tic, sas = nohup, toc
sas = 
1x1173 struct array with fields:
    SampleName
    NumOfCells
    Porosity
Elapsed time is 27.000338 seconds.
>> ix = find( strcmp( {sas.SampleName}, 'cutTDM050_111_121_221_222_122' ) )
ix =
   583
>> sas(ix).Porosity
ans =
    0.0828    0.0828    0.0828
>> sas(ix).NumOfCells
ans =
      125000      125000      125000

where (in one m-file)

function    sas = nohup
%%  Read the whole log file into one character vector
    str = fileread( 'nohup.txt' ); 
%%  Split the text into one block per sample
    heading_string  = 'Running Sample';
    trailing_string = '=============================================='; 
    % Lazy match of everything between the heading and the next separator line
    xpr = sprintf( '(?<=%s).+?(?=%s)', heading_string, trailing_string );
    cac = regexp( str, xpr, 'match' );
%%  Preallocate the struct array, then parse each block
    sas = struct( 'SampleName',repmat({''},[1,length(cac)]) ...
                , 'NumOfCells',{[]}, 'Porosity', {[]}       );
    for jj = 1 : length( cac )
        sas(jj) = nohup_( cac{jj} ); 
    end
end
function    sas = nohup_( str )
    % Parse the text block of one sample
    sas.SampleName ... 
    =   regexp( str, 'cutTDM\d{3}_\d{3}_\d{3}_\d{3}_\d{3}_\d{3}', 'match', 'once' );
    % All integers that follow "Num of cells ="
    cac = regexp( str, '(?<=Num of cells +\= *)\d+', 'match' ); 
    sas.NumOfCells = str2double( cac );
    % All numbers that follow "Porosity ="
    cac = regexp( str, '(?<=Porosity +\= *)[\d+\.]+', 'match' ); 
    sas.Porosity = str2double( cac );
end


Comments:

The function is slow. Nearly all the time is spent in regexp searching for "Num of cells" and "Porosity". The fact that "the Num of cells and porosity value are the same." within each sample can be used to improve speed: adding 'once' to these two calls of regexp makes the function about forty times faster (27.0 s versus 0.65 s in the runs shown here). That is much more than I anticipated, and I don't understand it; I cannot see what is taking all the extra time.
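A sketch of what the modified helper might look like with 'once' added to the two calls. Only the 'once' option and the variable name tok differ from the function above; that the log lines have the form "Num of cells = ..." and "Porosity = ..." is taken from the patterns, not checked against the file.

```matlab
function    sas = nohup_( str )
    % Variant with 'once': each regexp call stops at the first match in the
    % block, so every field holds one value instead of three.
    sas.SampleName ...
    =   regexp( str, 'cutTDM\d{3}_\d{3}_\d{3}_\d{3}_\d{3}_\d{3}', 'match', 'once' );
    % With 'once', regexp returns a character vector rather than a cell array
    tok = regexp( str, '(?<=Num of cells +\= *)\d+', 'match', 'once' );
    sas.NumOfCells = str2double( tok );   % NaN if nothing matched
    %
    tok = regexp( str, '(?<=Porosity +\= *)[\d+\.]+', 'match', 'once' );
    sas.Porosity = str2double( tok );     % NaN if nothing matched
end
```

The calling function nohup stays unchanged.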

>> tic, sas = nohup, toc
sas = 
1x1173 struct array with fields:
    SampleName
    NumOfCells
    Porosity
Elapsed time is 0.645206 seconds.
>> ix = find( strcmp( {sas.SampleName}, 'cutTDM050_111_121_221_222_122' ) )
ix =
   583
>> sas(ix).Porosity
ans =
    0.0828
>> sas(ix).NumOfCells
ans =
      125000
>>
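The question asked for one column each for sample name, number of cells and porosity. One possible way to get there from the struct array is a table, sketched here under the assumption that each field holds a single value per sample, as in the 'once' output above. The small sas below is a stand-in for the result of nohup; its second entry is hypothetical and uses NaN for values not shown in this thread.

```matlab
% Stand-in for "sas = nohup;" so that the snippet runs on its own
sas = struct( 'SampleName', {'cutTDM050_111_121_221_222_122', 'cutTDM400_111_111_000_000_000'} ...
            , 'NumOfCells', {125000, NaN}, 'Porosity', {0.0828, NaN} );

% Collect each field of the struct array into one column
names    = reshape( { sas.SampleName }, [], 1 );   % N-by-1 cell array of names
numCells = vertcat( sas.NumOfCells );              % N-by-1 double
porosity = vertcat( sas.Porosity );                % N-by-1 double

% A table keeps the text column and the numeric columns together
T = table( names, numCells, porosity ...
         , 'VariableNames', {'SampleName','NumOfCells','Porosity'} );
```

A cell array, e.g. [ names, num2cell(numCells), num2cell(porosity) ], would be an alternative where table is not available.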

Julia on 8 Mar 2016

Great, thank you very much.
