How to import Text File with 2 different Delimiters (how to organize header data and numeric data)
I want to import a text file. This contains a header (with space as delimiter) and data (tab delimited).
The txt-file looks like this:
FORMAT TAB_DELIMITED 
NUM_HEADER_BLOCKS 162 
NUM_PARAMS 646 
PT_COUNT.CND_1 3895 
FRAMES.CND_1 16 
FILE_TYPE TIME_HISTORY 
OPERATION RSP_TO_TAB 
DATA_TYPE ASCII_FLOATING_POINT 
DATE Fri Jun 23 11:20:24 2017 
DELTA_T 9.765625e-02 
TOTAL_T 3.803711e+02 
PTS_PER_FRAME 256 
PTS_PER_GROUP 256 
CHANNELS 120 
. 
. 
NUM_ZEROS 5 %end of header with line index 646
RfLongPositionFbk   RfLatPositionFbk     ...... %start of tab delimited area with the data (120 channels)
mm       mm 
-12.6182   -4.071238 
-12.6192   -4.070237 
-12.6182   -4.069237
- I want to find the line that contains "NUM_PARAMS" and read its numeric value, which tells me the size of the header section.
- After that I want to read the file up to line 646 into 2 columns (1st column: parameter name, 2nd column: value).
- Then I want to read the data (tab delimited, 120 channels). It would be nice if I could name the channels after the names shown in the line above the units of measurement.
 
I started by reading the full text file with the following code, to import the header and search for NUM_PARAMS:
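fid = fopen(fileName, 'r');                                       % textscan needs an open file identifier
s = textscan(fid, '%s%s', 'delimiter', ' ');                      % two columns: parameter name, value
fclose(fid);
idx_NUM_PARAMS = find(strcmp(s{1}, 'NUM_PARAMS'), 1, 'first');    % row of the NUM_PARAMS entry
NUM_PARAMSdbl = str2double(s{1,2}{idx_NUM_PARAMS,1});             % its value as a double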
However, this also imports the data section as strings, which is not usable because that section uses a different delimiter.
So I read the data in a second step:
 dataTable = readtable(fileName, 'Delimiter', '\t', 'headerLines',NUM_PARAMSdbl+4,'ReadVariableNames',true);
But I cannot name the table variables after the channel names; I only get the line right above the data (the units of measurement).
Thank you for any hints on how to solve this problem.
Answers (1)
Cedric on 1 Nov 2017 (edited by Cedric on 1 Nov 2017)
      You may not need to use header information for parsing your file. Look at this example (applied to data.txt attached):
content = fileread( 'data.txt' ) ;
% - Split header/data.
pos = strfind( content, 'RfLongPositionFbk' ) ;
header = strtrim( content(1:pos-1) ) ;
data   = content(pos:end) ;
% - Header -> struct with numeric values when possible.
header = regexp( header, '^(\S+)\s+([^\r\n]+)', 'tokens', 'lineanchors' ) ;
header = vertcat( header{:} ) ;
fNames = regexprep( header(:,1), '\W', '_' ) ;
values = strtrim( header(:,2) ) ;
buffer = str2double( values ) ;
isNum  = ~isnan( buffer ) ;
values(isNum) = num2cell( buffer(isNum) ) ;
header = cell2struct( values,fNames ) ;
% - Data -> num array.
data = cell2mat( textscan( data, '%f %f', 'headerlines', 2 )) ;
Running this, you get:
 >> header
 header = 
  struct with fields:
               FORMAT: 'TAB_DELIMITED'
    NUM_HEADER_BLOCKS: 162
           NUM_PARAMS: 646
       PT_COUNT_CND_1: 3895
         FRAMES_CND_1: 16
            FILE_TYPE: 'TIME_HISTORY'
            OPERATION: 'RSP_TO_TAB'
            DATA_TYPE: 'ASCII_FLOATING_POINT'
                 DATE: 'Fri Jun 23 11:20:24 2017'
              DELTA_T: 0.0977
              TOTAL_T: 380.3711
        PTS_PER_FRAME: 256
        PTS_PER_GROUP: 256
             CHANNELS: 120
            NUM_ZEROS: 5
 >> data
 data =
  -12.6182   -4.0712
  -12.6192   -4.0702
  -12.6182   -4.0692
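The format '%f %f' above matches the two-column sample. For the full file with 120 channels, the format could be built from the channel count instead. The following is only a minimal sketch under assumptions: dataText is a hypothetical stand-in for the data section, and nChannels would come from header.CHANNELS in the real file.

```matlab
% Hypothetical stand-in for the data section (tab delimited, two channels
% instead of 120, values taken from the sample in the question).
dataText = sprintf([ ...
    'RfLongPositionFbk\tRfLatPositionFbk\n', ...
    'mm\tmm\n', ...
    '-12.6182\t-4.071238\n', ...
    '-12.6192\t-4.070237\n', ...
    '-12.6182\t-4.069237\n']);
nChannels = 2;   % assumption: header.CHANNELS (120) in the real file

% One '%f' per channel, so the format no longer hard-codes two columns.
fmt = repmat('%f', 1, nChannels);

% Skip the channel-name line and the unit line, then read all numeric columns.
cols = textscan(dataText, fmt, 'HeaderLines', 2, 'Delimiter', '\t');
data = cell2mat(cols);   % samples-by-channels numeric array
```

With this, the same call should work for any channel count, as long as the header value agrees with the number of columns actually present in the file.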
7 Comments
Stephen23 on 3 Nov 2017
This is my current status:
content = fileread(fileName);
lineStarts = [0, strfind( content, sprintf('\n') )] + 1 ;                                         
numParams_header   = str2double( regexp( content, '(?<=NUM_PARAMS\s+)\S+', 'match', 'once' ));    
header = content(lineStarts(1):(lineStarts(numParams_header+1)-1));                               
channels   = content(lineStarts(numParams_header +3):(lineStarts(numParams_header +4)-1));        
units = content(lineStarts(numParams_header +4):(lineStarts(numParams_header +5)-1));           
data = content(lineStarts(numParams_header +6):end);
How can I convert the channels and units from a sequence of characters to a char array?
I am using MATLAB R2014a.
Cedric on 3 Nov 2017 (edited by Cedric on 3 Nov 2017)
The answer in my comment above does this already. But if you want to follow your current approach, you can use STRSPLIT to get cell arrays of channel names and units (and possibly STRTRIM beforehand, to get rid of the \r if STRSPLIT outputs an empty 121st cell).
For the data, I would do it this way:
 data = sscanf( data, '%f' ) ;                    % Long vector of all data.
 data = reshape( data, numel(channels), [] ).' ;  % Reshape into array.
where channels is a cell array of channel names (output of STRSPLIT).
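Putting the STRTRIM, STRSPLIT, SSCANF and RESHAPE steps together could look roughly like the sketch below. The three input strings are hypothetical stand-ins for the channels, units and data strings extracted from the file (two channels instead of 120, CRLF line endings assumed). The final conversion to a table via array2table is an addition that was not part of the comments above; it is one possible way to attach the channel names to the columns.

```matlab
% Hypothetical stand-ins for the strings cut out of the file content.
channels = sprintf('RfLongPositionFbk\tRfLatPositionFbk\r\n');
units    = sprintf('mm\tmm\r\n');
data     = sprintf('-12.6182\t-4.071238\r\n-12.6192\t-4.070237\r\n-12.6182\t-4.069237\r\n');

% STRTRIM removes the trailing \r\n so STRSPLIT does not return an extra cell.
tab      = sprintf('\t');
channels = strsplit(strtrim(channels), tab);   % 1-by-N cell array of names
units    = strsplit(strtrim(units),    tab);   % 1-by-N cell array of units

% Read all numbers into one long vector, then reshape to samples-by-channels.
values = sscanf(data, '%f');
values = reshape(values, numel(channels), []).';

% Optional (assumption, not from the thread): a table whose variables carry
% the channel names. REGEXPREP replaces characters that are not valid in
% MATLAB identifiers, as in the header example of the answer.
varNames = regexprep(channels, '\W', '_');
T = array2table(values, 'VariableNames', varNames);
T.Properties.VariableUnits = units;
```

A single channel can then be accessed by name, for example T.RfLongPositionFbk.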