If the file isn't huge (compared to available RAM and address space) and you have an idea of the maximum number of columns and rows, then I guess the simplest way is to loop over all rows.
M = nan( nrow, ncol );
fid = fopen( ...
str = fgetl( fid );   % read the first line (overwritten below, so it is skipped)
row = 0;
while not( feof( fid ) )
    row = row + 1;
    str = fgetl( fid );
    val = sscanf( str, '%f' );   % parse all numbers on the line
    M( row, 1:numel(val) ) = val;
end
And trim M. Something like this.
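To make the outline concrete, here is a minimal self-contained sketch that writes a tiny ragged file, reads it with the loop, and trims the result. The scratch file name, its contents, and the 10x10 size guess are assumptions made up for the illustration; they are not OP's data.

```matlab
% Write a tiny ragged test file: one header line, then rows of unequal length.
fname = [ tempname, '.txt' ];              % hypothetical scratch file
fid = fopen( fname, 'w' );
fprintf( fid, 'header\n1 2 3\n4 5\n6' );
fclose( fid );

% Read it back into a buffer that is deliberately larger than needed.
M = nan( 10, 10 );
fid = fopen( fname );
cup = onCleanup( @() fclose( fid ) );
[~] = fgetl( fid );                        % skip the header line
row = 0;
while not( feof( fid ) )
    row = row + 1;
    val = sscanf( fgetl( fid ), '%f' );
    M( row, 1:numel(val) ) = val;
end
clear cup                                  % closes the file

% Trim: drop the columns and rows that were never written to.
M( :, all( isnan( M ), 1 ) ) = [];
M( all( isnan( M ), 2 ), : ) = [];
delete( fname );
disp( M )
```

Note that trimming this way would also drop a row or column that is legitimately all NaN in the data, so it only suits files where every line holds at least one number.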
[Edit: 2013-01-16]
Working code
Here is a comparison between three solutions. The first two, cssm and cssm1, follow my outline above. The last, OP, is the one proposed by OP. I ran this script a few times.
clc
tic, M1 = cssm; toc
tic, M2 = cssm1( 10000, 100 ); toc
tic, M3 = cssm1( 100000, 1000 ); toc
tic, M4 = OP(); toc
which returns
Elapsed time is 0.238691 seconds.
Elapsed time is 0.131869 seconds.
Elapsed time is 0.960397 seconds.
Elapsed time is 0.709025 seconds.
The output is
>> whos
  Name      Size        Bytes   Class    Attributes
  M1        2464x21     413952  double
  M2        2464x21     413952  double
  M3        2464x21     413952  double
  M4        2464x21     413952  double
In cssm.m the required numbers of rows and columns are determined in two separate steps. Each step reads the file. Thus, the function, cssm, reads the file three times.
With cssm1 the numbers of rows and columns are guessed. In one case the "guesses" are roughly 4x the actual size and in the other roughly 40x.
The function, OP, is OP's code made into a function, with ZEROS replaced by NAN to honor the question.
With about 2500 rows, cssm is three times faster than the loop-free code (OP). cssm1 is about five times faster when allocating roughly 4x4 times more memory than needed, and a bit slower than the loop-free code when allocating roughly 40x40 times more memory.
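To put the two guesses in perspective, the arithmetic below estimates how much memory each preallocation asks for, assuming 8 bytes per double. The variable names are made up for the illustration; the sizes are the ones used in the timing script and reported by whos.

```matlab
% Sizes taken from the timing script and the whos output.
actual = [   2464,   21 ];
small  = [  10000,  100 ];   % first guess passed to cssm1
large  = [ 100000, 1000 ];   % second guess passed to cssm1

bytes  = @( sz ) prod( sz ) * 8;   % a double takes 8 bytes

fprintf( 'actual: %8.2f MB\n', bytes( actual ) / 1e6 );
fprintf( 'small : %8.2f MB (%.0fx the actual size)\n', ...
         bytes( small ) / 1e6, bytes( small ) / bytes( actual ) );
fprintf( 'large : %8.2f MB (%.0fx the actual size)\n', ...
         bytes( large ) / 1e6, bytes( large ) / bytes( actual ) );
```

The large guess asks for 800 MB to hold about 0.4 MB of data, which is probably why that run loses to the loop-free code.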
Conclusions:
- Loops are not always slow.
- Reading from the file cache is fast.
- Code with loops is often easier to write and understand (IMO).
- Don't hesitate to use the RAM if it is available.
The files involved are
function M = cssm()
    % Pass 1: count the data rows (the first line is skipped as a header)
    fid = fopen( 'cssm.txt' );
    cup = onCleanup( @() fclose( fid ) );
    cac = textscan( fid, '%s', 'Delimiter', '\n', 'HeaderLines', 1 );
    nrow = numel( cac{:} );
    clear cup                       % closes the file
    % Pass 2: find the largest number of values on any line
    fid = fopen( 'cssm.txt' );
    cup = onCleanup( @() fclose( fid ) );
    [~] = fgetl( fid );
    ncol = 0;
    while not( feof( fid ) )
        ncol = max( ncol, numel( sscanf( fgetl(fid), '%f' ) ) );
    end
    clear cup
    % Pass 3: read the data into a matrix of exactly the right size
    M = cssm_( nrow, ncol );
end
function M = cssm_( nrow, ncol )
    M = nan( nrow, ncol );          % short rows stay padded with NaN
    fid = fopen( 'cssm.txt' );
    cup = onCleanup( @() fclose( fid ) );
    [~] = fgetl( fid );             % skip the first line
    row = 0;
    while not( feof( fid ) )
        row = row + 1;
        val = sscanf( fgetl(fid), '%f' );
        M( row, 1:numel(val) ) = val;
    end
end
and
function M = cssm1( nrow, ncol )
    M = nan( nrow, ncol );          % nrow and ncol are guesses (upper bounds)
    fid = fopen( 'cssm.txt' );
    cup = onCleanup( @() fclose( fid ) );
    [~] = fgetl( fid );             % skip the first line
    row = 0;
    while not( feof( fid ) )
        row = row + 1;
        val = sscanf( fgetl(fid), '%f' );
        M( row, 1:numel(val) ) = val;
    end
    % Trim the columns and rows that were never written to
    M( :, all( isnan( M ), 1 ) ) = [];
    M( all( isnan( M ), 2 ), : ) = [];
end
The text file, cssm.txt, contains 2465 lines: repetitions of OP's data.