MATLAB Answers

Fastest Way to write data to a text file - fprintf

Brian on 2 Aug 2013
I am writing a lot of data to a text file one line at a time (1.7 million rows, 4 columns), and the columns are of different data types. I'm wondering whether there is a better way than writing one line at a time that would be much faster.
Here is what I'm doing now.
% ExpSymbols : char array, one symbol per row
% ExpDates   : numeric array
% MyFactor   : numeric array
% FctrName   : char array
ftemp = fopen('FileName','w');
for i = 1:length(MyFactor)
    fprintf(ftemp, '%s,%i,%f,%s\r\n', ExpSymbols(i,:), ExpDates(i,1), MyFactor(i,1), [FctrName '_ML']);
end
fclose(ftemp);
Thanks in advance,
Brian

Accepted Answer

Jan on 2 Aug 2013
You can try to suppress the automatic flushing by opening the file with the permission 'W' instead of 'w'. The constant factor name can also be moved into the format string:
ftemp = fopen('FileName', 'W'); % uppercase W
Fmt = ['%s,%i,%f,', FctrName, '_ML\r\n'];
for i = 1:length(MyFactor)
    fprintf(ftemp, Fmt, ExpSymbols(i,:), ExpDates(i), MyFactor(i));
end
fclose(ftemp);
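Whether 'W' actually helps depends on the machine and file system, so it is worth timing both modes. The following is only a rough sketch: the test data, the row count n, and the temporary file names are assumptions for illustration, not the original poster's data.

```matlab
% Hypothetical test data (assumed sizes and values, for illustration only)
n        = 1e5;
ExpDates = 20130801 + zeros(n,1);
MyFactor = rand(n,1);

modes = {'w','W'};              % 'W' = write without automatic flushing
t     = zeros(1,numel(modes));  % elapsed time per mode
for k = 1:numel(modes)
    fname = [tempname '.txt'];  % scratch file in the temp directory
    fid   = fopen(fname, modes{k});
    tic
    for i = 1:n
        fprintf(fid, '%i,%f\r\n', ExpDates(i), MyFactor(i));
    end
    fclose(fid);                % close inside the timing so buffered data is written
    t(k) = toc;
    fprintf('mode ''%s'': %.2f s\n', modes{k}, t(k));
    delete(fname);
end
```

Run it a few times before drawing conclusions; disk caching can make single measurements noisy.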
  9 Comments
dpb on 5 Aug 2013
A) Can you offload the formatting from this code to a second program that reads the .mat files and writes the formatted ones? That won't save any time overall, but it moves the work to a place where the bottleneck might be less noticeable. For example, a second background process could do the conversion while the primary analyses are done interactively. Whether that helps depends on the actual workflow, of course.
B) Can your target app read the variables sequentially, one after the other, instead of one whole record at a time as you're currently writing them? If so, you can write each variable without any loop at all, and it will likely be measurably faster, as Jan suggests.
C) You might also see how the text option of save compares for speed. I don't know that it'll help, but what the hey...
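Option B could look something like the sketch below, assuming the target application accepts one block per variable. The sample values and the temporary file name are made up for illustration. fprintf recycles the format for every element of a numeric vector, so the numeric columns need no loop; the char matrix gets a newline column appended and is transposed so that its rows come out in order.

```matlab
% Hypothetical sample data (assumed values, for illustration only)
ExpSymbols = ['IBM ';'MSFT';'AAPL'];   % char matrix, one symbol per row
ExpDates   = [20130801; 20130802; 20130805];
MyFactor   = [0.5; 1.25; 2];

n     = size(ExpSymbols,1);
fname = [tempname '.txt'];
fid   = fopen(fname,'W');

% Block 1: symbols. fprintf walks a char matrix column by column,
% so append a newline column and transpose before writing.
symText = [ExpSymbols repmat(char(10), n, 1)].';
fprintf(fid, '%s', symText);

% Blocks 2 and 3: the format is reused for each element of the vector.
fprintf(fid, '%i\n', ExpDates);
fprintf(fid, '%f\n', MyFactor);
fclose(fid);

txt = fileread(fname);                 % read back to inspect the result
```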


More Answers (1)

dpb on 2 Aug 2013
Edited: dpb on 3 Aug 2013
It's a pita for mixed fields--I don't know of any clean way to mix them in fprintf c
I generally build the string array internally then write the whole thing...
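n   = numel(dates);
cma = repmat(',',n,1);        % the delimiter column
eol = repmat(char(10),n,1);   % a newline column to end each record
% numeric columns must be converted to char before concatenating
out = [symb cma num2str(dates) cma num2str(factor,'%f') cma names eol];
fprintf(fid,'%s',out.');      % transpose: fprintf reads a char matrix column by column
fid = fclose(fid);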
names is a placeholder for the factor name, which I guess may be a constant. If so, it can be inserted into the format string as Jan assumed; if not, it needs to be built as a char column, the same way as the column of commas, and concatenated wherever it belongs.
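Another way to avoid the loop with mixed types, offered here as a sketch rather than as what either answer proposes, is to put the fields into a cell array with one record per column and let fprintf recycle the format over the expanded list. The sample values, the factor name 'Value', and the temporary file name are assumptions. Note that cellstr strips trailing blanks from each symbol, which the row-by-row version does not.

```matlab
% Hypothetical sample data (assumed values, for illustration only)
ExpSymbols = ['IBM ';'MSFT';'AAPL'];
ExpDates   = [20130801; 20130802; 20130805];
MyFactor   = [0.5; 1.25; 2];
FctrName   = 'Value';

% 3-by-n cell array: each column holds the fields of one record,
% so C{:} expands to symbol1, date1, factor1, symbol2, date2, ...
C   = [cellstr(ExpSymbols), num2cell(ExpDates), num2cell(MyFactor)].';
Fmt = ['%s,%i,%f,', FctrName, '_ML\r\n'];

fname = [tempname '.txt'];
fid   = fopen(fname,'W');
fprintf(fid, Fmt, C{:});     % a single call; the format repeats per record
fclose(fid);

txt = fileread(fname);       % read back to inspect the result
```

For 1.7 million rows the cell array and the expanded argument list take a fair amount of memory, so writing in chunks of, say, 100,000 rows may be the more practical variant.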
  6 Comments
dpb on 5 Aug 2013
Also called "binary". It's unformatted I/O, which has two benefits for speed:
a) full precision for float values at the minimum number of bytes per entry
b) no format-conversion overhead on either input or output
doc fwrite % and friends
or, if you can stay in MATLAB, then
doc save % and load is only slightly higher-level
The possible disadvantage is, of course, that you can't just open the file and read it; but who's going to be looking at such large files manually, anyway?
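A minimal sketch of the binary route with fwrite and fread, limited to the numeric columns. The file layout (a uint32 row count followed by the two columns), the sample values, and the temporary file name are assumptions chosen for illustration; any reader has to agree on the same layout.

```matlab
% Hypothetical sample data (assumed values, for illustration only)
ExpDates = [20130801; 20130802; 20130805];
MyFactor = [0.5; 1.25; 2];

fname = [tempname '.bin'];

% Write: row count first, then each column as one contiguous block
fid = fopen(fname,'w');
fwrite(fid, numel(MyFactor), 'uint32');
fwrite(fid, ExpDates, 'uint32');   % dates fit in an unsigned 32-bit integer
fwrite(fid, MyFactor, 'double');   % full double precision, 8 bytes per value
fclose(fid);

% Read back using the same layout
fid      = fopen(fname,'r');
n        = fread(fid, 1, 'uint32');
datesIn  = fread(fid, n, 'uint32');   % returned as double by default
factorIn = fread(fid, n, 'double');
fclose(fid);
```

No text formatting or parsing happens in either direction, which is where the speed comes from.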

