Preallocating a structure with string / datetime fields slows code down considerably

Hi,
I am trying to read through and sort two large .txt files, the largest around 300 MB.
Originally, for each line I read, I would re-create the matrix like this:
strarray.full = [strarray.full ; new_info]
strarray.newdate = [strarray.newdate ; new_info ]
This slowed down considerably once the files reached around 20 MB. I've seen that preallocating matrices prevents MATLAB from having to re-create the growing matrix on each iteration. So now I have the following:
strarray.newdate = NaT(2000000,1);
strarray.full = strings(2000000,1);
where I have a counter variable 'j' that increments each time something should be added to the matrix:
strarray.full(j,1) = new_info;
strarray.newdate(j,1) = new_info;
When I did this, the code slowed down considerably, both starting off slower and slowing down faster as time progressed. The profiler shows that nearly all the time is spent putting the info into the preallocated matrix.
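For comparison, here is a minimal sketch of the two patterns described above, using plain variables rather than struct fields. The data (newInfo) and the size N are hypothetical stand-ins, not the real file contents, and the timings will vary by machine and release.

```matlab
% Hypothetical stand-in data: N short strings instead of lines from the real files.
N = 20000;
newInfo = "line " + (1:N)';      % N-by-1 string array

% Pattern 1: grow the array on every iteration (MATLAB reallocates as it grows).
tic
grown = strings(0,1);
for j = 1:N
    grown = [grown; newInfo(j)]; %#ok<AGROW>
end
tGrow = toc;

% Pattern 2: preallocate once, then assign by index.
tic
filled = strings(N,1);
for j = 1:N
    filled(j) = newInfo(j);
end
tFill = toc;

fprintf('grow: %.3f s, preallocated: %.3f s\n', tGrow, tFill);
```

Both loops build the same array, so any difference in the printed times comes from how the storage is managed, not from the data.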
I've got permission to share the file, but I can't attach the .txt files directly, so I have to strip the format down here.
.txt Format 1:
Datetime2 ~ *string* ~ *string* ~ *string*
*string*
Datetime2 ~ *string* ~ *string* ~ *string*
*string*
*string*
*string*
*string*
Datetime2 ~ *string* ~ *string* ~ *string*
*string*
*string*
.txt Format 2:
datetime1 ~ *string* ~~~ *string* ~~~ *string* ~*~
datetime1 ~ *string* ~~~ *string* ~~~ *string* ~*~
datetime1 ~ *string* ~~~ *string* ~~~ *string* ~*~
Thanks.

 Accepted Answer

You are not using a struct array. You are putting newdate (a datetime array) and full (a string array) into a single scalar struct, strarray (see the code below for the difference). I wonder whether using newdate=NaT(2e6,1) and full=strings(2e6,1) directly, as separate variables, would be faster. After all, combining these two big arrays into one struct doesn't help at all.
You can try a struct array, following the pattern below, to see if it helps. I doubt it.
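% scalar struct whose field holds a 20-by-1 datetime array
s1.newdate=NaT(20,1);
s1.newdate(1)
s1.newdate(20)
% 1-by-20 struct array; each element has its own newdate field
s2(20).newdate=NaT;
s2(1).newdate
s2(20).newdate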
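To make the difference between the two layouts concrete, here is a small sketch that inspects their sizes. It uses only the names from the code above plus a plain variable, newdate, for the third option; nothing here is measured, it only shows what each pattern creates.

```matlab
% Scalar struct: one struct, one field, and the field holds the whole array.
s1.newdate = NaT(20,1);

% Struct array: 20 structs. Only element 20 was assigned, so the
% newdate field of elements 1..19 is left as the empty matrix [].
s2(20).newdate = NaT;

% Plain variable: the same array as s1.newdate, with no struct around it.
newdate = NaT(20,1);

disp(size(s1))              % size of the scalar struct
disp(size(s1.newdate))      % size of the array stored in its field
disp(size(s2))              % size of the struct array
disp(isempty(s2(1).newdate))
```

Note that s2(1).newdate is empty rather than NaT, so a struct array would still need every element filled in before it behaves like a preallocated datetime vector.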

5 Comments

Hi,
Making them their own variables seemed to fix my original problem, so thank you!
But now it seems to have created a new one.
Running the profiler again showed that it now takes effectively no time to fill the "full" variable. But a separate section of the code is now taking even longer:
newdate(j,1) = datetime(datecurrent(1),'InputFormat',traceform, 'Format', newform);
For this script, I need to scan each line, rearrange some of the dates, and then put the result into my newdate variable.
Why would this now become an issue?
The profiler will always lead you to the next biggest time consumer. For this line of code, I don't know of any way to improve it if newdate is already preallocated.
Use newdate(j) rather than newdate(j,1), since it is a vector. Does this make a difference?
Also check how much time is spent in the built-in function datetime().
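One way to separate the two costs is to time the datetime() call and the indexed assignment independently with tic/toc. This is only a sketch: the sample text and both format strings are hypothetical, since the real traceform and newform values are not shown in the thread.

```matlab
% Hypothetical sample text and formats (the real ones are not shown here).
sample = "2019-11-05 13:45:10";
inFmt  = 'yyyy-MM-dd HH:mm:ss';
outFmt = 'dd-MMM-yyyy HH:mm:ss';

N = 2000;
newdate = NaT(N,1);
newdate.Format = outFmt;     % match the display format of the parsed values

tParse = 0;                  % total time spent inside datetime()
tStore = 0;                  % total time spent on newdate(j) = ...
for j = 1:N
    t0 = tic;
    d = datetime(sample, 'InputFormat', inFmt, 'Format', outFmt);
    tParse = tParse + toc(t0);

    t0 = tic;
    newdate(j) = d;
    tStore = tStore + toc(t0);
end

fprintf('datetime(): %.3f s, assignment: %.3f s\n', tParse, tStore);
```

If tParse dominates, the preallocation is not the bottleneck and the effort is better spent on reducing the number of datetime() calls.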
OK, here is what I'm using now.
N = 2010000;
full = strings(N,1);
newdate = NaT(N,1);
And
datefill = datetime(datecurrent(1),'InputFormat',Logform, 'Format', newform);
newdate(j) = datefill;
The profiler shows that newdate(j) = datefill was by far the biggest problem.
Why did this only fix the "full" variable and not the datetime variable? Should I avoid storing anything as datetime?
Not sure if it has anything to do with NaT(). Could you try preallocating it in either of these two ways?
newdate=zeros(N,1);
newdate=repmat(datetime,N,1);
Yes, I've tried it with zeros, but then I get an error converting from datetime to double. Converting the datetime value to a string and then putting it into an empty string vector fixed the issue.
I ran a test for ~2000 s, and around 1800 s came from the datetime() function itself, not from putting the result into the array.
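Since the cost is in calling datetime() once per line, one possible way forward (not something tried in this thread) is to store only the date text inside the loop and convert the whole string array with a single datetime() call afterwards. A minimal sketch, assuming hypothetical date text and format strings:

```matlab
% Hypothetical formats and date text; the real values are not shown here.
inFmt  = 'yyyy-MM-dd HH:mm:ss';
outFmt = 'dd-MMM-yyyy HH:mm:ss';

N = 5000;
dateText = strings(N,1);     % preallocated string vector for the raw date text
count = 0;                   % number of entries actually filled

for j = 1:N
    % Stand-in for the date text extracted from each line (datecurrent(1)).
    count = count + 1;
    dateText(count) = "2019-11-05 13:45:10";
end

% One vectorized conversion instead of N separate datetime() calls.
newdate = datetime(dateText(1:count), 'InputFormat', inFmt, 'Format', outFmt);
```

The loop then only does cheap string assignments, and the parsing overhead is paid once for the whole array instead of once per line.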
I think I've got some pathways forward now.
Thank you so much!


Version

R2019b
