Vectorize a loop to save time

Question

Filip on 3 Feb 2019

https://de.mathworks.com/matlabcentral/answers/442981-vectorize-a-loop-to-save-time

Commented: Walter Roberson on 4 Feb 2019

I have a big data set and my current code takes 2 hours. I am hoping to save time by vectorization if that is possible in my case.

I have a table Table with variables ID, t1, tend, p. My code is something like:

x=zeros(size(Table.ID,1));
for i=1:size(Table.ID,1)
x(i)=sum(Table.t1<Table.t1(i) & Table.tend>Table.tend(i) & abs(Table.p-Table.p(i))>1);
end

So for each observation, I want to find the number of observations that start before it, end after it, and have a p value in the neighborhood of 1. It takes 2 hours to run this loop. Any suggestions?

Thanks in advance!

2 Comments

Walter Roberson on 4 Feb 2019

How are the t1 and tend values arranged? Are tend(i+1) = t1(i) such that together they partition into consecutive ranges that are completely filled between the first and last? Do they act to partition into non-overlapping ranges but with gaps? Are there overlapping regions? Are the boundaries already sorted?

Filip on 4 Feb 2019

There is no arrangement between t1 and tend values across observations. They might overlap for some observations, there might be gaps in time too.

All I know is that t1<tend for an observation.

The table is sorted with respect to ID.

Answer 1

Jan on 4 Feb 2019

https://de.mathworks.com/matlabcentral/answers/442981-vectorize-a-loop-to-save-time#answer_359428

Edited: Jan on 4 Feb 2019

2 hours sounds long. Is the memory exhausted, so that virtual memory slows down the execution? How large is the input?

Is this a typo:

x = zeros(size(Table.ID,1))

It creates a square matrix, but you access it as a vector only.
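The difference is easy to see on a small case. This is a minimal sketch; the value of n is an arbitrary choice for illustration and is not taken from the question.

```matlab
n = 4;                 % hypothetical number of rows, for illustration only
a = zeros(n);          % one argument: n-by-n matrix, n^2 elements
b = zeros(n, 1);       % two arguments: n-by-1 column vector, n elements

disp(size(a))          % size of the square matrix
disp(size(b))          % size of the column vector
disp(numel(a) / numel(b))   % the matrix holds n times as many elements
```

With a large table the square matrix alone can exhaust memory: for a hypothetical n = 100000, zeros(n) asks for 1e10 doubles, which is about 80 GB.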

Does the table access take a significant amount of time? Copying the columns into plain vectors before the loop avoids it:

n    = size(Table.ID,1);
t1   = Table.t1;
tend = Table.tend;
p    = Table.p;
x    = zeros(n, 1);
for i = 1:n
  x(i) = sum(t1 < t1(i) & tend > tend(i) & abs(p - p(i)) > 1);
end

If you sort one of the vectors, you could save some time:

[t1s, index] = sort(t1);
tends        = tend(index);
ps           = p(index);
for i = 2:n
  m    = t1s < t1s(i);
  x(i) = sum(tends(m) > tends(i) & ...
             abs(ps(m) - ps(i)) > 1);
end

Afterwards x has to be restored to the original order by inverting the sort. If you provide some inputs, I could check the code before posting. I'm tired, perhaps I've overlooked an obvious indexing error.
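Since no inputs were provided, here is a minimal sketch that checks the sorted variant against the plain loop on synthetic data and maps the result back to the original order. The data sizes and ranges, and the names xRef and xs, are assumptions made for this illustration.

```matlab
% Synthetic data, purely for illustration (sizes and ranges are assumptions)
n    = 200;
t1   = rand(n, 1);
tend = t1 + rand(n, 1);          % guarantees t1 < tend for every observation
p    = 5 * rand(n, 1);

% Reference: the straightforward loop on the unsorted vectors
xRef = zeros(n, 1);
for i = 1:n
  xRef(i) = sum(t1 < t1(i) & tend > tend(i) & abs(p - p(i)) > 1);
end

% Sorted variant: only observations with a smaller t1 are compared further
[t1s, index] = sort(t1);
tends = tend(index);
ps    = p(index);
xs    = zeros(n, 1);             % xs(1) stays 0: nothing starts before the first
for i = 2:n
  m     = t1s < t1s(i);
  xs(i) = sum(tends(m) > tends(i) & abs(ps(m) - ps(i)) > 1);
end

% Undo the sort: xs(k) belongs to original observation index(k)
x = zeros(n, 1);
x(index) = xs;

assert(isequal(x, xRef));        % both approaches give the same counts
```

If t1 is known to contain no ties, the mask m is simply the prefix 1:i-1 and the comparison could be replaced by indexing. That is an assumption about the data, so the sketch keeps the comparison.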

Is the shown code really the bottleneck of the original code?

1 Comment

Filip on 4 Feb 2019

I have more variables in the table and do more comparisons, but they are all similar. So, I wrote a sample here to give the idea.

x = zeros(size(Table.ID,1)) is obviously a typo.

I guess sorting t1 will work, and accessing the table might also be time-consuming. I will update when I apply the changes, but this seems promising. Thanks!

Answer 2

Walter Roberson on 4 Feb 2019

https://de.mathworks.com/matlabcentral/answers/442981-vectorize-a-loop-to-save-time#answer_359344

My mind is headed towards creating a pairwise mask matrix,

M = squareform(pdist(Table.p) > 1);    %important that Table.p is a column vector

That would be comparatively fast. If the table is very big then it could fill up memory, though.

abs() is not needed for this; pdist will already have calculated distance as a non-negative number.

Now

Mi = M(i,:);
x(i)=sum(Table.t1(Mi)<Table.t1(i) & Table.tend(Mi)>Table.tend(i));

However you should do timing tests against

Mi = M(:,i);    % M is symmetric; taking a column keeps Mi aligned with the column vectors
x(i)=sum(Mi & Table.t1<Table.t1(i) & Table.tend>Table.tend(i));

and

Mi = M(i,:);
Tt = Table(Mi,:);    % a table needs both a row and a column subscript
x(i)=sum(Tt.t1<Table.t1(i) & Tt.tend>Table.tend(i));
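The timing comparison suggested above could be sketched as follows for the first two variants. This is a sketch under assumptions: the data is synthetic, plain vectors stand in for the table columns, and the mask is built with implicit expansion (R2016b or later) instead of squareform(pdist(...)) so that no toolbox is needed. The mask is read as a column, M(:,i), so that it lines up with the column vectors.

```matlab
% Synthetic data, purely for illustration (sizes and ranges are assumptions)
n    = 500;
t1   = rand(n, 1);
tend = t1 + rand(n, 1);
p    = 5 * rand(n, 1);

% Pairwise mask: M(j,i) is true when p(j) and p(i) differ by more than 1
M = abs(p - p.') > 1;

% Variant A: subset the vectors with the mask, then compare
xA = zeros(n, 1);
tic
for i = 1:n
  Mi    = M(:, i);
  xA(i) = sum(t1(Mi) < t1(i) & tend(Mi) > tend(i));
end
tA = toc;

% Variant B: compare the full vectors and combine with the mask
xB = zeros(n, 1);
tic
for i = 1:n
  Mi    = M(:, i);
  xB(i) = sum(Mi & t1 < t1(i) & tend > tend(i));
end
tB = toc;

assert(isequal(xA, xB));         % the variants must agree before timing matters
fprintf('subset first: %.4f s, mask inside sum: %.4f s\n', tA, tB);
```

tic/toc is used here for simplicity. For short runs, MATLAB's timeit is usually the more reliable tool, and the ranking of the variants may well change with n and with the fraction of true entries in M.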

2 Comments

Filip on 4 Feb 2019

Unfortunately, this answer does not quite work. But inspired by your answer, I believe that creating a pairwise difference matrix with "bsxfun(@minus, T.t1, T.t1')" might work. I am not sure how much faster it is going to be or whether I will have memory issues. I will try and update afterwards.

Walter Roberson on 4 Feb 2019

abs(T.t1 - T.t1.')

would work as a distance function for you in R2016b and later.
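The same implicit expansion can be applied to all three conditions at once. The sketch below is not code from the thread: it processes the observations in blocks so that only a blk-by-n slice of the pairwise comparison is in memory at a time. The data is synthetic, and the block size blk and the names r and hit are hypothetical choices for this illustration.

```matlab
% Synthetic data, purely for illustration (sizes and ranges are assumptions)
n    = 300;
t1   = rand(n, 1);
tend = t1 + rand(n, 1);
p    = 5 * rand(n, 1);

blk = 64;                        % hypothetical block size; tune to available memory
x   = zeros(n, 1);
for s = 1:blk:n
  r = s:min(s + blk - 1, n);     % observations handled in this block
  % hit(k,j) is true when observation j starts before r(k), ends after r(k),
  % and differs from it in p by more than 1 (implicit expansion, R2016b+)
  hit  = (t1.' < t1(r)) & (tend.' > tend(r)) & (abs(p.' - p(r)) > 1);
  x(r) = sum(hit, 2);            % count the matches for each observation in the block
end

% Check against the straightforward loop
xRef = zeros(n, 1);
for i = 1:n
  xRef(i) = sum(t1 < t1(i) & tend > tend(i) & abs(p - p(i)) > 1);
end
assert(isequal(x, xRef));
```

Setting blk = n gives the fully vectorized form with n-by-n temporaries, and blk = 1 falls back to the original loop. Values in between trade memory for fewer loop iterations.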
