how to optimize this nested for loop

Question

Fariba Jangjoo am 14 Dez. 2017

0
Verknüpfen

Direkter Link zu dieser Frage

https://de.mathworks.com/matlabcentral/answers/372965-how-to-optimize-this-nested-for-loop

Kommentiert: Jan am 23 Feb. 2018

Akzeptierte Antwort: Jan

In MATLAB Online öffnen

I have 2 nested loops which do the following:

Get two rows of a matrix
Check if indices meet a condition or not
If they do: calculate xcorr between the two rows and put it into new vector
Find the index of the maximum value of sub vector and replace element of LAG matrix with this value

I dont know how I can speed this code up by vectorizing or otherwise.

b=size(data,1);
F=size(data,2);
LAG= zeros(b,b);  
for i=1:b
    for j=1:b
        if j>i
            x=data(i,:);
            y=data(j,:);
            d=xcorr(x,y);
            d=d(:,F:(2*F)-1); 
            [M,I] = max(d);
            LAG(i,j)=I-1;
            d=xcorr(y,x);
            d=d(:,F:(2*F)-1);
            [M,I] = max(d);
            LAG(j,i)=I-1;
        end
    end
end

2 Kommentare
Keine anzeigenKeine ausblenden

Jan am 14 Dez. 2017

What does "0 down vote favorite" mean?

Walter Roberson am 17 Dez. 2017

It means the question was posted on StackOverflow and copied to here.

Melden Sie sich an, um zu kommentieren.

Melden Sie sich an, um diese Frage zu beantworten.

Answer 1

Jan am 14 Dez. 2017

3
Verknüpfen

Direkter Link zu dieser Antwort

https://de.mathworks.com/matlabcentral/answers/372965-how-to-optimize-this-nested-for-loop#answer_296274

Bearbeitet: Jan am 17 Dez. 2017

In MATLAB Online öffnen

Just some tiny modifications:

b   = size(data,1);
F   = size(data,2);
LAG = zeros(b, b); 
for i = 1:b
   x = data(i,:);
   for j = i+1:b
      y        = data(j,:);
      d        = xcorr(x,y);
      [~,I]    = max(d(F:(2*F)-1));
      LAG(i,j) = I-1;
        d        = xcorr(y,x);
        [~,I]    = max(d(F:(2*F)-1));
        LAG(j,i) = I-1;
    end
  end

It is a waste of time to calculate the complete xcorr, of you use (F:end) of the output only. Maybe you can create your own xcorr, which calculates the wanted part only. As far as I can see, the maxlag value will not help directly.

[EDITED] Modify the xcorr() function to be leaner:

function ck = leanXCorr(x, y)
len = length(x);
c = [zeros(1,len-1), x];
d = [y, zeros(1,len-1)];
% Compute FFTs
X1 = fft(c);
X2 = fft(d);
% Compute cross correlation
X  = X1 .* conj(X2);
ck = ifft(X);
end

[EDITED 2] This was a bad idea. It works faster for input vectors of size 1x100, but the padding with zeros is inefficient for the FFT. Better use a window length based on NEXTPOW2:

function ck = leanXCorr(x, y)
m   = length(x);
mxl = m - 1;
m2  = 2 ^ nextpow2(2*m - 1);
% Compute FFTs
X1 = fft(x, m2, 2);
X2 = fft(y, m2, 2);
% Compute cross correlation
c  = ifft(X1 .* conj(X2), [], 2);
ck = [c((m2 - mxl + 1):m2), c(1:mxl + 1)];
end

Now I use input: data = rand(15, 41200) and get 3.66 sec instead of 3.82 with the built-in xcorr. Not impressing. But the next step is the actual improvement: If you inline the code into the loop, you can omit the repeated FFT of x and y:

b   = size(data, 1);
F   = size(data, 2);
LAG = zeros(b, b);
m   = F;
mxl = m - 1;
m2  = 2 ^ nextpow2(2*m - 1);
for i = 1:b
   X1 = fft(data(i, :), m2, 2);
   for j = i+1:b
      X2  = fft(data(j, :), m2, 2);
      X12 = X1 .* conj(X2);
        c   = ifft(X12, [], 2);
        d   = [c((m2 - mxl + 1 + F):m2), c(1:mxl+1)];      
        [~, I]   = max(d);
        LAG(i,j) = I - 1;
        c  = ifft(conj(X12), [], 2);
        d  = [c((m2 - mxl + 1 + F):m2), c(1:mxl+1)];
        [~, I]   = max(d);
        LAG(j,i) = I - 1;
     end
  end

This inlined version needs 2.14 sec instead of 3.81 of the original.

Unfortunately this still calculates fft(data(j, :), m2, 2) for the same values. If you could store fft(data, m2, 2), the code would run much faster. But your original data need more than 8GB of RAM. But on a 64GB machine, this would be faster:

FM = fft(data.', m2, 1);
for i = 1:b
   X1 = FM(:,i); % fft(data(i, :), m2, 2);
   for j = i+1:b
      % X2  = fft(data(j, :), m2, 2);
      X2 = FM(:,j);
      X12 = X1 .* conj(X2);
        c   = ifft(X12, [], 1);
        d   = [c((m2 - mxl + 1 + F):m2); c(1:mxl+1)];      
        [~, I]   = max(d);
        LAG(i,j) = I - 1;
        c  = ifft(conj(X12), [], 1);
        d  = [c((m2 - mxl + 1 + F):m2); c(1:mxl+1)];
        [~, I]   = max(d);
        LAG(j,i) = I - 1;
     end
  end

I have transposed the input, because it is faster to access the data in columnwise order.

7 Kommentare
5 ältere Kommentare anzeigen5 ältere Kommentare ausblenden

Fariba Jangjoo am 23 Feb. 2018

In MATLAB Online öffnen

thanks a lot,i got it..i can not understand the function of

m2 = 2 ^ nextpow2(2*m - 1);

and also

d = [c((m2 - mxl + 1 + F):m2), c(1:mxl+1)];

i will be so thankful if you'd mind explaining these two lines for me(really sorry:( )

Jan am 23 Feb. 2018

In MATLAB Online öffnen

The FFT works more efficient, if the n-point DFT uses a power of 2. Therefore:

m2 = 2 ^ nextpow2(2*m - 1);

gets such a value. Instead of padding the data with zeros, a larger window length is used. The result is the same (compare [EDITED] with [EDITED2]), but the speed is higher. But now the output after IFFT is sorted differently, such that

d = [c((m2 - mxl + 1 + F):m2), c(1:mxl+1)];

picks out the elements exactly as if they have been created using the padding.

Simply try the two versions of leanXCorr() posted above:

function ck = leanXCorr1(x, y)
len = length(x);
c = [zeros(1,len-1), x];
d = [y, zeros(1,len-1)];
% Compute FFTs
X1 = fft(c);
X2 = fft(d);
% Compute cross correlation
X  = X1 .* conj(X2);
ck = ifft(X);
end

and

function ck = leanXCorr2(x, y)
m   = length(x);
mxl = m - 1;
m2  = 2 ^ nextpow2(2*m - 1);
% Compute FFTs
X1 = fft(x, m2, 2);
X2 = fft(y, m2, 2);
% Compute cross correlation
c  = ifft(X1 .* conj(X2), [], 2);
ck = [c((m2 - mxl + 1):m2), c(1:mxl + 1)];
end

with some small input. Step through the code line by line using the debugger and compare the contents of the variables.

Melden Sie sich an, um zu kommentieren.

how to optimize this nested for loop

2 Kommentare
Keine anzeigenKeine ausblenden

Akzeptierte Antwort

7 Kommentare
5 ältere Kommentare anzeigen5 ältere Kommentare ausblenden

Weitere Antworten (0)

Siehe auch

Kategorien

Tags

Community Treasure Hunt

how to optimize this nested for loop

2 Kommentare Keine anzeigenKeine ausblenden

Akzeptierte Antwort

7 Kommentare 5 ältere Kommentare anzeigen5 ältere Kommentare ausblenden

Weitere Antworten (0)

Siehe auch

Kategorien

Tags

Community Treasure Hunt

2 Kommentare
Keine anzeigenKeine ausblenden

7 Kommentare
5 ältere Kommentare anzeigen5 ältere Kommentare ausblenden