Perform specific operations on arrays without "for" loop

Luca Romagnuolo on 16 May 2022
Commented: Luca Romagnuolo on 22 May 2022
Hi! I'm trying to perform some particular operations between arrays more efficiently, ideally without using a "for" loop. Here is the function I would like to optimize:
function [chi,zeta,M,O,Delta,rho] = calc_fun(N,n,x,x_avg,omega,xi)
    chi = cumsum(x.*cos((1:N)'*omega));
    zeta = cumsum(x.*sin((1:N)'*omega));
    M = zeros(floor(N/n),1);
    for m = 1:floor(N/n)
        M_term = (chi(m+1:N)-chi(1:N-m)).^2 + (zeta(m+1:N)-zeta(1:N-m)).^2;
        M(m) = sum(M_term)/(N-m);
    end
    O = x_avg^2 * (1-cos(xi.*omega)/(1-cos(omega)));
    Delta = M - O;
    covariance = cov(xi,Delta,1);
    rho = covariance(1,2) / sqrt(var(xi,1)*var(Delta,1));
end
Usually N is more than 200,000 and n = 200; x is the input time series, an array of N values, and omega is one element of an array of 512 values.
This function has to be executed for each omega value and for each time series (at least 15), so the for loop in the function runs more than 7.5 million times (on my PC, the whole job takes around 5 h).
It seems to me that most of the time is spent indexing different portions of the vectors chi and zeta for the different m values, but I don't know how to access them without a for loop. Is there a better/faster way to execute this calculation? Any suggestions would be helpful!
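One direction that can avoid the loop entirely: expanding the squares turns the sum into cumulative-sum terms plus an autocorrelation, which the FFT delivers for all lags at once. Below is a sketch of this idea; the function name fast_M and the FFT-based autocorrelation are assumptions for illustration, not code from this thread, and floating-point round-off will differ slightly from the direct loop.

```matlab
function M = fast_M(chi, zeta, n)
% Vectorized form of the M loop. Expanding
%   (chi(k)-chi(k-m))^2 + (zeta(k)-zeta(k-m))^2
% gives square terms (from a cumulative sum) and a cross term,
% which is an autocorrelation computed for all lags via the FFT.
N = numel(chi);
q = floor(N / n);
S = cumsum(chi.^2 + zeta.^2);   % running sum of squared magnitudes
L = 2^nextpow2(2*N);            % zero-pad so circular corr. == linear corr.
c = real(ifft(abs(fft(chi, L)).^2 + abs(fft(zeta, L)).^2));
m = (1:q)';
% c(m+1) = sum over k of chi(k)*chi(k-m) + zeta(k)*zeta(k-m)
M = (S(N) - S(m) + S(N-m) - 2*c(m+1)) ./ (N - m);
end
```

This replaces the O(N*q) inner loop with an O(N log N) transform, at the cost of slightly different round-off behavior.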
Here is my full code (function at the bottom):
close all; clear; clc
%% INPUT
% input parameters
n_series=15; % number of time series to be analyzed
x_0=0.5; % initial value of the sequence, x_0 in (0,1)
f_s=51200; % frequency of acquisition
f_delta=50; % delta frequency of analysis
n=200; % reduction for M
seconds=4;
% parameters definition
time=(0:1/f_s:seconds)';
N=size(time,1);
f=(f_delta:f_delta:f_s/2)'; % frequencies to be analyzed
omega=2*pi*f/f_s;
omegaSize=size(omega,1);
xi=(1:floor(N/n))';
z(1:n_series)=struct('time',time,'x',[],'x_avg',[],'chi',[],'chi_avg',[],...
    'zeta',[],'zeta_avg',[],'radius',[],'M',[],'O',[],'Delta',[],'rho',[]);
% test sequence definition (dummy sequence, usually replaced by real time series)
mu=linspace(1.1,4,n_series); % parameter of the sequence, mu in (1,4]
data=zeros(N,n_series);
for i=1:n_series
    data(1,i)=x_0;
    for j=2:N
        data(j,i)=mu(i)*data(j-1,i)*(1-data(j-1,i));
    end
end
%% MAIN RUN
tic
for i=1:n_series
    i_time=tic;
    % initialization
    x=data(:,i);
    x_avg=mean(x);
    chi=zeros(N,omegaSize); zeta=zeros(N,omegaSize);
    M=zeros(floor(N/n),omegaSize); O=zeros(floor(N/n),omegaSize);
    Delta=zeros(floor(N/n),omegaSize); rho=zeros(1,omegaSize);
    % calculation
    for w=1:omegaSize
        w_time=tic;
        [chi(:,w),zeta(:,w),M(:,w),O(:,w),Delta(:,w),rho(1,w)]=...
            calc_fun(N,n,x,x_avg,omega(w),xi);
        w_time_end=toc(w_time);
        fprintf('SERIES %u of %u - END of frequency %u Hz (%u of %u) - time: %.3f\n',...
            i,n_series,f(w),w,omegaSize,w_time_end);
    end
    chi_avg=mean(chi);
    zeta_avg=mean(zeta);
    radius_term=(chi-chi_avg).^2 + (zeta-zeta_avg).^2;
    radius=sqrt(1/(N-1) * sum(radius_term));
    % data saving
    z(i).x=x; z(i).x_avg=x_avg;
    z(i).chi=chi; z(i).chi_avg=chi_avg;
    z(i).zeta=zeta; z(i).zeta_avg=zeta_avg; z(i).radius=radius;
    z(i).M=M; z(i).O=O; z(i).Delta=Delta; z(i).rho=rho;
    i_time_end=toc(i_time);
    fprintf('END of SERIES %u of %u - time: %.3f\n',i,n_series,i_time_end);
end
clear x x_avg chi chi_avg zeta zeta_avg radius_term radius M_term M O Delta rho
toc
%% FUNCTION
function [chi,zeta,M,O,Delta,rho] = calc_fun(N,n,x,x_avg,omega,xi)
    chi = cumsum(x.*cos((1:N)'*omega));
    zeta = cumsum(x.*sin((1:N)'*omega));
    M = zeros(floor(N/n),1);
    for m = 1:floor(N/n)
        M_term = (chi(m+1:N)-chi(1:N-m)).^2 + (zeta(m+1:N)-zeta(1:N-m)).^2;
        M(m) = sum(M_term)/(N-m);
    end
    O = x_avg^2 * (1-cos(xi.*omega)/(1-cos(omega)));
    Delta = M - O;
    covariance = cov(xi,Delta,1);
    rho = covariance(1,2) / sqrt(var(xi,1)*var(Delta,1));
end
Thanks in advance for any suggestions!

Accepted Answer

Jan
Jan on 16 May 2022
Edited: Jan on 16 May 2022
You do not want fewer for loops, but more: accessing RAM is slower than using the CPU cache:
function [chi,zeta,M,O,Delta,rho] = calc_fun2(N,n,x,x_avg,omega,xi)
    chi = cumsum(x.*cos((1:N)'*omega));
    zeta = cumsum(x.*sin((1:N)'*omega));
    q = floor(N / n);
    M = zeros(q, 1);
    for m = 1:q
        M2 = 0;
        for k = m+1:N
            M2 = M2 + (chi(k) - chi(k - m)) .^ 2 + ...
                 (zeta(k) - zeta(k - m)) .^ 2;
        end
        M(m) = M2 / (N-m);
    end
    O = x_avg^2 * (1 - cos(xi.*omega) / (1-cos(omega)));
    Delta = M - O;
    C = cov(xi, Delta, 1);
    rho = C(1,2) / sqrt(var(xi,1) * var(Delta,1));
end
When I run with n_series=1 and for w=1:5 (instead of 1:omegaSize), the original code needs 16.7 s and the loop version only 6.0 s in MATLAB R2018b on my mobile i5.
Using the Parallel Computing Toolbox on the 2 cores, the time is reduced to 4.5 s:
parfor w=1:5 % instead of 1:omegaSize
That is 27% of the original processing time. With more cores the advantage should be larger.
Alternatively run the parallel loop inside the function:
parfor m = 1:q
This has the same speed.
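Spelled out as a complete function, the in-function parallel variant could look like the sketch below. This assumes the Parallel Computing Toolbox is available; the function name calc_M_par is an illustration, not code from the thread.

```matlab
function M = calc_M_par(chi, zeta, N, q)
% Parallel version of the distance-accumulation loop from calc_fun2.
% The iterations over m are independent, so M is a sliced output
% variable and parfor can distribute the lags across workers.
M = zeros(q, 1);
parfor m = 1:q
    M2 = 0;
    for k = m+1:N
        M2 = M2 + (chi(k) - chi(k-m))^2 + (zeta(k) - zeta(k-m))^2;
    end
    M(m) = M2 / (N - m);
end
end
```

Since each worker only reads chi and zeta and writes one element of M, there is no communication inside the loop body, which is what makes the parfor slicing efficient.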
  4 Comments
Luca Romagnuolo on 22 May 2022
Yes, I did so in my previous run, as you suggested. I didn't know about these 5 different ways of saving and accessing variables, really interesting! Thanks for all your support!


More Answers (0)

Release: R2022a
