Speed up code by eliminating loop

Question

0 Stimmen

Dear all,

I have to do a maximization of a 3-D array V(e,b,d) over the first-dimension, obtaining something that I call e_opt which will be a b*d matrix, i.e.

e_opt(b,d) = argmax_e {V(e,b,d)}

The non-standard feature is that the grid for e depends on d. So I am not able to eliminate the loop over d and I need to speed up the code. I attach a MWE so that things become clear. On my PC it runs in about 1 secs but this is too slow since I have to do this many times.

Any help is greatly appreciated!

clear
close all
clc
rng("default")
% Size of grids
n_b = 1000;
n_d = 1000;
n_e = 5000;
% Parameters
delta   = 0.2696037674949296; 
%omega_0 = 0.0015534835229589;
%omega_1 = 0.0013023827114628;
% Generate fake data
cost_mat = rand(n_e,n_d);
prob_mat = rand(n_e,n_d);
option_nob = rand(n_b,1);
d_grid = linspace(0,2.3,n_d)';
%% Create a grid for "e". The non-standard feature is that the upper bound
% of the e_grid depends on "d"
e_grid_mat = zeros(n_e,n_d);
e_min = 0;
space = 1.5;
for d_c=1:n_d
    d_val = d_grid(d_c);
    e_max = 0.1+0.2*d_val; %silly example, don't take this literally
    % e_grid is NOT equally spaced
    e_grid_mat(:,d_c) = e_min+(e_max-e_min)*(linspace(0,1,n_e).^space)';
end
%% The code below is the part that I'd like to speed up
tic
e_opt = zeros(n_b,n_d);
for d_c = 1:n_d
    % Effort grid depends on d (upper bound changes, but same no. of elements)
    e_grid = e_grid_mat(:,d_c); %each column of e_grid_mat is different!
    % V(e,b) has dim: (n_e,n_b) and I maximize with respect to the first dimension, e
    V = -cost_mat(:,d_c)-delta*prob_mat(:,d_c).*option_nob';
    [~,max_ind] = max(V,[],1); %maxind is (1,n_b) vector
    e_opt(:,d_c)  = e_grid(max_ind);
end %end d
toc
Elapsed time is 4.942008 seconds.
%To check results
disp(mean(mean(e_opt)))
    0.1285

0 Kommentare
-2 ältere Kommentare anzeigen -2 ältere Kommentare ausblenden

Melden Sie sich an, um zu kommentieren.

Melden Sie sich an, um diese Frage zu beantworten.

Anmelden, um Aktivität zu verfolgen

Answer 1

Alessandro Maria Marco am 3 Jan. 2024

Bearbeitet: Alessandro Maria Marco am 3 Jan. 2024

In MATLAB Online öffnen

0 Stimmen

The vectorized code after eliminating the loop over d is the following. Choose either 'loops' or 'vec' at the beginning of the script to compare the running time of the two methods: loops vs vectorized. It turns out that loops are faster! I found this relatively surprising since loops are generally slow in Matlab.

clear
close all
clc
rng("default")
method = 'loops'; % 'loops','vec'
% Size of grids
n_b = 1000;
n_d = 40;
n_e = 5000;
% Parameters
delta   = 0.2696037674949296; 
%omega_0 = 0.0015534835229589;
%omega_1 = 0.0013023827114628;
% Generate fake data
cost_mat = rand(n_e,n_d);
prob_mat = rand(n_e,n_d);
option_nob = rand(n_b,1);
d_grid = linspace(0,2.3,n_d)';
%% Create a grid for "e". The non-standard feature is that the upper bound
% of the e_grid depends on "d"
e_grid_mat = zeros(n_e,n_d);
e_min = 0;
space = 1.5;
for d_c=1:n_d
    d_val = d_grid(d_c);
    e_max = 0.1+0.2*d_val; %silly example, don't take this literally
    % e_grid is NOT equally spaced
    e_grid_mat(:,d_c) = e_min+(e_max-e_min)*(linspace(0,1,n_e).^space)';
end
%% The code below is the part that I'd like to speed up
tic
switch method
    case 'loops'
        disp('loops')
        e_opt = zeros(n_b,n_d);
        for d_c = 1:n_d
            % Effort grid depends on d (upper bound changes, but same no. of elements)
            e_grid = e_grid_mat(:,d_c); %each column of e_grid_mat is different!
            % V(e,b) has dim: (n_e,n_b) and I maximize with respect to the first dimension, e
            V = -cost_mat(:,d_c)-delta*prob_mat(:,d_c).*option_nob';
            [~,max_ind] = max(V,[],1); %maxind is (1,n_b) vector
            e_opt(:,d_c)  = e_grid(max_ind);
        end %end d
    case 'vec'
        disp('vec')
        % compute the full V, with dim (n_e,n_d,n_b)
        V = -cost_mat-delta*prob_mat.*reshape(option_nob, 1, 1, n_b);
        %V = -cost_mat-delta*prob_mat.*shiftdim(option_nob,-2);
        [~, max_ind] = max(V, [], 1); % max_ind is (1,n_d,n_b)
        max_ind = permute(max_ind, [2 3 1]); % % max_ind is now (n_d,n_b)
        %max_ind = max_ind + (0:n_d-1).'*n_e; % apply linear indexing
        % e_grid_mat is [n_e,n_d], but e_opt gives only "e" indices
        max_ind = sub2ind([n_e,n_d],max_ind,repmat((1:n_d)',1,n_b));
        e_opt = e_grid_mat(max_ind).'; % retrieve maximizing values
end
loops
toc
Elapsed time is 0.263274 seconds.
%To check results 
disp(mean(mean(e_opt)))
    0.1364

0 Kommentare
-2 ältere Kommentare anzeigen -2 ältere Kommentare ausblenden

Melden Sie sich an, um zu kommentieren.

Answer 2

Dinesh am 3 Jan. 2024

In MATLAB Online öffnen

1 Stimme

Hi user,

Considering the dependency of "e_grid" on "d", full vectorization of your code may not be feasible. To optimize the code, you can explore MATLAB's parallel processing capabilities, particularly the "parfor" loop, which can execute iterations concurrently on multi-core processors. Ensure you have the Parallel Computing Toolbox installed and initiate a parallel pool before the loop to take advantage of this feature. This approach can significantly reduce the runtime if your system has multiple cores.

Here's a brief example using "parfor":

parpool;
tic
e_opt = zeros(n_b,n_d);
parfor d_c = 1:n_d
    e_grid = e_grid_mat(:,d_c);
    V = -cost_mat(:,d_c) - delta*prob_mat(:,d_c).*option_nob';
    [~, max_ind] = max(V, [], 1);
    e_opt(:, d_c) = e_grid(max_ind);
end
toc

In addition to parallelization, profile your code to identify other potential inefficiencies. Utilize MATLAB's "profile on" command to get a detailed report on the runtime of different parts of your script.

https://www.mathworks.com/help/parallel-computing/getting-started-with-parallel-computing-toolbox.html

https://www.mathworks.com/help/matlab/matlab_prog/profiling-for-improving-performance.html

1 Kommentar
-1 ältere Kommentare anzeigen -1 ältere Kommentare ausblenden

Alessandro Maria Marco am 3 Jan. 2024

@Dinesh many thanks for your answer!

(1) I tried the parfor and it works (i.e. it gives the same results as the serial code) but it is 50% slower, perhaps due to overhead cost

(2) I had already profiled the code using 'profile on' and this is the bottleneck in my project

I managed to do the vectorization (with the help of another user on another forum) but again it does not beat the serial code with loops. I will post it anyways as an answer since it may be useful to others

Melden Sie sich an, um zu kommentieren.

Speed up code by eliminating loop

0 Kommentare
-2 ältere Kommentare anzeigen -2 ältere Kommentare ausblenden

Akzeptierte Antwort

0 Kommentare
-2 ältere Kommentare anzeigen -2 ältere Kommentare ausblenden

Weitere Antworten (1)

1 Kommentar
-1 ältere Kommentare anzeigen -1 ältere Kommentare ausblenden

Kategorien

Produkte

Version

Tags

Community Treasure Hunt

Speed up code by eliminating loop

0 Kommentare -2 ältere Kommentare anzeigen -2 ältere Kommentare ausblenden

Akzeptierte Antwort

0 Kommentare -2 ältere Kommentare anzeigen -2 ältere Kommentare ausblenden

Weitere Antworten (1)

1 Kommentar -1 ältere Kommentare anzeigen -1 ältere Kommentare ausblenden

Kategorien

Produkte

Version

Tags

Siehe auch

Community Treasure Hunt

0 Kommentare
-2 ältere Kommentare anzeigen -2 ältere Kommentare ausblenden

0 Kommentare
-2 ältere Kommentare anzeigen -2 ältere Kommentare ausblenden

1 Kommentar
-1 ältere Kommentare anzeigen -1 ältere Kommentare ausblenden