
%I'm trying to work with 4 dimensional cells containing matrices that need multiplying together.

%I have three matrices, A_1, A_2 & B, which are multiplied together to get Y. Each matrix - A_1, A_2 & B - has different values depending on frequency, f.

%I've done this the long way first - with three separate frequency vectors: -

%% 1ST METHOD

f1 = 20

f2 = 500

f3 = 10000

%Then building the matrices, A_1, A_2 & B: -

A_1_f1 = [(f1 + 0) (f1 + 1);(f1 + 2) (f1 + 3)]

A_1_f2 = [(f2 + 0) (f2 + 1);(f2 + 2) (f2 + 3)]

A_1_f3 = [(f3 + 0) (f3 + 1);(f3 + 2) (f3 + 3)]

A_2_f1 = [(f1 + 1) (f1 + 2);(f1 + 3) (f1 + 4)]

A_2_f2 = [(f2 + 1) (f2 + 2);(f2 + 3) (f2 + 4)]

A_2_f3 = [(f3 + 1) (f3 + 2);(f3 + 3) (f3 + 4)]

B1 = [(f1 + 6);(f1 + 7)]

B2 = [(f2 + 6);(f2 + 7)]

B3 = [(f3 + 6);(f3 + 7)]

%And simple multiplication: -

Y1 = A_1_f1*A_2_f1*B1

Y2 = A_1_f2*A_2_f2*B2

Y3 = A_1_f3*A_2_f3*B3

%RESULTS: -

%Y1 = [48966;53738],

%Y2 = [509543046;511579178],

%Y3 = [4.0038e+12;4.0046e+12]

%THAT'S THE 1ST LONG WAY!

%% 2ND METHOD

%Then I've successfully reduced this so that f1, f2 & f3 become the vector, f: -

f = [20 500 10000]

A_1(1,1,:) = f + 0

A_1(1,2,:) = f + 1

A_1(2,1,:) = f + 2

A_1(2,2,:) = f + 3

A_2(1,1,:) = f + 1

A_2(1,2,:) = f + 2

A_2(2,1,:) = f + 3

A_2(2,2,:) = f + 4

B(1,1,:) = f + 6

B(2,1,:) = f + 7

ref(1,1,:) = f

ref(2,1,:) = f %(this is just a reference matrix for the next part)

Y = arrayfun(@(ind) A_1(:,:,ind)*A_2(:,:,ind)*B(:,:,ind), ...
    1:size(ref,3), 'UniformOutput', false);

Y = cat(3, Y{:});

%RESULTS: -

%Y =

% val(:,:,1) =

% 48966

% 53738

% val(:,:,2) =

% 509543046

% 511579178

% val(:,:,3) =

% 4.0038e+12

% 4.0046e+12

%THAT'S THE MEDIUM - STILL POTENTIALLY LONG WAY - DEPENDING ON HOW MANY 'A' MATRICES THERE ARE.

%% 3RD METHOD

%This is good, but now I need to introduce more 'A' matrices, like A_3, A_4, A_5 etc.

%I've tried using a 4-D array, so that A_1, A_2, A_3, A_4... become just A, like this: -

A(1,1,1,:) = f + 0

A(1,2,1,:) = f + 1

A(2,1,1,:) = f + 2

A(2,2,1,:) = f + 3

A(1,1,2,:) = f + 1

A(1,2,2,:) = f + 2

A(2,1,2,:) = f + 3

A(2,2,2,:) = f + 4

%but that's where I get stuck. I thought a for loop would work, but I couldn't get it to work. I basically don't know how to multiply these out so I get the same results as above.
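For reference, one direct way to multiply these out is a pair of nested loops over the pages of A: the 3rd dimension indexes the successive 'A' matrices and the 4th dimension indexes frequency. This is only a sketch; it assumes the 4-D array A built just above and the 3-D arrays B and f from the 2nd method.

```matlab
% Sketch: accumulate the product of the 'A' slices (dim 3), then
% multiply by B, for each frequency (dim 4).
nA = size(A,3);                 % number of 'A' matrices (here 2)
nf = numel(f);                  % number of frequencies
Y = zeros(2,1,nf);
for k = 1:nf
    P = eye(2);                 % running product of the A's
    for m = 1:nA
        P = P * A(:,:,m,k);
    end
    Y(:,:,k) = P * B(:,:,k);    % reproduces the results of the 2nd method
end
```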

%If anyone knows how to do this that would be great - many thanks in advance.

Cedric Wannaz
on 5 Apr 2013

Edited: Cedric Wannaz
on 5 Apr 2013

As you have only a limited number of values for f, you should avoid complicated 3D or 4D arrays (indexing them is not always faster than looping over simpler 2D arrays) or solutions based on ARRAYFUN, and go for a much simpler solution based on a cell array for storing the Y's.

Also, your matrices are quite simple, so you could define them in a simpler way than you are doing above, i.e. you seem to have (for any given value of f):

A1 = f + [0 1; 2 3]

A2 = A1 + 1

B = f + [6; 7]

Using a cell array for storing results Y, you could implement the following:

f = [20 500 10000] ;

n = numel(f) ;

Y = cell(n,1) ;

for k = 1 : n

A1 = f(k) + [0 1; 2 3] ;

Y{k} = A1 * (A1 + 1) * (f(k) + [6; 7]) ;

end

The solutions can then be accessed as follows:

Y{1}

ans =

48966

53738

Y{2}

...

etc

Note that this solution is flexible, as the size of the cell array and the upper bound of the loop adapt to the number of elements in the f vector. Also, I assumed that you just need the Y's in the end and not the successive values of A1, A2, and B. You could use a few extra cell arrays if you needed to store them.
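If more 'A' matrices follow the same pattern as in the example (each one being the previous plus 1), the loop generalizes naturally; a hedged sketch, where p (the number of 'A' matrices) is a parameter you would set:

```matlab
f = [20 500 10000] ;
p = 2 ;                         % number of 'A' matrices (p = 2 reproduces the results above)
n = numel(f) ;
Y = cell(n,1) ;
for k = 1 : n
    P = eye(2) ;
    for m = 1 : p
        P = P * (f(k) + (m-1) + [0 1; 2 3]) ;   % A_m = A_1 + (m-1)
    end
    Y{k} = P * (f(k) + [6; 7]) ;
end
```

Only p needs to change to add more matrices; the cell array of results keeps the same shape.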

Cedric Wannaz
on 7 Apr 2013

No problem; thank you for the full code, it is easier to understand now. Do you know how to use the profiler in MATLAB? If not, type

profile viewer

in the command window; it will open a profiling tool. Assuming that the current path is the folder that contains your script/M-file, type the name of this script in the field labeled "Run this code" at the top of the profiling window (or the function call if it is a function), and click on [Start Profiling]. You will obtain a report after the run and you can click on your script name in the report table to get the detail.
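The profiler can also be driven from the command line, which is convenient when iterating on a script; a minimal sketch (myScript is a hypothetical name, replace it with your actual script):

```matlab
profile on          % start collecting statistics
myScript            % run the code to be profiled
profile off         % stop collecting
profile viewer      % open the report (or: p = profile('info') for programmatic access)
```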

You will see that more than 60% of the time is spent on the line

Z_DuMX(:,:,1,l) = Z_DuMX(:,:,1,l)*Z_DuMXi(:,:,m + 1,l);

and that most of the rest of the time is spent in the loop that defines Z_DuMXi. These are therefore what needs to be optimized in priority.

About the line defining Z_DuMX: at each iteration you are computing a matrix product of two 2x2 matrices; is that correct? This would be difficult to optimize, and might require that you work with arrays of complex numbers instead of arrays of real and imaginary parts.
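If the products really are all 2x2, one option worth testing is to write the four entries of the product explicitly with elementwise operations, which computes the product for every page at once. This is only a sketch with hypothetical array names X, Y, Z of matching 2-by-2-by-...-by-... size; note it removes the loop over pages, but not a cumulative product over successive matrices, which stays sequential:

```matlab
% Z(:,:,i,j) = X(:,:,i,j) * Y(:,:,i,j) for every (i,j), with no loop:
Z(1,1,:,:) = X(1,1,:,:).*Y(1,1,:,:) + X(1,2,:,:).*Y(2,1,:,:);
Z(1,2,:,:) = X(1,1,:,:).*Y(1,2,:,:) + X(1,2,:,:).*Y(2,2,:,:);
Z(2,1,:,:) = X(2,1,:,:).*Y(1,1,:,:) + X(2,2,:,:).*Y(2,1,:,:);
Z(2,2,:,:) = X(2,1,:,:).*Y(1,2,:,:) + X(2,2,:,:).*Y(2,2,:,:);
```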

To optimize the block defining Z_DuMXi, you can vectorize calls to SIN/COS as follows (you'll have to test that it really works the way you want though, and fine tune if necessary): after the block

for i = 1:n_f

for j = 1:n_Du

Z_DuMXi(1,1,j,i) = cos(2*pi*f(i)*x_Du/343);

Z_DuMXi(1,2,j,i) = sin(2*pi*f(i)*x_Du/343)*1i*415/S_Du(j);

Z_DuMXi(2,1,j,i) = sin(2*pi*f(i)*x_Du/343)*1i*S_Du(j)/415;

Z_DuMXi(2,2,j,i) = cos(2*pi*f(i)*x_Du/343);

end

end

insert

[f_mesh, S_Du_mesh] = meshgrid(f, S_Du) ;

k_mesh = 2*pi*f_mesh*x_Du/343 ;

Z_DuMXi2(1,1,:,:) = cos(k_mesh);

Z_DuMXi2(2,2,:,:) = Z_DuMXi2(1,1,:,:) ;

Z_DuMXi2(1,2,:,:) = sin(k_mesh)*1i*415./S_Du_mesh;

Z_DuMXi2(2,1,:,:) = sin(k_mesh)*1i.*S_Du_mesh/415;

all(Z_DuMXi(:) == Z_DuMXi2(:)) % Test: should display 1 (true).

The call to ALL is testing whether Z_DuMXi2's elements are equal to Z_DuMXi's. It outputs 1 (true), which seems to indicate that the vectorized version works. Once you are sure that it works, you can replace the first version with the second, and rename Z_DuMXi2 into Z_DuMXi.

If you time the two blocks using TIC/TOC, you will see that the computation time goes from about 21s for executing the first nested loop to less than 1s for executing the vectorized version.
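The TIC/TOC pattern itself looks like this, illustrated here on an unrelated toy computation; wrap each of the two blocks above the same way to compare them:

```matlab
tic ;
s = 0 ;
for k = 1:1e6
    s = s + sin(k) ;          % loop version
end
t_loop = toc ;

tic ;
s2 = sum(sin(1:1e6)) ;        % vectorized version
t_vec = toc ;

fprintf('loop: %.3fs, vectorized: %.3fs\n', t_loop, t_vec) ;
```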

Now, about the general approach: 4D arrays are not that easy to work with, and they are much slower computing-wise than 2D arrays. The reason is that they are cut into frames that are not stored at contiguous locations in memory. Sometimes, just permuting dimensions so you work on the first two dimensions instead of e.g. 1 and 3 will reduce the computation time by a factor of 1000. This is to say that once you have vectorized everything (if possible), you will hit a wall defined by indexing operations in arrays that are larger than 2D.
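A sketch of the PERMUTE idea, with hypothetical sizes (e.g. 2x2 matrices over 50 ducts and 1000 frequencies): move the dimensions indexed in the hot loops to the front, work there, and permute back.

```matlab
A4 = rand(2, 2, 50, 1000) ;      % original layout: 2x2 pages over ducts x frequencies
B4 = permute(A4, [3 4 1 2]) ;    % duct/frequency now the first (contiguous) dims
% ... operate on B4(i,j,:,:) here, where indexing is cheapest ...
A4 = permute(B4, [3 4 1 2]) ;    % this permutation is its own inverse here
```

Whether this pays off depends on which dimensions the inner loops index, so it is worth timing both layouts with TIC/TOC before committing.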
