# How to speed up Matrix Vector Multiplication


I found that for a large matrix-matrix multiplication, MATLAB automatically runs the computation in parallel (CPU usage 100%). However, for a matrix-vector multiplication, it seems that only one core is in use.

Can this operation be done in parallel? For example, I think we could simply split the large matrix into blocks of rows. I tried using parfor from the Parallel Computing Toolbox, which resulted in very poor performance.

I also tried writing a C function using CBLAS; however, as described in https://www.mathworks.com/matlabcentral/answers/390546-wrong-result-when-calling-cblas-dgemv-function-in-a-mex-file, there are some problems with that implementation.

I wonder if there are other ways of accelerating this? Thanks!

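I tried the following parfor code:

    % N = number of rows, M = number of columns
    A = rand(N,M);
    b = rand(M,1);
    p = 6;                    % number of row blocks
    batch = floor(N / p);     % rows per block
    parts = cell(p,1);        % parfor can slice parts{i}, but not c(istart:iend)
    parfor i = 1:p
        istart = (i-1)*batch + 1;
        iend = i*batch;
        if (i == p)
            iend = N;         % last block takes the leftover rows
        end
        parts{i} = A(istart:iend,:) * b;
    end
    c = vertcat(parts{:});    % assemble the N-by-1 result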

##### 3 Comments

John D'Errico
on 25 Mar 2018

Edited: John D'Errico
on 25 Mar 2018

First, don't bother with tic and toc. They are poor ways to time anything.

Next, you need to put it in a loop. If the multiply goes so fast that you cannot see the CPUs all coming alive, then you will never know it used them!

I could not even see the CPUs coming active until I wrapped a loop around it.

And as I said in my answer, hyperthreading does not count. You gain nothing of significance from splitting one CPU into two. If a CPU is fully active, splitting it so that you have two CPUs that are both fully active, but only half as capable, is a waste of time. So you have 6 cores.

Hyper-threading is great for some applications. But not here.

Put it in a loop. If one multiply took .1 seconds, then do 100 multiplies. Then carefully watch a CPU monitor, as did I.
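The advice above can be sketched roughly as follows. The size `n = 5000` and the repeat count `nrep` are arbitrary assumptions for illustration, and `timeit` is shown only as one possible alternative to tic and toc; the comment itself does not name a replacement.

```matlab
% Assumed size for illustration only; substitute your own N and M.
n = 5000;
A = rand(n);
b = rand(n, 1);

% Repeat the multiply so it runs long enough to show up in a CPU monitor.
nrep = 100;
for k = 1:nrep
    c = A * b;
end

% timeit calls the function handle several times and returns the
% median run time in seconds.
t = timeit(@() A * b);
fprintf('A*b with n = %d takes about %.3g s per call\n', n, t);
```

Watch the CPU monitor while the loop runs. If a single multiply is much shorter than the monitor's refresh interval, increase `nrep`.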

### Accepted Answer

John D'Errico
on 25 Mar 2018

Edited: John D'Errico
on 25 Mar 2018

MATLAB automatically multi-threads computations where there will be a clear gain. For a matrix-vector multiply, it apparently does not see a gain unless the problem is large enough. Remember that parallel computations are not always a speedup, because there is extra overhead. If you can farm it out to the GPU directly, you might get a gain though. Since I lack that toolbox, I cannot help you there.
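For readers who do have the Parallel Computing Toolbox, a minimal sketch of the GPU route might look like the following. It was not part of the original answer and is untested on the asker's machine; the size `n` is an arbitrary assumption, and whether the GPU wins at all depends on the hardware.

```matlab
% Assumes Parallel Computing Toolbox and a supported GPU are available.
n = 5000;                 % arbitrary size; A must fit in GPU memory
A = rand(n);
b = rand(n, 1);

if gpuDeviceCount > 0
    Ag = gpuArray(A);     % copy to the GPU once, outside the timing
    bg = gpuArray(b);

    t_gpu = gputimeit(@() Ag * bg);   % GPU-aware timing
    t_cpu = timeit(@() A * b);

    c = gather(Ag * bg);  % copy the result back to host memory
    fprintf('CPU: %.3g s, GPU: %.3g s, max abs diff: %.3g\n', ...
            t_cpu, t_gpu, max(abs(c - A * b)));
else
    disp('No supported GPU found; skipping the GPU comparison.');
end
```

The matrix is transferred once and reused here. If `A` has to be copied to the GPU for every multiply, the transfer cost can easily outweigh any gain.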

As a test though, I checked to see if MATLAB will multi-thread a matrix*vector operation.

    A = rand(15000); B = rand(15000,1);

I stopped here and waited a few seconds until MATLAB went quiescent, to ensure that I was not seeing multithreading from the call to rand.

    for i = 1:100
        C = A*B;
    end

The multiply was so fast that I had to add a loop to see my CPU usage go up to the full 400% possible. So indeed MATLAB does multi-thread a matrix*vector multiply, if it sees a gain.

Make sure you have maxNumCompThreads set to the correct value for your CPU. For me:

    maxNumCompThreads
    ans =
         4

Hyper-threading does not count however.
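One way to check whether the thread count matters on a given machine is to time the same multiply with one thread and with the current setting. This is only a sketch under assumptions: the size `n` is arbitrary, and the speedup (if any) will depend on the CPU and on memory bandwidth.

```matlab
n = 5000;                         % arbitrary size for illustration
A = rand(n);
b = rand(n, 1);

nOrig = maxNumCompThreads;        % current thread limit

maxNumCompThreads(1);             % restrict MATLAB to a single thread
t1 = timeit(@() A * b);

maxNumCompThreads(nOrig);         % restore the original limit
tN = timeit(@() A * b);

fprintf('1 thread: %.3g s, %d threads: %.3g s, ratio: %.2f\n', ...
        t1, nOrig, tN, t1 / tN);
```

A ratio close to 1 suggests that, at this size, MATLAB gains little from extra threads for the matrix-vector product.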

##### 1 Comment

Jan
on 26 Mar 2018
