How to speed up Matrix Vector Multipliacation

26 views (last 30 days)
I found that for large matrix with matrix multiplication, Matlab will automatically parallel calculate it. ( CPU usage 100%) However, when doing Matrix multiply vector, it seems only one kernel is in use.
Can this operation be done parallelly? For example, I think we could simply divide the large matrix into many pieces according to row. I tried using parfor in Parallel Toolbox, which result in very low performance.
I also tried writing C function using CBLAS, however, as stated in, https://www.mathworks.com/matlabcentral/answers/390546-wrong-result-when-calling-cblas-dgemv-function-in-a-mex-file There are some problems in this implementation.
I wonder if there are other ways of accelerating this? Thanks!
I tried the following parfor code:
N = % num of rows
M = % num of cols
A = rand(N,M);
b = rand(M,1);
c = zeros(N,1);
p = 6;
batch = floor(N / 6);
parfor i = 1:p
istart = (i-1)*batch+1;
iend = i*batch;
if (i==p)
iend=N;
end
A_sliced = A(istart:iend,:);
c(istart:iend) = A_sliced * b;
end
  3 Comments
John D'Errico
John D'Errico on 25 Mar 2018
Edited: John D'Errico on 25 Mar 2018
First, don't bother with tic and toc. They are poor ways to time anything.
Next, you need to put it in a loop. If the multiply goes so fast that you cannot see the CPUS all coming alive, then you will never know it used them!
I could not even see the CPUs coming active until I wrapped a loop around it.
And as I said in my answer, hyperthreading does not count. You gain nothing of significance from splitting one CPU into two. If a CPU is fully active, splitting it so that you have two CPUS that are both fully active, but only half as capable is a waste of time. So you have 6 cores.
Hyper-threading is great for some applications. But not here.
Put it in a loop. If one multiply took .1 seconds, then do 100 multiplies. Then carefully watch a CPU monitor, as did I.

Sign in to comment.

Accepted Answer

John D'Errico
John D'Errico on 25 Mar 2018
Edited: John D'Errico on 25 Mar 2018
MATLAB automatically multi-threads computations where there will be a clear gain. For a matrix-vector multiply, it apparently does not see a gain, unless the problem is large enough. Remember that parallel computations are not always a speedup, because there is extra overhead. If you can farm it to the GPU directly, you might get a gain though. Since I lack that TB, I cannot help you there.
As a test though, I checked to see if MATLAB will multi-thread a matrix*vector operation.
A = rand(15000);B = rand(15000,1);
I stopped here, waiting for a few seconds until MATLAB went quiescent. To ensure that I was not seeing multithreading happen on the call to rand.
for i = 1:100
C = A*B;
end
The multiply was so fast to do, that I had to add a loop to allow me to see my CPU usage go up to the full 400% possible. So indeed MATLAB does multi-thread a matrix*vector multiply, if it sees a gain.
Make sure you have maxNumCompThreads set to the correct value for your CPU. For me:
maxNumCompThreads
ans =
4
Hyper-threading does not count however.
  1 Comment
Jan
Jan on 26 Mar 2018
+1. Exactly. A*b is multi-threaded already and you can see it, if you run it for a while by a loop. The task manager is not fast enough to see this for a single call only.

Sign in to comment.

More Answers (0)

Categories

Find more on Get Started with Parallel Computing Toolbox in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by