How to speed up Matrix Vector Multiplication

Wei HU on 25 Mar 2018
Commented: Jan on 26 Mar 2018
I found that for large matrix-matrix multiplications, MATLAB automatically runs the computation in parallel (CPU usage reaches 100%). However, for a matrix-vector multiplication, it seems that only one core is in use.
Can this operation be done in parallel? For example, I think we could simply split the large matrix into blocks of rows. I tried using parfor from the Parallel Computing Toolbox, which resulted in very poor performance.
I also tried writing a C function using CBLAS; however, as described in https://www.mathworks.com/matlabcentral/answers/390546-wrong-result-when-calling-cblas-dgemv-function-in-a-mex-file there are some problems with that implementation.
I wonder whether there are other ways to accelerate this. Thanks!
I tried the following parfor code:
N = % num of rows
M = % num of cols
A = rand(N,M);
b = rand(M,1);
c = zeros(N,1);
p = 6;                  % number of row blocks
batch = floor(N / p);   % rows per block; the last block takes the remainder
parfor i = 1:p
    istart = (i-1)*batch + 1;
    iend = i*batch;
    if (i == p)
        iend = N;
    end
    A_sliced = A(istart:iend,:);
    c(istart:iend) = A_sliced * b;
end
3 Comments
Wei HU on 25 Mar 2018
Really? I tried the following:
N = 20000;
M = 20000;
A = rand(N, M);
b = rand(N,1);
tic
A*b;
toc
It runs very fast:
Elapsed time is 0.109193 seconds.
But MATLAB still uses only about 10% of the CPU. I ran this on an Intel Core i7-6800K, which has 6 cores and 12 threads in total, so 10% is about one core.
John D'Errico on 25 Mar 2018
Edited: John D'Errico on 25 Mar 2018
First, don't bother with tic and toc. They are poor ways to time anything.
Next, you need to put it in a loop. If the multiply goes so fast that you cannot see the CPUs all coming alive, then you will never know it used them!
I could not even see the CPUs coming active until I wrapped a loop around it.
And as I said in my answer, hyper-threading does not count. You gain nothing of significance from splitting one CPU into two. If a CPU is fully active, splitting it so that you have two CPUs that are both fully active, but only half as capable, is a waste of time. So you have 6 cores.
Hyper-threading is great for some applications. But not here.
Put it in a loop. If one multiply took 0.1 seconds, then do 100 multiplies. Then carefully watch a CPU monitor, as I did.
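The advice above could be sketched roughly as follows. This is only an illustration, not the code that was actually run in this thread: the matrix size and the repeat count nRep are assumed values, and timeit is used here as an alternative to a single tic/toc pair.

```matlab
% Sketch only: sizes and repeat count are assumptions, not values from the thread.
N = 5000;                % smaller than 20000 to keep memory use modest
A = rand(N);
b = rand(N, 1);

% timeit calls the function handle several times and returns a
% typical time for one call, in seconds.
tOne = timeit(@() A*b);
fprintf('One A*b takes about %.4g seconds\n', tOne);

% Repeat the multiply so that it runs long enough for a CPU monitor
% to show how many cores are busy.
nRep = 100;
for k = 1:nRep
    c = A*b;
end
```

Increase nRep if the loop finishes before the CPU monitor refreshes.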


Accepted Answer

John D'Errico on 25 Mar 2018
Edited: John D'Errico on 25 Mar 2018
MATLAB automatically multi-threads computations where there will be a clear gain. For a matrix-vector multiply, it apparently does not see a gain unless the problem is large enough. Remember that parallel computation is not always faster, because it adds overhead. If you can offload the multiply to the GPU directly, you might see a gain, though. Since I lack that toolbox, I cannot help you there.
As a test though, I checked to see if MATLAB will multi-thread a matrix*vector operation.
A = rand(15000);B = rand(15000,1);
I stopped here and waited a few seconds until MATLAB went quiescent, to ensure that I was not seeing multithreading from the call to rand.
for i = 1:100
    C = A*B;
end
The multiply was so fast that I had to add a loop to see my CPU usage go up to the full 400% possible. So MATLAB does indeed multi-thread a matrix*vector multiply, if it sees a gain.
Make sure you have maxNumCompThreads set to the correct value for your CPU. For me:
maxNumCompThreads
ans =
4
Hyper-threading does not count, however.
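One way to check the multi-threading claim without watching a CPU monitor is to time the multiply with the default thread count and again with a single computational thread. The following is a minimal sketch under assumed sizes; it is not a measurement from this thread. Since a matrix-vector multiply is largely limited by memory bandwidth, the speedup may well be smaller than the number of cores.

```matlab
% Sketch only: the matrix size is an assumption, not a value from the thread.
A = rand(5000);
b = rand(5000, 1);

nDefault = maxNumCompThreads;      % current number of computational threads
tMulti = timeit(@() A*b);          % time with the default thread count

maxNumCompThreads(1);              % restrict MATLAB to one computational thread
tSingle = timeit(@() A*b);

maxNumCompThreads(nDefault);       % restore the original setting

fprintf('%d threads: %.4g s, 1 thread: %.4g s\n', nDefault, tMulti, tSingle);
```

If the two times are nearly equal, the multiply is either too small to benefit from more threads or already limited by memory bandwidth.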
1 Comment
Jan on 26 Mar 2018
+1. Exactly. A*b is multi-threaded already, and you can see it if you run it in a loop for a while. The task manager is not fast enough to show this for a single call.
