My mex file is slower than my original matlab equivalent

Hello friends,
I need to calculate some quantities of a linear-algebra type, so they are merely matrix and vector products. The following is an example:
EZ=[(1.0./Ds0.^2.*(Ds0.*(Dm0.*4.0+Dm0.*Ds1.^2.*2.0-Ds0.*(Ds1.*2.0+Dm0.*Ds2.*2.0+Dm1.*Ds1.*2.0-Dm2.*Ds0)+Dm0.*Dm1.*2.0)-Dm0.^2.*Ds1.*2.0))./4.0;
(1.0./Ds0.^2.*(Ds0.*(Ds0.*(Dm1.*4.0+Ds1.^2-Ds0.*Ds2.*2.0+4.0)-Dm0.*Ds1.*8.0)+Dm0.^2.*4.0))./4.0;
(Dm0.*6.0-Ds0.*Ds1.*3.0)./(Ds0.*2.0)];
where Ds0, Ds1, Ds2, Dm0, Dm1, Dm2 are 1-by-n vectors. When I do the calculations using matlabFunction (attached), it is fast. However, I am not satisfied, since I really need to do such calculations thousands (if not millions) of times. To overcome this issue I decided to give MEX a try. Unfortunately, the equivalent MEX file (which I generated with MATLAB Coder) is 2-3 times slower (unfortunately, I could not upload it here).
Is there any hope to create a mex file out of this function which is much faster? I hope so!
Thanks for your help in advance,
Babak
  7 Comments
Mohammad Shojaei Arani on 18 Jul 2022
Bruno,
Of course, simplification matters a lot. My actual expressions are far longer than this. MATLAB is not able to simplify them efficiently (in many cases it simplifies only a little). I have spent a lot of time on simplifying my expressions. Unfortunately, I have no hope of simplifying them further with MATLAB (yes, you can perhaps simplify this expression more because it is not extremely long, but can you do it for an expression that is 1 km long?). My expressions are in rational form, so I typically perform two operations to simplify them: 1) first I apply [n,d] = numden(EZ), and then 2) EZ = horner(n,Ds0)./horner(d,Ds0). Unfortunately, MATLAB does not support a multivariate Horner scheme, so I could only benefit from the univariate Horner scheme here (I apply it with respect to Ds0 because it is the most frequently repeated variable; typically, you should apply the Horner scheme with respect to such variables). So at this point I have convinced myself that I cannot hope to simplify my expressions further using MATLAB, and I should find strategies to have C or C++ perform the calculations.
So, my question is not about how to come up with a better simplification (as that does not work with the current capabilities of MATLAB). My question is: how can I use C/C++, or perhaps resort to something like gpuArray, to reduce the computational burden?
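For readers unfamiliar with the univariate Horner scheme mentioned in this comment, here is a minimal C sketch of the idea. The function name horner_eval and the coefficient ordering (highest degree first) are assumptions made for illustration; this is not the code that horner or MATLAB Coder generates. In the multivariate case described above, each coefficient would itself be an expression in the other variables.

```c
#include <stddef.h>

/* Hypothetical helper: evaluate a polynomial in x with Horner's rule.
 * coeffs[0] is the highest-degree coefficient, coeffs[count - 1] the constant.
 * Each step costs one multiplication and one addition. */
double horner_eval(const double *coeffs, size_t count, double x)
{
    double acc = 0.0;
    for (size_t i = 0; i < count; ++i) {
        acc = acc * x + coeffs[i];
    }
    return acc;
}
```

A polynomial of degree m then needs m multiplications instead of the repeated powers that the expanded form would use.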

Accepted Answer

Jan on 18 Jul 2022
Just some experiments. You can gain some clarity, but hardly improve the speed, with these simplifications. I've also tried a loop version.
n = 1e4;
Ds0 = rand(1, n);
Ds1 = rand(1, n);
Ds2 = rand(1, n);
Dm0 = rand(1, n);
Dm1 = rand(1, n);
Dm2 = rand(1, n);
tic;
for rep = 1:1e4
EZ = [(1.0./Ds0.^2.*(Ds0.*(Dm0.*4.0+Dm0.*Ds1.^2.*2.0-Ds0.*(Ds1.*2.0+Dm0.*Ds2.*2.0+Dm1.*Ds1.*2.0-Dm2.*Ds0)+Dm0.*Dm1.*2.0)-Dm0.^2.*Ds1.*2.0))./4.0;
(1.0./Ds0.^2.*(Ds0.*(Ds0.*(Dm1.*4.0+Ds1.^2-Ds0.*Ds2.*2.0+4.0)-Dm0.*Ds1.*8.0)+Dm0.^2.*4.0))./4.0;
(Dm0.*6.0-Ds0.*Ds1.*3.0)./(Ds0.*2.0)];
end
toc
Elapsed time is 0.794406 seconds.
tic;
for rep = 1:1e4
Ds0_2 = Ds0 .* Ds0;
Dm0_2 = Dm0 .* Dm0;
EZ2 = [(1 ./ Ds0_2 .* (Ds0 .* (Dm0 * 2 + Dm0 .* Ds1 .^ 2 - ...
Ds0 .* (Ds1 + Dm0 .* Ds2 + Dm1 .* Ds1 - Dm2 .* Ds0 ./ 2) + ...
Dm0 .* Dm1) - Dm0_2 .* Ds1)) / 2; ...
1 ./ Ds0_2 .* (Ds0 .* (Ds0 .* (Dm1 + Ds1 .^ 2 / 4 - Ds0 .* Ds2 / 2 + 1) - ...
Dm0 .* Ds1 * 2) + Dm0_2);
(Dm0 * 3 - Ds0 .* Ds1 * 1.5) ./ Ds0];
end
toc
Elapsed time is 0.775081 seconds.
tic;
for rep = 1:1e4
  EZ3 = zeros(3, n);  % preallocate the 3-by-n result
  for k = 1:n
    % Copy the k-th elements into scalars to avoid repeated indexing
    a = Ds0(k);
    b = Dm0(k);
    c = Ds1(k);
    d = Dm1(k);
    e = Ds2(k);
    EZ3(1, k) = (1 / a^2 * (a * (b * 2 + b * c ^ 2 - ...
      a * (c + b * e + d * c - Dm2(k) * a / 2) + b * d) - b^2 * c)) / 2;
    EZ3(2, k) = (a * (a * (d + c ^ 2 / 4 - a * e / 2 + 1) - b * c * 2) + b^2) / a^2;
    EZ3(3, k) = b * 3 / a - c * 1.5;
  end
end
toc
Elapsed time is 1.140882 seconds.
max(abs(EZ(:) - EZ2(:)))
ans = 0
max(abs(EZ(:) - EZ3(:)))
ans = 2.3283e-10
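For the C/C++ route raised in the question, here is a minimal sketch of the same per-element loop as a plain C kernel. The function name ez_kernel and its argument order are assumptions of mine, not something produced by MATLAB Coder or taken from the original code. A MEX gateway would still be needed to pass the arrays in and out, and it is not shown here. The output is written column-major as a 3-by-n matrix, which matches how MATLAB stores EZ.

```c
#include <stddef.h>

/* Hypothetical kernel: ez must have room for 3 * n doubles.
 * Element (row r, column k) is stored at ez[3 * k + r] (column-major). */
void ez_kernel(size_t n,
               const double *Ds0, const double *Ds1, const double *Ds2,
               const double *Dm0, const double *Dm1, const double *Dm2,
               double *ez)
{
    for (size_t k = 0; k < n; ++k) {
        /* Same scalar names as in the MATLAB loop version above */
        const double a = Ds0[k], b = Dm0[k], c = Ds1[k];
        const double d = Dm1[k], e = Ds2[k], f = Dm2[k];
        const double a2 = a * a;   /* Ds0^2, computed once per element */
        const double b2 = b * b;   /* Dm0^2 */
        const double c2 = c * c;   /* Ds1^2 */

        ez[3 * k + 0] = (a * (b * 2.0 + b * c2
                              - a * (c + b * e + d * c - f * a / 2.0)
                              + b * d) - b2 * c) / a2 / 2.0;
        ez[3 * k + 1] = (a * (a * (d + c2 / 4.0 - a * e / 2.0 + 1.0)
                              - b * c * 2.0) + b2) / a2;
        ez[3 * k + 2] = b * 3.0 / a - c * 1.5;
    }
}
```

Whether this beats the vectorized MATLAB version is not guaranteed. The arithmetic per element is cheap, so the run time may well be dominated by memory traffic, which would be consistent with the timings above, where simplifying the expressions barely changed the result.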