How to speed up MEX function?
The following MEX code runs too slowly, and I don't know why or how to make it faster. Any help is greatly appreciated!
calculate_my_way.cpp
#include "mex.hpp"
#include "mexAdapter.hpp"
#include <cmath>
class MexFunction : public matlab::mex::Function {
public:
    void operator()(matlab::mex::ArgumentList outputs, matlab::mex::ArgumentList inputs) {
        matlab::data::TypedArray<double> var0 = inputs[0];
        matlab::data::TypedArray<double> var1 = inputs[1];
        matlab::data::TypedArray<double> var2 = inputs[2];
        matlab::data::TypedArray<double> var3 = inputs[3];
        auto var0Iter = var0.begin();
        auto var1Iter = var1.begin();
        auto var2Iter = var2.begin();
        auto var3Iter = var3.begin();
        const int numOfElements = var0.getNumberOfElements();
        double buffer = 0;
        for (int x = 0; x<numOfElements; x++)
        {
            buffer = std::sin(*var0Iter) + std::sin(*var1Iter) + std::sin(*var2Iter) + std::cos(*var3Iter);
            *var0Iter = buffer;
            buffer = std::sin(*var1Iter + *var2Iter) + std::cos(*var3Iter);
            *var1Iter = buffer;
            var0Iter++;
            var1Iter++;
            var2Iter++;
            var3Iter++;
        }
        outputs[0] = std::move(var0);
        outputs[1] = std::move(var1);
    }
};
It's just a simple calculation, but this code runs even slower than the native distance function, which performs a much more complicated calculation than a few sin and cos calls.
I'm using the compiler that came with Visual Studio 2017. Below is how I run mex, along with the compiler setup info.
mex -v calculate_my_way.cpp
...
Compiler location: C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\
...
OPTIMFLAGS : /O2 /Oy- /DNDEBUG
And this is how I observe the performance issue.
clear
size_test = 1e7;
var1 = zeros(size_test, 1);
var2 = zeros(size_test, 1);
var3 = zeros(size_test, 1);
var4 = zeros(size_test, 1);
cant_beat_me = @() distance(var1,var2,var3,var4);
elapsed_time = timeit(cant_beat_me);
mex_slow = @() calculate_my_way(var1,var2,var3,var4);
elapsed_time = timeit(mex_slow);
15 Comments
Rik
on 2 Nov 2022
Apart from the segfault if var1 is longer than the others, did you try with a random test set as well? The distance function may have some calls optimized away.
I might be able to try this code on my desktop later today.
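The out-of-bounds risk comes from the loop running over the element count of the first input while advancing all four iterators. A minimal sketch of a guard follows, written against plain std::vector so that it compiles without the MATLAB headers. The helper name requireSameLength is hypothetical, and inside a real MEX function the failure would presumably be reported through MATLAB's error mechanism rather than a C++ exception.

```cpp
#include <cstddef>
#include <initializer_list>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns the common length of all inputs,
// or throws if any input differs in length from the first one.
inline std::size_t requireSameLength(
    std::initializer_list<const std::vector<double>*> inputs) {
    if (inputs.size() == 0) {
        throw std::invalid_argument("no inputs given");
    }
    const std::size_t n = (*inputs.begin())->size();
    for (const std::vector<double>* v : inputs) {
        if (v->size() != n) {
            throw std::invalid_argument(
                "all inputs must have the same number of elements");
        }
    }
    return n;
}
```

Calling such a check once before the loop costs next to nothing compared with 1e7 trigonometric evaluations.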
Walter Roberson
on 2 Nov 2022
buffer = std::sin(*var0Iter) + std::sin(*var1Iter) + std::sin(*var2Iter) + std::cos(*var3Iter);
*var0Iter = buffer;
buffer = std::sin(*var1Iter + *var2Iter) + std::cos(*var3Iter);
You calculate std::cos(*var3Iter) twice.
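A minimal sketch of the loop with the cosine evaluated once per element, assuming plain std::vector inputs instead of the MEX array types so that it compiles on its own. The function name computeBoth and the separate output vectors are illustrative choices, not part of the original code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Same arithmetic as the loop in the question, with cos(v3) computed
// once per element and reused. Results go to separate outputs instead
// of overwriting the inputs.
inline void computeBoth(const std::vector<double>& v0,
                        const std::vector<double>& v1,
                        const std::vector<double>& v2,
                        const std::vector<double>& v3,
                        std::vector<double>& out0,
                        std::vector<double>& out1) {
    const std::size_t n = v0.size();  // assumes all inputs have this length
    out0.resize(n);
    out1.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double c3 = std::cos(v3[i]);  // shared by both results
        out0[i] = std::sin(v0[i]) + std::sin(v1[i]) + std::sin(v2[i]) + c3;
        out1[i] = std::sin(v1[i] + v2[i]) + c3;
    }
}
```

This removes one of six trigonometric calls per element, so at best it trims the run time modestly; an optimizing compiler may already perform the same reuse.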
Yifan Lin
on 2 Nov 2022
Bruno Luong
on 2 Nov 2022
Edited: Bruno Luong
on 2 Nov 2022
"I'm guessing this is a compiler choice? Does matlab uses intel compiler that I don't have?"
I have the Intel compiler, so I can test.
But MATLAB can implement this with vectorized arithmetic and multithreading; you could do the same with OpenMP.
There are a few people here who work miracles with MEX programming, James Tursa and Jan Simon to name a few, but I believe they are more C-oriented than C++-oriented.
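As a rough sketch of the OpenMP idea, assuming the data is reachable through raw double pointers (how those pointers are obtained from the MEX arrays is left out here), the loop could be split across threads like this. The function name computeBothParallel is hypothetical.

```cpp
#include <cmath>
#include <cstddef>

// Element-wise kernel on raw pointers. Each iteration is independent,
// so OpenMP can divide the index range among threads. If OpenMP is not
// enabled at compile time, the pragma is ignored and the loop runs
// serially with the same results.
inline void computeBothParallel(const double* v0, const double* v1,
                                const double* v2, const double* v3,
                                double* out0, double* out1,
                                std::ptrdiff_t n) {
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n; ++i) {  // signed index, as older OpenMP versions require
        const double c3 = std::cos(v3[i]);
        out0[i] = std::sin(v0[i]) + std::sin(v1[i]) + std::sin(v2[i]) + c3;
        out1[i] = std::sin(v1[i] + v2[i]) + c3;
    }
}
```

OpenMP has to be switched on with a compiler flag (/openmp for MSVC, -fopenmp for GCC); how that flag is passed through mex depends on the compiler setup.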
Walter Roberson
on 2 Nov 2022
Which distance function are you comparing to?
Yifan Lin
on 2 Nov 2022
Walter Roberson
on 2 Nov 2022
The Mapping Toolbox distance() function is not coded in mex. You can read the MATLAB source code for it. The code converts the angles to radians, and then uses its local function greatcircledist() to compute using the haversine formula, and then does something that I do not recognize at the moment involving atan2() -- at least for the default calculation. There is a different code path if you use some of the options.
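For a sense of how much trigonometry a great-circle distance involves per element, here is a sketch of the textbook haversine formula on a unit sphere, with inputs and result in degrees. It is not a transcription of the Mapping Toolbox source; the function name haversineDeg is hypothetical, and the atan2 here is simply the usual numerically stable way to finish the haversine computation, which may or may not be the atan2 use mentioned above.

```cpp
#include <algorithm>
#include <cmath>

// Textbook haversine great-circle distance on a unit sphere.
// All angles are in degrees; the result is the arc length in degrees.
inline double haversineDeg(double lat1, double lon1,
                           double lat2, double lon2) {
    const double pi = std::acos(-1.0);
    const double toRad = pi / 180.0;
    const double phi1 = lat1 * toRad;
    const double phi2 = lat2 * toRad;
    const double sp = std::sin((lat2 - lat1) * toRad / 2.0);
    const double sl = std::sin((lon2 - lon1) * toRad / 2.0);
    const double a = sp * sp + std::cos(phi1) * std::cos(phi2) * sl * sl;
    // atan2 form of 2*asin(sqrt(a)); the clamp guards against rounding
    // pushing 1 - a slightly below zero.
    const double c = 2.0 * std::atan2(std::sqrt(a),
                                      std::sqrt(std::max(0.0, 1.0 - a)));
    return c / toRad;
}
```

For example, a quarter of the equator, haversineDeg(0, 0, 0, 90), comes out as 90 degrees up to rounding.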
Bruno Luong
on 2 Nov 2022
timeit results for your code with the VS compiler and the Intel oneAPI compiler (2022):
VS_elapsed_time % 0.1795
Intel_elapsed_time % 0.1781
Bruno Luong
on 2 Nov 2022
Edited: Bruno Luong
on 2 Nov 2022
Obviously, the run time of evaluating cos/sin depends on the data.
Comparison between MATLAB and C++ with zero data:
clear
size_test = 1e7;
var1 = zeros(size_test, 1);
var2 = zeros(size_test, 1);
var3 = zeros(size_test, 1);
var4 = zeros(size_test, 1);
cant_beat_me = @() distance(var1,var2,var3,var4);
mex_slow = @() calculate_my_way(var1,var2,var3,var4);
MATLAB_elapsed_time = timeit(cant_beat_me) % 0.0274
Intel_elapsed_time = timeit(mex_slow) % 0.1803
function [out0,out1] = distance(var0, var1, var2, var3)
out0 = sin(var0) + sin(var1) + sin(var2) + cos(var3);
out1 = sin(var1 + var2) + cos(var3);
end
With random data:
clear
size_test = 1e7;
var1 = 2*pi*rand(size_test, 1);
var2 = 2*pi*rand(size_test, 1);
var3 = 2*pi*rand(size_test, 1);
var4 = 2*pi*rand(size_test, 1);
cant_beat_me = @() distance(var1,var2,var3,var4);
mex_slow = @() calculate_my_way(var1,var2,var3,var4);
MATLAB_elapsed_time = timeit(cant_beat_me) % 0.1560
Intel_elapsed_time = timeit(mex_slow) % 0.5101
The factor of
>> 0.5101/0.156
ans =
3.2699
could well be explained by multithreading.
Yifan Lin
on 2 Nov 2022
Bruno Luong
on 2 Nov 2022
Or stay with MATLAB?
Yifan Lin
on 2 Nov 2022
Yifan Lin
on 2 Nov 2022
Bruno Luong
on 3 Nov 2022
Out of curiosity I coded the same calculation in C. The time is 0.24 sec: twice as fast as C++ (0.5 sec) but 60% slower than MATLAB (0.147 sec).
/* mex -g -R2018a calculate_C_way.c */
#include "mex.h"
#include <math.h>
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    int i, n;
    double *var0Iter, *var1Iter, *var2Iter, *var3Iter, *out0Iter, *out1Iter;
    n = mxGetNumberOfElements(prhs[0]);
    plhs[0] = mxCreateNumericMatrix(1, n, mxDOUBLE_CLASS, mxREAL);
    plhs[1] = mxCreateNumericMatrix(1, n, mxDOUBLE_CLASS, mxREAL);
    var0Iter = mxGetDoubles(prhs[0]);
    var1Iter = mxGetDoubles(prhs[1]);
    var2Iter = mxGetDoubles(prhs[2]);
    var3Iter = mxGetDoubles(prhs[3]);
    out0Iter = mxGetDoubles(plhs[0]);
    out1Iter = mxGetDoubles(plhs[1]);
    for (i = 0; i < n; i++) {
        *out0Iter = sin(*var0Iter) + sin(*var1Iter) + sin(*var2Iter) + cos(*var3Iter);
        *out1Iter = sin(*var1Iter + *var2Iter) + cos(*var3Iter);
        out0Iter++;
        out1Iter++;
        var0Iter++;
        var1Iter++;
        var2Iter++;
        var3Iter++;
    }
}
Yifan Lin
on 3 Nov 2022
Accepted Answer
More Answers (1)
Bruno Luong
on 2 Nov 2022
Edited: Bruno Luong
on 2 Nov 2022
I don't know C++ well, but I have quite a lot of practice with C MEX.
It looks like these statements just move a bunch of data:
outputs[0] = std::move(var0);
outputs[1] = std::move(var1);
Also, I wonder whether your inputs 0 and 1 would be changed by
*var0Iter = buffer;
...
*var1Iter = buffer;
after calling the MEX function, which is NOT allowed.
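Whether the inputs actually change depends on how the array type handles shared data. As far as I understand the MATLAB Data API documentation, array copies share their data until one of them is written to, at which point the data is duplicated. If that applies here, writing through var0 and var1 would not alter the caller's variables but would cost two full-array copies. The following toy model shows the copy-on-write idea in plain C++. It is not MATLAB's implementation, and the class CowArray and its members are hypothetical.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy model of copy-on-write: copies share one buffer until a write.
class CowArray {
public:
    explicit CowArray(std::vector<double> values)
        : data_(std::make_shared<std::vector<double>>(std::move(values))) {}

    // Read access never copies.
    double get(std::size_t i) const { return (*data_)[i]; }

    // Write access first detaches from any other owner of the buffer.
    void set(std::size_t i, double value) {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::vector<double>>(*data_);  // deep copy
        }
        (*data_)[i] = value;
    }

    // True if both objects still refer to the same buffer.
    bool sharesBufferWith(const CowArray& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<std::vector<double>> data_;
};
```

In this model the first write to a shared copy pays for a full duplicate of the buffer, which is the kind of hidden cost that writing results into freshly created output arrays avoids.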
2 Comments
Yifan Lin
on 2 Nov 2022
Bruno Luong
on 2 Nov 2022
" Another one of your answer here helped me tremendously a few years back! thank you! "
Oh... really glad to read that...