I'm trying to speedup my Matlab program using GPU, but since the function "knnsearch" does not support GPU, I decided to write my own brute-force function
(Of course there are other options, like writing CUDA kernels, or using existing libraries, but setting them up must take a while, and I'm on windows 10, even setting up matlab to work with mex files is a pain, with errors keep popping up...., so i will stick with the brute-force method for now).
Even though it's just brute-force search, I have seen some great improvements (code runs 2x times faster). Here is my function, along with the profiling results (it's really short)
The function takes in a single query point, and find the K-nearest neighbors among the reference points (all of the points are stored in gpuArray). At first, I thought that the calculations of distances would take the most time, however as you can see in the profiler results, the sorting part was the most time-consuming.
I would like to improve upon the code, by optimizing the part where you find the smallest K elements in an array. Please note that K is very small (K < = 50) whereas the number of data points is very large (>= 10000). I understand that there are algorithms tailored to this specific task (heapsort for example, but you just need the heap bulding part), but I'm not sure how to implement them using GPU and not slowing things down. Can anyone help me? Thank you very much :)