MATLAB Answers

How to efficiently implement k-nearest neighbor algorithm in a gpu?

9 views (last 30 days)
Dang Manh Truong
Dang Manh Truong on 13 Feb 2017
Commented: Hao Zhang on 18 Dec 2018
Hi, here is my gpu information:
>> gpuDevice
ans =
CUDADevice with properties:
Name: 'Quadro M1000M'
Index: 1
ComputeCapability: '5.0'
SupportsDouble: 1
DriverVersion: 8
ToolkitVersion: 7.5000
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 2.1475e+09
AvailableMemory: 1.6919e+09
MultiprocessorCount: 4
ClockRateKHz: 1071500
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
So I would like to implement k-nearest neighbor using gpu. Here is the scope of my problem: for each of the 10000 query points, I need to find k-nearest neighbors among 1 million reference points with k <= 50. At first I thought this would not be difficult. However, when I checked an implementation here: , it says that the number of reference points should not be more than 65536 :( . Why is it so? And what happens if the points cannot all fit into memory? Would the advantage offered by GPU computing be lost then, since we would have to repeatedly transfer data back and forth between memory and the GPU? And what about kd-tree? I've heard that it is highly recursive and therefore not very suitable for gpu. So should i concentrate on brute-force? Please help me,thank you very much. (Currently, I'm using the Approximate Neighbor Search , inside a parfor loop, yeah its just as simple as that, and I have achieved some speedups, but I'm not sure if it would be as fast as I want it to be with 1 million reference points. And I need to do this k-nn search many many times, say 1000 times :( ).


Dang Manh Truong
Dang Manh Truong on 9 Mar 2017
It runs out of memory on the GPU (about 2 millons of data points, dimensions about 3 millions, of course they are sparse)

Sign in to comment.

Answers (0)


Translated by