Neural networks - CUDAKernel/setConstantMemory - the data supplied is too big for constant 'hintsD'
Jacob Townsend
on 12 Dec 2016
Answered: Jacob Townsend
on 19 Feb 2017
I am on R2015a with Parallel Computing Toolbox and Neural Network Toolbox.
I am running the following code on an NVIDIA GeForce GTX 980 Ti GPU:
net1 = feedforwardnet(20);
net1.trainFcn = 'trainscg';
x = inputs(1:4284,2:2000)'; % if I reduce this to 2:1900, it will work
t = double(targets'); % casting to double for GPU
t = t(:,1:4284);
% preparing for GPU
xg = nndata2gpu(x);
tg = nndata2gpu(t);
net1.input.processFcns = {'mapminmax'};
net1.output.processFcns = {'mapminmax'};
net2 = configure(net1,x,t); % Configure with MATLAB arrays
net2 = train(net2,xg,tg);
As you can see, this is not a big dataset. When I run this, it generates this error:
Error using parallel.gpu.CUDAKernel/setConstantMemory
The data supplied is too big for constant 'hintsD'.
Error in nnGPU.codeHints (line 33)
setConstantMemory(hints.yKernel,'hintsD',hints.double);
Error in nncalc.setup2 (line 13)
calcHints = calcMode.codeHints(calcHints);
Error in nncalc.setup (line 17)
[calcLib,calcNet] = nncalc.setup2(calcMode,calcNet,calcData,calcHints);
Error in network/train (line 357)
[calcLib,calcNet,net,resourceText] = nncalc.setup(calcMode,net,data);
gpuDevice is showing this:
Name: 'GeForce GTX 980 Ti'
Index: 1
ComputeCapability: '5.2'
SupportsDouble: 1
DriverVersion: 8
ToolkitVersion: 6.5000
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 6.4425e+09
AvailableMemory: 5.1520e+09
MultiprocessorCount: 22
ClockRateKHz: 1139500
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
As noted in the code above, if I reduce x marginally, it will run.
I don't understand why data of this size would generate a memory error.
Am I missing a step in preparing this for GPU?
Accepted Answer
Amanjit Dulai
on 3 Jan 2017
I was able to reproduce your issue. The best solution is to run the GPU training a different way, by passing the 'useGPU' flag to train instead of converting the data with nndata2gpu. This path does not use constant memory in the same way, so it side-steps the issue. Your example code would look like this:
net1 = feedforwardnet(20);
net1.trainFcn = 'trainscg';
x = inputs(1:4284,2:2000)';
t = double(targets'); % casting to double for GPU
t = t(:,1:4284);
net1.input.processFcns = {'mapminmax'};
net1.output.processFcns = {'mapminmax'};
net1 = train(net1,x,t,'useGPU','yes');
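To check that the 'useGPU' path really runs on the GPU, something like the following sketch may help. It assumes synthetic random data with the same shape as in the question (1999 inputs, 4284 samples) and a single target row, since the real inputs and targets are not shown. The 'showResources' option asks train to print which computing resources it used.

```matlab
% Synthetic stand-ins for the poster's data (assumption, not the real dataset):
% 1999 input features by 4284 samples, and one target row.
x = rand(1999, 4284);
t = rand(1, 4284);

net = feedforwardnet(20);
net.trainFcn = 'trainscg';
net.trainParam.showWindow = false;  % no training GUI
net.trainParam.epochs = 5;          % keep the check short

% Ordinary MATLAB arrays go in; no nndata2gpu conversion is needed.
% 'showResources','yes' prints the resources used for the calculation.
[net, tr] = train(net, x, t, 'useGPU', 'yes', 'showResources', 'yes');

% Simulation can use the GPU the same way; the result comes back as a
% regular MATLAB array.
y = net(x, 'useGPU', 'yes');
disp(size(y))
```

The epoch limit is only there to keep the check quick; remove it for real training.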
More Answers (2)
Joss Knight
on 19 Dec 2016
Constant memory is a special fast read-only cache with 64 KB of space. That's enough to store about 8000 elements of double-precision data. Perhaps you want to use shared memory, which will give you 16 or 48 KB per block depending on your device configuration.
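As a rough check of those figures, the arithmetic can be sketched as below. The 64 KB constant-memory size is the value quoted in this answer, and 49152 is the MaxShmemPerBlock value from the gpuDevice output in the question. The variable names are only illustrative.

```matlab
% Constant memory: 64 KB, as quoted above.
constantBytes  = 64 * 1024;
bytesPerDouble = 8;                       % one double-precision value
maxDoubles = constantBytes / bytesPerDouble;

% Shared memory per block, taken from MaxShmemPerBlock in the question.
sharedBytes = 49152;
sharedKB = sharedBytes / 1024;

fprintf('Doubles that fit in constant memory: %d\n', maxDoubles);
fprintf('Shared memory per block: %d KB\n', sharedKB);
```

So any array passed to setConstantMemory for a double constant has to stay within roughly 8192 elements, however much global memory the card has.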
2 Comments
Joss Knight
on 20 Dec 2016
Sorry, I don't understand this code well enough to know what hints.double is or why it needs to be so large. I'll see if I can get someone who knows to help you.