Numerical instabilities in GPU results

I ran this code:
T=randn(10000,64);
data=randn(1000,64,10);
Tg=gpuArray(T);
datag=gpuArray(data);
res=zeros(10000,1000);
resg=gpuArray(res);
for i=1:10
res=res+T*data(:,:,i)';
end
for i=1:10
resg=resg+Tg*datag(:,:,i)';
end
resg=gather(resg);
norm(res-resg,'fro')/norm(res,'fro')
I would expect "res" (computed on the CPU) and "resg" (computed on the GPU) to be the same, but they are not.
I am running this on a Tesla card:
gpuDevice
ans =
parallel.gpu.CUDADevice handle
Package: parallel.gpu
Properties:
Name: 'Tesla C1060'
Index: 1
ComputeCapability: '1.3'
SupportsDouble: 1
DriverVersion: 3.2000
MaxThreadsPerBlock: 512
MaxShmemPerBlock: 16384
MaxThreadBlockSize: [512 512 64]
MaxGridSize: [65535 65535]
SIMDWidth: 32
TotalMemory: 4.2948e+09
FreeMemory: 4.0671e+09
MultiprocessorCount: 30
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 0
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
Methods, Events, Superclasses

3 Comments

James Tursa on 18 May 2011
I would presume that this is simply the difference in how the BLAS matrix multiply routines are coded on the GPU vs CPU (different blocking, etc). What kind of differences are you seeing?
Felix on 18 May 2011
There are large numerical differences: norm(res-resg,'fro')/norm(res,'fro') returns something on the order of 1e234. These are clearly not subtle BLAS differences. Could something be going wrong when data is moved between the CPU and the GPU?
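One way to test the transfer hypothesis separately from the multiplication is to send a matrix to the GPU, bring it straight back, and compare it with the original. This is only a sketch, assuming the same Parallel Computing Toolbox setup as in the question; the names Tback, roundTripOK and allFinite are introduced here for illustration.

```matlab
% Round-trip check: host -> GPU -> host, with no arithmetic on the device.
T = randn(10000,64);
Tback = gather(gpuArray(T));          % copy to the GPU and straight back

% A pure copy should reproduce every element exactly.
roundTripOK = isequal(T, Tback)

% Also confirm that no NaN or Inf values appeared along the way.
allFinite = all(isfinite(Tback(:)))
```

If both values are 1, the transfer itself is unlikely to be the cause, and the multiplication is the next thing to isolate.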
Gaszton on 19 May 2011
I ran the code on my GT 425M:
ans =
2.4946e-016

Accepted Answer

Felix on 20 May 2011

0 votes

I upgraded to the latest drivers (270.41.19), which seems to have fixed the problem.

1 Comment

James Tursa on 20 May 2011
FYI, it is bad form to accept your own answer when Edric was the one that suggested updating your drivers.

More Answers (1)

Edric Ellis on 19 May 2011

2 votes

I've just run this using R2011a on Linux and Windows using C1060 cards, and in each case the final "norm" calculation gives a result of around 2e-16. So, this should work! Could you post the output of running
parallel.internal.gpu.CUDADriverVersion
and
ver distcomp

4 Comments

Felix on 19 May 2011
I should add that I ran this code on three different devices (two C1060s and a GTX285), all in the same computer, and I get the same discrepancy on all of them, so I suspect it is not a hardware problem.
Edric Ellis on 20 May 2011
Very strange. I've run this on a whole series of different x64 Linux machines here and not seen the problem. That driver is slightly older than the ones we use here, so perhaps you could try updating. Also, do you know whether it's the matrix multiplication that is introducing the problem?
Felix on 20 May 2011
What is your driver version?
When I run this:
T=randn(10000,64);
A=randn(1000,64);
Ag=gpuArray(A);
Tg=gpuArray(T);
res=gather(Tg*Ag');
norm(res-T*A','fro')/norm(T*A','fro')
I get ~1e-16 at first and ~0.05 on repeated runs, so there is a problem in the matrix multiplication.
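Because the error only shows up on repeated runs, it may help to repeat the same product in a loop and record the relative error of each pass. The following is a sketch under the assumption that the inputs stay fixed between passes, so any change in the error comes from the GPU computation rather than from new random data; nRuns, ref and relErr are names introduced here for illustration.

```matlab
T = randn(10000,64);
A = randn(1000,64);
Tg = gpuArray(T);
Ag = gpuArray(A);

ref = T*A';                           % CPU reference, computed once
nRuns = 20;                           % arbitrary number of repetitions
relErr = zeros(nRuns,1);
for k = 1:nRuns
    resg = gather(Tg*Ag');            % same GPU product on every pass
    relErr(k) = norm(resg-ref,'fro')/norm(ref,'fro');
end
relErr                                % expected to stay near 1e-16 on a healthy setup
```

A jump in relErr after the first pass would point at the GPU multiplication (or the driver underneath it) rather than at the data.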
Sean de Wolski on 14 Mar 2012
Copying Felix's first post with the license number censored:
Here it is:
parallel.internal.gpu.CUDADriverVersion
ans =
260.19.26
ver distcomp
-------------------------------------------------------------------------------------
MATLAB Version 7.12.0.635 (R2011a)
MATLAB License Number: ############
Operating System: Linux 2.6.30.10-105.2.23.fc11.x86_64 #1 SMP Thu Feb 11 07:06:34 UTC 2010 x86_64
Java VM Version: Java 1.6.0_17-b04 with Sun Microsystems Inc. Java HotSpot(TM) 64-Bit Server VM mixed mode
-------------------------------------------------------------------------------------
Parallel Computing Toolbox Version 5.1 (R2011a)

Asked: 18 May 2011
