gpuArray slower on newer graphics card in double precision
Matt J
on 31 Jul 2015
Edited: Matt J
on 3 Aug 2015
I have been running the following speed test in R2015a on two different computers with two different graphics cards:
>> A=gpuArray(rand(5e3));
>> T=gputimeit(@()A*A)
The first computer is an older model (Dell Precision T7500) running an older graphics card (GTX 580). The second, newer computer (Dell Precision Tower 7910) is running a newer graphics card (Titan X).
Oddly, I find that the older configuration outperforms the newer by about 20%. The GTX 580 gives T=1.1178 seconds, whereas the Titan X gives T=1.3097 seconds. When I redo the test in single precision,
>> A=gpuArray(rand(5e3,'single'));
>> T=gputimeit(@()A*A)
the results are more in line with my expectations. The GTX 580 gives T=0.2121 seconds, whereas the Titan X gives T=0.0491 seconds.
I'm wondering what could account for this difference. One thing that might be worth mentioning is that the Titan X is not using a fully updated driver. At the time of this writing, there is some bug in its newest driver release, making it unusable, and I am instead using driver version 353.62. Could this be the reason? If not, any other ideas?
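For anyone who wants to reproduce the comparison, the two tests above can be combined into one script. This is only a sketch: it assumes Parallel Computing Toolbox and a supported NVIDIA GPU, and the variable names (`Ad`, `As`, `Td`, `Ts`) are illustrative rather than taken from the original test.

```matlab
% Sketch: time A*A in double and single precision on the current GPU.
% Assumes Parallel Computing Toolbox and a supported NVIDIA GPU.
n = 5e3;

dev = gpuDevice;   % currently selected GPU
fprintf('Device: %s (compute capability %s)\n', dev.Name, dev.ComputeCapability);

Ad = gpuArray(rand(n));             % double-precision input
As = gpuArray(rand(n, 'single'));   % single-precision input

Td = gputimeit(@() Ad*Ad);          % seconds, double precision
Ts = gputimeit(@() As*As);          % seconds, single precision

fprintf('double: %.4f s, single: %.4f s, double/single = %.1f\n', Td, Ts, Td/Ts);
```

Running the same script on both machines gives directly comparable numbers, and the last column shows how much each card slows down when moving from single to double precision.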
Accepted Answer
Brendan Hamm
on 3 Aug 2015
The Titan X is a poor choice for double-precision GPGPU work: it was designed as a cheaper alternative to the other Titans, with a focus on single precision (gaming). On the Maxwell chips, double-precision GFLOPS is about 1/32 of single-precision GFLOPS. Compare that with the Fermi architecture used in the GTX 580, where double precision runs at about 1/5 of the single-precision rate. If you intend to use the card for double precision, I would highly recommend the Titan Z (or Titan Black), both of which use the Kepler architecture. So if you have a Titan Black, switching to it would not be rolling back at all, but rather using a card designed with double precision in mind.
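As a rough sanity check, the timings reported in the question can be converted into achieved GFLOPS. The sketch below assumes the usual approximation of 2n^3 floating-point operations for a dense n-by-n matrix product; the struct `T` and its field names are made up for illustration, and the results are measured throughput, not the cards' theoretical peaks.

```matlab
% Sketch: achieved GFLOPS from the timings reported in the question.
% Assumes ~2*n^3 floating-point operations for a dense n-by-n product.
n = 5e3;
flops = 2*n^3;

% Reported gputimeit results in seconds (field names are illustrative).
T.gtx580 = struct('double', 1.1178, 'single', 0.2121);
T.titanx = struct('double', 1.3097, 'single', 0.0491);

cards = fieldnames(T);
ratio = zeros(1, numel(cards));   % double time / single time, per card
for k = 1:numel(cards)
    t = T.(cards{k});
    gflopsDouble = flops / t.double / 1e9;
    gflopsSingle = flops / t.single / 1e9;
    ratio(k) = t.double / t.single;
    fprintf('%s: %.0f GFLOPS double, %.0f GFLOPS single, slowdown %.1fx\n', ...
        cards{k}, gflopsDouble, gflopsSingle, ratio(k));
end
```

With these inputs the slowdown from single to double comes out at roughly 5.3x on the GTX 580 and roughly 26.7x on the Titan X, which is in the same ballpark as the 1/5 and 1/32 ratios quoted above.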
1 Comment
Brendan Hamm
on 3 Aug 2015
Edited: Brendan Hamm
on 3 Aug 2015
For single-precision work, the Titan X is the card to use, so it looks like you made a good choice. It does have fewer cores than the Titan Z, but a higher clock rate and a lower price point.