Upgrading PC, tips for finding bottlenecks?

Joshua Lauzier on 28 Nov 2016
Commented: Walter Roberson on 29 Nov 2016
Hello,
We're currently running a simulation of a grid of resistors. The main portion of it is solving a large (~1 million x 1 million or bigger) sparse SPD system with a preconditioned conjugate gradient (PCG) method, many times over (once per time step).
Is there a way to check whether memory bandwidth is an issue? We're currently looking to get an i7 6700K, but that would restrict us to dual channel RAM. I think this is probably fine, but it would be nice to confirm before we buy any hardware.
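One rough way to approach the bandwidth question is a back-of-the-envelope estimate of how many bytes a single CG iteration has to stream from memory, divided by the peak memory bandwidth. The sketch below is in Python rather than MATLAB, and every number except the matrix size is an assumption made for illustration: 5 nonzeros per row (a 2-D grid with a 5-point stencil), 64-bit indices, about 10 vector passes per iteration, and dual-channel DDR4-2133. Preconditioner traffic is not included.

```python
# Rough lower bound on time per CG iteration if memory bandwidth is the limit.
# All numbers below are assumptions for illustration, not measurements.

n = 1_000_000            # unknowns (matrix is n x n), from the question
nnz_per_row = 5          # assumed: 2-D resistor grid, 5-point stencil
value_bytes = 8          # double-precision matrix entries
index_bytes = 8          # assumed 64-bit indices
vector_bytes = 8         # double-precision vector entries

# Sparse matrix-vector product: stream every stored value and its index once,
# plus one pointer array of length n + 1.
nnz = n * nnz_per_row
matrix_bytes = nnz * (value_bytes + index_bytes) + (n + 1) * index_bytes

# Assumed vector traffic per iteration: about 10 reads/writes of length-n
# vectors (SpMV input/output, two dot products, three axpy-style updates).
vector_passes = 10
vector_traffic = vector_passes * n * vector_bytes

bytes_per_iter = matrix_bytes + vector_traffic

# Assumed peak bandwidth: dual-channel DDR4-2133
# = 2 channels * 8 bytes per transfer * 2133e6 transfers per second.
peak_bandwidth = 2 * 8 * 2133e6   # bytes per second

min_seconds_per_iter = bytes_per_iter / peak_bandwidth
print(f"traffic per iteration: {bytes_per_iter / 1e6:.1f} MB")
print(f"bandwidth-bound time per iteration: {min_seconds_per_iter * 1e3:.2f} ms")
```

If the measured time per iteration on the current machine is close to this kind of bound, memory bandwidth is probably the limiting factor. If it is many times larger, something else (latency, the preconditioner, overhead) is more likely to dominate.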
Also, MATLAB's sparse library requires double precision. However, when looking at GPUs, it seems that they're optimized for single precision. When MATLAB is doing GPU calculations, does it use pseudo double precision (which would run at roughly 1/32 of single-precision speed)?
In addition, is there a way to check whether the PCG method would work in single precision? It would introduce error, but we'd like to know whether the amount is acceptable. If the error is too large, we'd have to stick to CPU calculations rather than GPU. From what I've found online, most sparse libraries work in double precision, and any error from forcing single precision is at your own risk.
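One way to get a feel for the single-precision question is to run the same CG iteration twice on a small model problem, once in double precision and once with every arithmetic result rounded to single, and compare the solutions. The sketch below is in Python rather than MATLAB and rests on stated assumptions: it uses plain CG with no preconditioner, and a 1-D chain of unit resistors (tridiagonal matrix with 2 on the diagonal and -1 off it) stands in for the real grid. The helper names `f32`, `apply_A`, `dot`, and `cg` are made up for this example.

```python
import struct

def f32(x):
    """Round a Python float (a double) to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def identity(x):
    """No extra rounding: keep full double precision."""
    return x

def apply_A(x, rnd):
    """y = A x for the tridiagonal SPD matrix with 2 on the diagonal, -1 off it."""
    n = len(x)
    y = []
    for i in range(n):
        s = rnd(2.0 * x[i])
        if i > 0:
            s = rnd(s - x[i - 1])
        if i < n - 1:
            s = rnd(s - x[i + 1])
        y.append(s)
    return y

def dot(a, b, rnd):
    """Dot product with every multiply and add rounded by rnd."""
    s = 0.0
    for ai, bi in zip(a, b):
        s = rnd(s + rnd(ai * bi))
    return s

def cg(b, rnd, tol, max_iter):
    """Plain conjugate gradient, starting from x = 0, all operations rounded by rnd."""
    x = [0.0] * len(b)
    r = [rnd(v) for v in b]
    p = list(r)
    rs = dot(r, r, rnd)
    b_norm = rs ** 0.5
    for _ in range(max_iter):
        if rs ** 0.5 <= tol * b_norm:
            break
        Ap = apply_A(p, rnd)
        alpha = rnd(rs / dot(p, Ap, rnd))
        x = [rnd(xi + rnd(alpha * pi)) for xi, pi in zip(x, p)]
        r = [rnd(ri - rnd(alpha * api)) for ri, api in zip(r, Ap)]
        rs_new = dot(r, r, rnd)
        beta = rnd(rs_new / rs)
        p = [rnd(ri + rnd(beta * pi)) for ri, pi in zip(r, p)]
        rs = rs_new
    return x

n = 50
b = [1.0] * n
x64 = cg(b, identity, 1e-12, 10 * n)   # double-precision reference
x32 = cg(b, f32, 1e-6, 10 * n)         # emulated single precision

scale = max(abs(v) for v in x64)
rel_err = max(abs(a - c) for a, c in zip(x32, x64)) / scale
print(f"max relative difference, single vs double: {rel_err:.2e}")
```

The error seen on a toy problem like this will not carry over directly to a 1 million x 1 million system, since the achievable accuracy in single precision tends to degrade as the condition number grows. Repeating the comparison on a reduced version of the real matrix would be more informative.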

Answers (1)

Walter Roberson on 28 Nov 2016
"When MATLAB is doing GPU calculations, does it use pseudo double precision"
No. The Parallel Computing Toolbox requires GPUs with enough compute capability to handle double themselves. The precision it uses for a GPU operation is whatever precision is associated with the data to be processed. If your GPU only does double precision slowly then the result will be slow. MATLAB makes no attempt to emulate double precision with single precision.
"However, when looking at GPUs, it seems that they're optimized for single precision."
Since you are looking at getting a new system with a GPU, you should consider one with a card from the new Pascal architecture series, which offers much higher double-precision performance than previous architectures did. The new series is available to some supercomputer and academic sites in the USA, and is expected to be released to the public in January.
Caution: there are some operations that the Pascal architecture library does not yet handle properly, due to bugs on the NVIDIA side. One of the major affected algorithms is Convolutional Neural Networks. I have no information about whether PCG works on Pascal or not.
Looking at https://www.mathworks.com/matlabcentral/answers/285134-preconditioning-algorithm-on-gpu-for-solution-of-sparse-matrices and following the links to http://docs.nvidia.com/cuda/cusparse/#cusparse-lt-t-gt-csrsmsolve I see that NVIDIA does implement cuSPARSE for single precision as well as double precision.
  5 Comments
Walter Roberson on 29 Nov 2016
Working notes, specs put together from various sources
  • Titan X (Maxwell) - Maxwell. GM200 GPU. FP64 = 200 GFLOPS
  • Titan Z - Kepler. FP64 = 2700 GFLOPS
  • Titan Black - Kepler. FP64 = 1707 GFLOPS
  • Titan X (Pascal) - Pascal. GP102 GPU. FP32 = 11 TFLOPS, FP64 = 1/32 of FP32, so should be about 330 - 343 GFLOPS
  • Tesla P100 - Pascal. FP64 = 4.7 TFLOPS
"While the Titan X [Pascal] has 40 percent more cores, 50 percent more memory and significantly more memory bandwidth than a GTX 1080, its clock speed is lower and it is more limited by power and thermals, all of which eats into its advantage. "
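The arithmetic behind the Titan X (Pascal) figure, and how the cards in the list compare on FP64, can be sketched as follows. This is a Python snippet for illustration only: the numbers are copied from the working notes above and are not independently verified, and the dictionary name is made up for the example.

```python
# FP64 throughput in GFLOPS, copied from the working notes above.
fp64_gflops = {
    "Titan X (Maxwell)": 200.0,
    "Titan Z (Kepler)": 2700.0,
    "Titan Black (Kepler)": 1707.0,
    "Tesla P100 (Pascal)": 4700.0,
}

# Titan X (Pascal): derived from the quoted FP32 figure and the 1/32 FP64 ratio.
titan_x_pascal_fp32 = 11_000.0                                 # GFLOPS
fp64_gflops["Titan X (Pascal)"] = titan_x_pascal_fp32 / 32     # 343.75 GFLOPS

# Print the cards from fastest to slowest, relative to the Titan X (Pascal).
baseline = fp64_gflops["Titan X (Pascal)"]
ranked = sorted(fp64_gflops.items(), key=lambda kv: kv[1], reverse=True)
for name, gflops in ranked:
    print(f"{name:22s} {gflops:8.1f} GFLOPS  ({gflops / baseline:.2f}x Titan X Pascal)")
```

These are peak figures. A sparse solver that is limited by memory bandwidth may see a much smaller difference between cards than the peak FP64 ratios suggest.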
Walter Roberson on 29 Nov 2016
So, uh, yes, Titan Z (Kepler) or Titan Black (Kepler) would beat Titan X (Pascal) handily for FP64.
Looking at these specs, and at the prices, it almost looks to me as if the most cost-effective option would be to put in two slower cards, like two of the older Titan X Maxwell (GM200) cards. The disadvantage would be the need to partition the work between the two GPUs. Hmmm, possibly your particular application is not suited to that. If you could distribute the work, then a tower of $200 graphics cards might be the most FP64 for the dollar. (Only one tower; after that you would need to go for MATLAB Distributed Computing Server (MDCS) licenses.)
