Why do some calculations like the FFT produce different results when performed on a GPU?
I am using the Parallel Computing Toolbox and my computer's GPU to speed up calculations, but sometimes the results are not identical. For example, the FFT produces different results. Why is that?
8 comments
Mark Shore
on 8 Feb 2011
Can you give more details (operating system, MATLAB version, CUDA version and GPU hardware), as well as how, exactly, you are determining inconsistent results?
Karl
on 9 Feb 2011
Moshe Tur
on 21 Oct 2022
Hi,
I have an urgent, somewhat related question:
I am in the process of purchasing a GPU card to accelerate my floating-point, double-precision computations (actually double-precision COMPLEX numbers).
Which of these is recommended: Nvidia 3090Ti, A5000, or A6000?
Tnx, Prof. Moshe Tur
Moshe Tur
on 21 Oct 2022
Apologies for using the word 'urgent': I read the guidelines only AFTER I posted my message. Moshe
Walter Roberson
on 21 Oct 2022
https://develop3d.com/workstations/nvidia-quadro-rtx-a6000-gpu-launches-with-2x-performance-boost/
"The Nvidia RTX A6000 is focussed very much on graphics. It does not have accelerated double precision performance, which is important for applications including engineering simulation. Nvidia told DEVELOP3D that for customers that need double precision there’s the Quadro GV100 or Nvidia A100 for the data centre."
Walter Roberson
on 21 Oct 2022
https://askgeek.io/en/gpus/NVIDIA/RTX-A5000
The A5000's double-precision throughput is 1/32 of its single-precision throughput, which is the worst ratio Nvidia makes.
Walter Roberson
on 21 Oct 2022
I am having trouble finding explicit statements, but single- and double-precision figures are given in this link, and the double-precision throughput is 1/32 of the single-precision throughput: https://www.electroniclinic.com/nvidia-geforce-rtx-3090-ti-complete-review-with-benchmarks/
Walter Roberson
on 21 Oct 2022
In summary of the above three comments: none of the three cards is well suited to double-precision work.
Accepted Answer
More Answers (2)
Edric Ellis
on 8 Feb 2011
7 votes
Walter is quite right that any change to the order of operations changes the result. In the case of FFT on the GPU, we use NVIDIA's cuFFT library to provide a high-performance FFT implementation. The highly threaded, communicating nature of FFT on the GPU inevitably leads to discrepancies.
In general, we strive to make our GPU algorithms give the numerically consistent "MATLAB answer". For many of the elementwise non-communicating algorithms (sin, cos, plus, ...), we achieve that (within an "eps" or maybe two); but as the complexity of the algorithm increases, so does the discrepancy. (For example, the parallel version of "sum" on the GPU is a vastly different implementation compared to the obvious single-threaded approach).
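The "sum" point above can be illustrated outside MATLAB. Below is a small Python sketch (Python used purely for illustration; the names `sequential_sum` and `tree_sum` are my own, not MATLAB or cuFFT functions). A parallel reduction typically adds elements in a pairwise tree rather than left to right, and in double precision the two orders generally round differently, so the results disagree by a few eps even though both are "correct".

```python
import random

def sequential_sum(xs):
    """Left-to-right accumulation, like a naive single-threaded loop."""
    total = 0.0
    for x in xs:
        total += x
    return total

def tree_sum(xs):
    """Pairwise (tree) reduction, the shape a parallel sum often takes."""
    if len(xs) == 1:
        return xs[0]
    mid = len(xs) // 2
    return tree_sum(xs[:mid]) + tree_sum(xs[mid:])

random.seed(0)
data = [random.uniform(-1.0, 1.0) for _ in range(1 << 16)]
a = sequential_sum(data)
b = tree_sum(data)
# The two totals agree to roughly machine precision but are usually
# not bit-identical, because the rounding happens in a different order.
print(a - b)
```

The same grouping effect is why a GPU `sum` can differ from the CPU answer while both remain within a few eps of the exact value.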
Walter Roberson
on 8 Feb 2011
3 votes
Nearly any change in the exact order of operations used to perform a calculation can result in different outcomes due to precision or round-off limitations.
If exact reproducibility of the calculation across different implementations is important, then you very likely should not be using the Parallel Computing Toolbox -- not unless you have studied numerical analysis for a few years.
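The order-of-operations point above can be shown in three lines; here is a minimal sketch (in Python rather than MATLAB, but IEEE 754 double precision behaves identically in both). Floating-point addition is not associative, so merely regrouping the same operands changes the rounded result:

```python
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c   # rounds 0.1 + 0.2 first
right = a + (b + c)  # rounds 0.2 + 0.3 first
# The two groupings round differently, so the results are unequal
# even though both are within one ulp of the exact sum 0.6.
print(left == right)  # False
```

A parallel implementation is, in effect, a large-scale regrouping of exactly this kind, which is why bit-for-bit agreement with a serial implementation cannot be expected.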