
Error when processing on HPC: Unable to allocate space for the FFT calculation. This might be due to insufficient memory on the GPU.

Hello,
Error message: Unable to allocate space for the FFT calculation. This might be due to insufficient memory on the GPU.
I received this error message while processing multiple images on a Slurm server. The code uses both GPU and multi-core computing. The for loop over the images is not parallelized; within each image, the cores work together to produce the result for that single image.
The error appears after around 4000 images have been processed. I tried clearing all variables after each image and resetting the GPU device every 2000 images, but the error still occurs.
The error stops the calculation, yet the server reports a return code of 0 (which means a normal exit on the server).
Please help.

Accepted Answer

Joss Knight on 5 Mar 2023
At a guess, you are trying to share one GPU between multiple workers in a pool. Depending on how work is scheduled, one or two workers may have allocated all the GPU memory, leaving none for the others.
Options:
  • Reduce the size of the pool
  • If on R2022b or later, try setting the gpuDevice CachePolicy to "minimum"
  • Place your code inside a try...catch block and ignore out-of-memory errors, or fall back to the CPU if the GPU errors
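The second and third options could look roughly like the sketch below. It is not the original poster's code: `img` is a hypothetical stand-in for one image, `fft2` stands in for whatever FFT call the real pipeline makes, and the `catch` is deliberately broad because the exact error identifier is not given in this thread.

```matlab
% Hypothetical stand-in for one image; real code would load its data here.
img = rand(256, 256, "single");

spectrum = [];
if canUseGPU()
    dev = gpuDevice;
    % CachePolicy is only expected on newer releases (R2022b or later per
    % the answer above), so check for the property before setting it.
    if isprop(dev, "CachePolicy")
        dev.CachePolicy = "minimum";
    end
    try
        % Attempt the FFT on the GPU and bring the result back to the host.
        spectrum = gather(fft2(gpuArray(img)));
    catch err
        % Broad catch (assumption): any GPU failure triggers the CPU path.
        warning("example:gpuFallback", ...
            "GPU FFT failed (%s); falling back to the CPU.", err.message);
    end
end

if isempty(spectrum)
    % CPU fallback: same transform, computed in host memory.
    spectrum = fft2(img);
end
```

Falling back to the CPU keeps a long batch job running when one worker finds the GPU full, at the cost of a slower result for that image.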
  5 Comments
Joss Knight on 6 Mar 2023
The ifft is computed in a data-parallel way, but there is no overlap between the computations run on different workers that share a GPU. Some overheads will be reduced, but the main gain you see will come from having 4 GPUs.
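Since workers sharing a GPU do not overlap their computations, one way to reduce the pool is to cap it at the number of available GPUs. The following is a minimal sketch under that assumption; `requestedWorkers` is a hypothetical value, not a figure from this thread.

```matlab
% Number of GPUs this MATLAB session can use.
nGPUs = gpuDeviceCount("available");

% Hypothetical pool size the job would otherwise have requested.
requestedWorkers = 16;

% Cap the pool at one worker per GPU, but keep at least one worker.
nWorkers = max(1, min(requestedWorkers, nGPUs));
fprintf("Using %d worker(s) for %d available GPU(s).\n", nWorkers, nGPUs);

% Start a pool of that size only if none is running yet.
if isempty(gcp("nocreate"))
    parpool(nWorkers);
end
```

With one worker per GPU, no worker has to compete with another for memory on the same device.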
Xiguang Zhang on 10 Mar 2023
Thanks for the information. The error does not show up after reducing the number of workers.

Version

R2022b
