Overcoming VRAM limitations on Nvidia A100
I have access to a cluster with several Nvidia A100 40GB GPUs. I am training a deep learning network on these GPUs, but trainNetwork() only uses around 10 GB of each GPU's VRAM. I believe this is a limitation of Nvidia CUDA, see here.
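For reference, one way to confirm how much memory the process can actually see, and the usual knob for filling it, is something like the sketch below (the MiniBatchSize value is just a placeholder to tune):

% Query the device MATLAB is currently using
g = gpuDevice;
fprintf('Total: %.1f GB, available: %.1f GB\n', ...
    g.TotalMemory/1e9, g.AvailableMemory/1e9);

% Larger mini-batches are the usual way to occupy unused VRAM
opts = trainingOptions('adam', ...
    'ExecutionEnvironment', 'gpu', ...
    'MiniBatchSize', 256);   % placeholder; raise until memory runs out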
I have two related questions:
- Other cluster users are writing in Python with the 'DistributedDataParallel' module in PyTorch and are able to load 40 GB of data (beyond the apparent CUDA limitation) onto the GPUs; is there a similar workaround for MATLAB? (See the first sketch after this list.)
- If not, is there any way to use Multi-Instance GPU (MIG), i.e. essentially split the physical card into several smaller virtual GPUs and compute in parallel? (See the second sketch after this list.)
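On the first question, the closest built-in analogue to DistributedDataParallel is data-parallel training via trainingOptions; a minimal sketch, assuming XTrain, YTrain, and layers are already defined:

% Data-parallel training: each worker drives one local GPU and gets a
% slice of every mini-batch, so MiniBatchSize is the total across GPUs
opts = trainingOptions('sgdm', ...
    'ExecutionEnvironment', 'multi-gpu', ...  % one worker per local GPU
    'MiniBatchSize', 256);                    % placeholder; split across GPUs
net = trainNetwork(XTrain, YTrain, layers, opts);

Using 'parallel' instead of 'multi-gpu' extends the same idea to a cluster pool.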
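On the second question, MIG partitioning itself happens at the driver level, outside MATLAB. Assuming the slices have already been created and each one enumerates to a worker as its own CUDA device (in practice this may require setting CUDA_VISIBLE_DEVICES per worker), independent jobs could be farmed out roughly like this:

n = gpuDeviceCount;           % each visible MIG slice counts as a device
pool = parpool('local', n);   % one process worker per slice
spmd
    gpuDevice(labindex);      % bind this worker to its own slice
    % ... run an independent training or inference job here ...
end
delete(pool);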
Ideally I would like to speed up computation, so having three quarters of the VRAM sit empty when it could otherwise hold larger mini-batches is a little heartbreaking.
