Improve GPU utilization during regression deep learning

Question

Adam Shaw on 11 Apr 2023

Direct link to this question

https://de.mathworks.com/matlabcentral/answers/1945344-improve-gpu-utilization-during-regression-deep-learning

Commented: Joss Knight on 7 May 2023

I'm having trouble improving GPU utilization on what I think is a fairly straightforward deep learning example, and I wonder whether I'm clearly doing something wrong. I'm not an expert in this field, so I'm not sure exactly what information is most relevant to provide.

I'm using a 3090 GPU. The neural net architecture is a few fully connected layers, each with ~100 neurons. The input is a featureInput with 3 features and ~20k data points, going to one regression output.

The relatively sparse training options are as follows:

options = trainingOptions("adam", ...
                        MaxEpochs=500, ...
                        Shuffle="every-epoch", ...
                        InitialLearnRate=0.001,...
                        MiniBatchSize=128);

However, when I train the network, I only reach ~10% GPU utilization. I assume I'm being bottlenecked by some other step of the process.

My ultimate goal is to train the model hundreds of times, each with a different choice of initial data. So although my input data is relatively small (which is perhaps what leads to the bottleneck?), I'm hoping to find some way to parallelize multiple trainings on the same GPU. Is this possible, or is there something else I've clearly overlooked when it comes to improving utilization?

1 Comment

Joss Knight on 12 Apr 2023

What is your data? What does the MATLAB Profiler say about where time is being spent? Have you tried to maximize the MiniBatchSize to improve throughput?
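
As a rough sketch of the profiling step suggested here, a shortened training run can be wrapped in the MATLAB Profiler. The synthetic data, the exact layer stack, and the reduced epoch count are assumptions standing in for the real problem, which the question does not show.

```matlab
% Synthetic stand-in data (assumption): 20000 observations, 3 features, 1 response.
rng(0);
X = rand(20000, 3, "single");
Y = sum(X.^2, 2);

% Small fully connected regression network, roughly as described in the question.
layers = [
    featureInputLayer(3)
    fullyConnectedLayer(100)
    reluLayer
    fullyConnectedLayer(100)
    reluLayer
    fullyConnectedLayer(1)
    regressionLayer];

% Same options as in the question, but only a few epochs so the profile stays short.
options = trainingOptions("adam", ...
    MaxEpochs=5, ...
    Shuffle="every-epoch", ...
    InitialLearnRate=0.001, ...
    MiniBatchSize=128, ...
    Verbose=false);

% Record where time is spent during training.
profile on
net = trainNetwork(X, Y, layers, options);
profile off

% Open the interactive report and sort by self time to find the hot spots.
profile viewer
```

If most of the time shows up in per-iteration bookkeeping rather than in the network computation itself, that would point to overhead from many small mini-batches rather than to the GPU being the limit.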


Answer 1

Aishwarya Shukla on 2 May 2023

Direct link to this answer

https://de.mathworks.com/matlabcentral/answers/1945344-improve-gpu-utilization-during-regression-deep-learning#answer_1227194

Hi @Adam Shaw

It's hard to say exactly what's causing the low GPU utilization without more information, but here are a few potential issues to consider:

Batch size: With a mini-batch size of 128, it's possible that your GPU is underutilized because the batches are too small to fully occupy the GPU. You could try increasing the batch size to see if that improves GPU utilization.
Data loading: If your data loading process is slow, the GPU may sit idle waiting for data during training, leading to low utilization. Consider keeping the data in memory or pre-loading it onto the GPU to reduce this overhead.
Model complexity: Your neural network may not be complex enough to fully utilize the GPU. Consider adding more layers or increasing the number of neurons per layer to see if that improves GPU utilization.
Other system constraints: It's possible that your GPU is being bottlenecked by other system constraints, such as CPU or memory bandwidth. You can monitor these metrics during training to see if they are limiting GPU utilization.
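
To check the batch-size point empirically, one option is to time a short training run at several mini-batch sizes. The following is a minimal sketch, assuming synthetic data and an illustrative layer stack; the batch sizes tried and the epoch count are arbitrary choices, not values from the question.

```matlab
% Synthetic stand-in data (assumption): 20000 observations, 3 features, 1 response.
rng(0);
X = rand(20000, 3, "single");
Y = sum(X.^2, 2);

layers = [
    featureInputLayer(3)
    fullyConnectedLayer(100)
    reluLayer
    fullyConnectedLayer(100)
    reluLayer
    fullyConnectedLayer(1)
    regressionLayer];

% Mini-batch sizes to compare (illustrative values).
batchSizes = [128 1024 4096];
elapsed = zeros(size(batchSizes));

for k = 1:numel(batchSizes)
    options = trainingOptions("adam", ...
        MaxEpochs=5, ...
        Shuffle="every-epoch", ...
        InitialLearnRate=0.001, ...
        MiniBatchSize=batchSizes(k), ...
        Verbose=false);

    % Time the whole training call for this mini-batch size.
    tic
    trainNetwork(X, Y, layers, options);
    elapsed(k) = toc;
end

% Summarize seconds per run for each mini-batch size.
disp(table(batchSizes(:), elapsed(:), VariableNames=["MiniBatchSize" "Seconds"]))
```

The first run may include one-time GPU start-up cost, so it can be worth discarding or repeating. A larger mini-batch also changes the optimization itself (fewer updates per epoch), so the learning rate or epoch count may need retuning before comparing accuracy.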

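Regarding parallel training, it is possible to train multiple models simultaneously on the same GPU, for example by running several training processes at once. Python libraries such as PyTorch's DistributedDataParallel or TensorFlow's MirroredStrategy are sometimes mentioned in this context, but they are designed to spread the training of one model across several GPUs rather than to pack several models onto one. Keep in mind that training multiple models on the same GPU increases memory usage, which can lead to out-of-memory errors or slower training.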

1 Comment

Joss Knight on 7 May 2023

Or perhaps, since you're using MATLAB, not Python, use MATLAB to train multiple models, as described in our documentation.

Even better, use the Experiment Manager app, which is specifically designed to help with this.
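
For illustration, training several networks at once from MATLAB could look roughly like the parfor sketch below. It requires Parallel Computing Toolbox. The worker count, the number of runs, the synthetic per-run data, and the layer stack are all assumptions; to my understanding, on a single-GPU machine the pool workers share that GPU, so the worker count is limited by GPU memory.

```matlab
numRuns = 8;       % the question mentions hundreds of runs; kept small here
numWorkers = 4;    % assumption: tune to what GPU memory allows

% Start a parallel pool only if none is open yet.
if isempty(gcp("nocreate"))
    parpool(numWorkers);
end

layers = [
    featureInputLayer(3)
    fullyConnectedLayer(100)
    reluLayer
    fullyConnectedLayer(100)
    reluLayer
    fullyConnectedLayer(1)
    regressionLayer];

options = trainingOptions("adam", ...
    MaxEpochs=5, ...
    Shuffle="every-epoch", ...
    InitialLearnRate=0.001, ...
    MiniBatchSize=128, ...
    Verbose=false);

% Each iteration trains an independent network on its own data set.
nets = cell(numRuns, 1);
parfor r = 1:numRuns
    rng(r);                           % different synthetic data per run (assumption)
    X = rand(20000, 3, "single");
    Y = sum(X.^2, 2);
    nets{r} = trainNetwork(X, Y, layers, options);
end
```

Because each small training run leaves most of the GPU idle, running several of them side by side is one way to raise overall throughput, at the cost of more GPU memory per additional worker.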
