Why do I get CUDA execution errors when training my network on a GPU?

Why do I get the following error when training my neural network:
An unexpected error occurred during CUDA execution. The CUDA error was:
all CUDA-capable devices are busy or unavailable
The error occurs only when training on a GPU, not on the CPU.

Accepted Answer

MathWorks Support Team on 19 May 2021
Edited: MathWorks Support Team on 19 May 2021
The most likely cause is a kernel execution timeout.
To confirm this, try running a few simple gpuArray commands, such as:
A = gpuArray(rand(10))
B = A+1
If the above runs without any warnings or errors, the GPU itself is usable, and the training error is likely due to kernel execution timeouts.
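As a rough sketch, the small test above can be combined with a look at the device properties. This assumes Parallel Computing Toolbox is installed; `KernelExecutionTimeout` and `ComputeMode` are properties of the object returned by `gpuDevice`, while the variable names `dev` and `smallTestPassed` are only illustrative.

```matlab
% Sketch: check whether the selected GPU enforces a kernel execution
% timeout and whether a small gpuArray computation runs cleanly.
smallTestPassed = false;   % illustrative flag, not part of any MATLAB API

if gpuDeviceCount == 0
    disp('No supported GPU detected.');
else
    try
        dev = gpuDevice;   % currently selected GPU
        fprintf('Device: %s\n', dev.Name);
        fprintf('Compute mode: %s\n', dev.ComputeMode);
        % 1 means the operating system may abort long-running kernels.
        fprintf('Kernel execution timeout enabled: %d\n', ...
            dev.KernelExecutionTimeout);

        A = gpuArray(rand(10));
        B = A + 1;
        wait(dev);         % block until the GPU has finished the work
        smallTestPassed = true;
        disp('Small gpuArray test succeeded.');
    catch err
        % A failure here suggests a more general GPU or driver problem.
        fprintf('Small gpuArray test failed: %s\n', err.message);
    end
end
```

If the small test passes and the timeout is reported as enabled, long-running training kernels are a plausible cause of the error.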
Some possible workarounds:
  1. Scale down your problem so that it does not time out (e.g. use a smaller network or less data), or use a different card that does not time out.
  2. Some GPUs allow the driver to be switched to a compute-only mode (TCC, Tesla Compute Cluster), but others do not. Check whether your GPU allows changing to that mode.
  3. On Windows, another possible workaround is to modify the registry to increase the TDR (Timeout Detection and Recovery) delay value.
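One way to try the first workaround, sketched here under the assumption that the network is trained with Deep Learning Toolbox, is to lower the mini-batch size so that each GPU kernel has less work to do. The solver and the value 32 are placeholders, not recommendations from this answer, and a smaller mini-batch may not be enough on its own to avoid the timeout.

```matlab
% Sketch: scale down the per-iteration workload by using a smaller
% mini-batch than the default of 128. The value 32 is a placeholder.
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 32, ...
    'Verbose', true);

% Hypothetical usage; XTrain, YTrain and layers stand in for your data
% and network definition:
% net = trainNetwork(XTrain, YTrain, layers, options);
```

If training then completes on the GPU, the original error was probably related to the size of the problem rather than to the device itself.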


Version

R2020b
