GPU problem CUDA_ERROR_UNKNOWN

Peter on 4 Jan 2017
Moved: Matt J on 30 Mar 2023
I'm running a MATLAB simulation code that uses an iterative matrix equation solver. The solver is called on the GPU every few time steps in a time-stepping loop. This goes well for some dozens of time steps (although the computations gradually slow down...) until the screen goes black for an instant and the simulation crashes with the following error message:
Error using gpuArray/subsasgn
An unexpected error occurred during CUDA execution. The CUDA error was:
CUDA_ERROR_UNKNOWN
After this, MATLAB no longer recognizes the GPU device: the command
gpuDevice
results in:
Error using gpuDevice (line 26)
An unexpected error occurred trying to retrieve CUDA device properties. The CUDA error was:
CUDA_ERROR_UNKNOWN
Restarting MATLAB is not sufficient to restore the GPU. Restarting the PC is.
I'm running MATLAB R2016b on Windows 10, using an Nvidia TITAN X (Pascal) GPU with the newest driver installed.
Do the above symptoms suggest a diagnosis to anyone?
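One way to gather more evidence for symptoms like these is to log the GPU state at every step and capture the full error when a GPU call fails. The following is only a minimal sketch under stated assumptions: the matrix solve is a placeholder for the real iterative solver, and the sizes, step count, and the variable `lastStep` are illustrative. A steadily shrinking amount of free memory would point toward a memory problem, while stable memory combined with a slowdown would point elsewhere.

```matlab
% Log free GPU memory each step and record the error if a GPU call fails.
% Assumption: the matrix solve below stands in for the real iterative solver.
lastStep = 0;                                   % last step that completed
try
    g = gpuDevice;                              % currently selected GPU
    A = gpuArray(rand(500) + 500*eye(500));     % placeholder system matrix
    b = gpuArray(rand(500, 1));                 % placeholder right-hand side
    for step = 1:20
        x = A \ b;                              % stand-in for the solver call
        wait(g);                                % block until the GPU has finished
        fprintf('step %d: %.0f MB free\n', step, g.AvailableMemory / 2^20);
        lastStep = step;
    end
catch err
    % Keep the identifier and message so they can be compared between runs.
    fprintf('GPU call failed after step %d: %s\n%s\n', ...
            lastStep, err.identifier, err.message);
end
```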
  4 Comments
Xubin Lin on 13 Jun 2020
Dear Joss,
I also have the same problem.
An error occurred during PTX compilation of <image>.
The information log was:
The error log was:
The CUDA error code was: CUDA_ERROR_ILLEGAL_ADDRESS.
My output of gpuDevice is as follows (MATLAB R2019a and CUDA 10.2):
Name: 'GeForce GTX 1060'
Index: 1
ComputeCapability: '6.1'
SupportsDouble: 1
DriverVersion: 11
ToolkitVersion: 10
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 6.4425e+09
MultiprocessorCount: 10
ClockRateKHz: 1670500
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
Swati Jain on 30 Mar 2023
Moved: Matt J on 30 Mar 2023
I'm facing this error while working in Deep Network Designer.
Please help me solve it.


Accepted Answer

Peter on 6 Jan 2017
Monitoring the GPU revealed that temperature is most probably causing the issue: performance slows down as the temperature rises, and the performance is capped by temperature.
The GPU crashed when it reached 95 degrees...
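The monitoring described above can also be scripted from within MATLAB, so the simulation backs off before the GPU overheats. This is a minimal sketch, not the method used in the original post. It assumes that `nvidia-smi` is on the system path (on some Windows installs it sits in the NVIDIA driver folder instead) and that only the first GPU matters. The helper names `readGpuTemperature` and `parseTemperature`, the 85 degree threshold, and the 10 second pause are all hypothetical choices for illustration.

```matlab
% Poll the GPU temperature between time steps and pause when it runs hot.
% Assumptions: nvidia-smi is on the PATH; only the first GPU is checked.
maxTempC = 85;   % illustrative limit, below the 95 degrees reported above
nSteps   = 5;    % placeholder for the real number of time steps

for step = 1:nSteps
    % ... time stepping and the GPU solver call would go here ...

    tempC = readGpuTemperature();
    while ~isnan(tempC) && tempC >= maxTempC
        fprintf('Step %d: GPU at %d degC, pausing to cool down.\n', step, tempC);
        pause(10);                        % give the fan time to catch up
        tempC = readGpuTemperature();
    end
end

function tempC = readGpuTemperature()
    % Query the GPU temperature in degC; returns NaN if the query fails.
    cmd = 'nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits';
    [status, out] = system(cmd);
    if status ~= 0
        tempC = NaN;
    else
        tempC = parseTemperature(out);
    end
end

function tempC = parseTemperature(out)
    % Convert the first line of the output to a number (NaN if not numeric).
    lines = strsplit(strtrim(out), newline);
    tempC = str2double(strtrim(lines{1}));
end
```

Pausing only treats the symptom. If the loop stalls often, better cooling is still needed.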
  2 Comments
Vaclav Bocek on 19 Apr 2018
Moved: Matt J on 30 Mar 2023
How did you solve it, please?
Peter on 23 Apr 2018
Moved: Matt J on 30 Mar 2023
I solved it by:
1) placing the GPU more sensibly in the PC case, allowing for better airflow;
2) changing the behavior of the cooling fan, which generally reacts only to CPU activity. I believe this can be set in the BIOS. I just made it blow a little harder.
This is all very machine-specific, so it will take some investigating on your part to try these options.


More Answers (1)

Matt J on 4 Jan 2017
I've had symptoms like that before. Re-installing/updating the GPU driver fixed it for me, but it was never clear to me what the root cause was.
  1 Comment
Peter on 5 Jan 2017
Thanks Matt. I did install the latest drivers (several times now), hoping that would solve the issue, but unfortunately without success.
