After rebooting my Linux system (Ubuntu 18.04), MATLAB (R2017a, R2019a) no longer finds my GPU, although it (and CUDA) is still there

Dear Sir or Madam,
after rebooting my Linux system today, MATLAB (R2017a, R2019a) no longer finds my GPU.
The error message is:
gpuDevice
Error using gpuDevice (line 26)
No supported GPU device was found on this computer. To learn more about
supported GPU devices, see www.mathworks.com/gpudevice
However, I have not changed anything in the hardware or software. I have also reinstalled CUDA, but it still does not work.
Is this a known issue?
How can I solve it?
I usually work on that machine remotely (via nx-server or via terminal). Neither variant works.
Many thanks in advance for any ideas.
With kind regards

Accepted Answer

Jason Ross on 22 Jan 2020
Note that when you say "install CUDA", it's not clear whether you mean you installed the CUDA Toolkit/SDK or the driver. MATLAB only needs the driver installed and running to use the GPU; the toolkit/SDK is not necessary. So I'll assume that something is wrong with your driver.
  • Run "nvidia-smi" in a command line and see if the GPU is reported correctly. It should look something like this:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:04:00.0 Off |                    0 |
| N/A   37C    P0    57W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 00000000:05:00.0 Off |                    0 |
| N/A   43C    P0    70W / 149W |      0MiB / 11441MiB |     51%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
  • If nvidia-smi isn't working, the driver is not running at all. You should try reinstalling the latest driver, and also check that your system is finding the GPU. To see if the system sees the GPU, use
% lspci | grep -i nvidia
04:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
05:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
  • If the GPU shows up in lspci but nvidia-smi doesn't find anything, your driver is not working at all -- reinstall it.
  • If it's working, check the permissions on your GPU device nodes; they should look something like the following. I've seen cases where the GPU is only accessible to root.
% ls -l /dev/nv*
crw-rw-rw- 1 root root 195, 0 Jan 16 16:17 /dev/nvidia0
crw-rw-rw- 1 root root 195, 1 Jan 16 16:17 /dev/nvidia1
crw-rw-rw- 1 root root 195, 255 Jan 16 16:17 /dev/nvidiactl
crw-rw-rw- 1 root root 243, 0 Jan 16 16:17 /dev/nvidia-uvm
crw-rw-rw- 1 root root 243, 1 Jan 16 16:17 /dev/nvidia-uvm-tools
  • If the permissions are wrong, on some driver versions you can change NVreg_DeviceFileMode=0660 to NVreg_DeviceFileMode=0666. The conf file to edit is in /etc/modprobe.d, and its name generally contains "nvidia", for example 50-nvidia.conf. Note that this setting may be deprecated; I haven't seen it in a while.
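The checks in the list above can be strung together roughly as follows. This is only a sketch: the function names are made up for illustration, it assumes GNU stat (as shipped on Ubuntu), and it looks at /dev/nvidia* rather than /dev/nv* so that unrelated nodes such as NVMe disks are not reported.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of any NVIDIA or MATLAB tool.

# Succeeds if lspci lists at least one NVIDIA device.
gpu_on_bus() {
    lspci 2>/dev/null | grep -qi nvidia
}

# Succeeds if nvidia-smi exists and can talk to the driver.
driver_responds() {
    command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1
}

# Prints each existing path whose mode lacks read/write for "other",
# i.e. anything stricter than the crw-rw-rw- listing shown above.
nodes_without_world_rw() {
    for node in "$@"; do
        [ -e "$node" ] || continue
        mode=$(stat -c %a "$node") || continue
        case "$mode" in
            *6|*7) ;;                        # last octal digit grants read+write
            *) printf '%s\n' "$node" ;;
        esac
    done
}

# Walks the checks in order and prints the first problem found.
triage() {
    if ! gpu_on_bus; then
        echo "lspci shows no NVIDIA device"
    elif ! driver_responds; then
        echo "GPU is on the bus but nvidia-smi fails: reinstall the driver"
    else
        bad=$(nodes_without_world_rw /dev/nvidia*)
        if [ -n "$bad" ]; then
            echo "driver runs, but these nodes are not world read/write:"
            echo "$bad"
        else
            echo "driver and device permissions look fine"
        fi
    fi
}
```

Source the file and run `triage` as the user who starts MATLAB. A node with mode 0660 can still be usable if that user belongs to the owning group, so treat the permission report as a hint rather than a verdict. As far as I know, the setting normally sits on a line of the form `options nvidia NVreg_DeviceFileMode=0666`, but check the README of your driver version before relying on that.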
Check and see if the nvidia module is loaded, and that "nouveau" is not.
% lsmod | grep nvidia
nvidia_uvm 917504 0
nvidia_drm 45056 2
nvidia_modeset 1110016 1 nvidia_drm
nvidia 19894272 2 nvidia_modeset,nvidia_uvm
drm_kms_helper 155648 2 mgag200,nvidia_drm
drm 360448 10 mgag200,ttm,nvidia_drm,drm_kms_helper
ipmi_msghandler 49152 2 nvidia,ipmi_si
% lsmod | grep nouveau
(there should be no output)
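Note that `grep nvidia` also matches modules that merely list nvidia in their "used by" column (drm and ipmi_msghandler in the output above). A small sketch that compares the module name exactly is shown below; the function name `classify_modules` is hypothetical.

```shell
#!/bin/sh
# Reads lsmod-formatted text on stdin and reports which driver module is loaded.
# Only the first column (the module name) is compared, and it must match exactly.
classify_modules() {
    awk '
        $1 == "nvidia"  { nvidia = 1 }
        $1 == "nouveau" { nouveau = 1 }
        END {
            if (nouveau)     print "nouveau loaded: blacklist it"
            else if (nvidia) print "nvidia loaded, nouveau absent"
            else             print "neither module loaded"
        }'
}
```

On a live system this would be called as `lsmod | classify_modules`.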
If nouveau is loaded, you need to blacklist it by adding a file in /etc/modprobe.d that contains something like the lines below. The driver installation process usually does this (I have a file called "nouveau-blacklist.conf"), but that file might have been removed for some reason. You may not need the lbm-nouveau line, depending on what you are running.
blacklist nouveau
blacklist lbm-nouveau
A reboot is required for the blacklisting to take effect.
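A minimal sketch of creating and verifying the blacklist file follows. The function names are hypothetical, the default path reuses the file name mentioned above, and the optional second blacklist line is left out.

```shell
#!/bin/sh
# Writes a one-line nouveau blacklist file.
# $1 is the target path; pass a scratch path for a dry run.
write_nouveau_blacklist() {
    target=${1:-/etc/modprobe.d/nouveau-blacklist.conf}
    printf 'blacklist nouveau\n' > "$target"
}

# Succeeds if any *.conf file in the given directory blacklists nouveau.
nouveau_blacklisted() {
    dir=${1:-/etc/modprobe.d}
    grep -qsE '^[[:space:]]*blacklist[[:space:]]+nouveau[[:space:]]*$' "$dir"/*.conf
}
```

Writing under /etc/modprobe.d needs root, so run the shell that sources this as root or adapt the redirect. On Ubuntu it is commonly recommended to run `sudo update-initramfs -u` before rebooting so that the early boot image also picks up the blacklist; that step is an addition here, not something the answer above states.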
  4 Comments
Wouter Verstraelen on 26 Jul 2021
Hi, does the NVIDIA driver include CUDA by default?
I tried downloading it from NVIDIA's website, but the site suggested using the platform-specific driver from the 'Additional Drivers' tab of Software & Updates. On Ubuntu 20.04 that tab says "Using NVIDIA driver metapackage from nvidia-driver-470 (proprietary)". Still, while the GPU is found
(01:00.0 VGA compatible controller: NVIDIA Corporation GP107GL [Quadro P400] (rev a1))
nvidia-smi reports:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Any idea what could be the problem?
Jason Ross on 27 Jul 2021
@Wouter Verstraelen -- the NVIDIA driver package includes the CUDA driver as part of the installation. If you are using the OS packages, I would expect the blacklisting and other steps to be done for you. You might be able to find some clues as to what's going on in /var/log -- I've used tools like grep to look for anything nvidia related.
You could also try removing the package and then re-running it -- perhaps there was another package that removed something important, or has some kind of conflict?
I've also found many solutions to odd driver problems by Googling or in NVIDIA's various forums. Ubuntu is a popular distro, so I would expect that someone else has seen a similar issue.
If nvidia-smi is failing, though, there's no way MATLAB will find the GPU.
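The log search mentioned above could look roughly like this. The function name is hypothetical, and searching for "NVRM" in addition to "nvidia" is an assumption based on the prefix the NVIDIA kernel module uses in kernel messages.

```shell
#!/bin/sh
# Prints the last N matching lines (default 20) from text files under a log
# directory (default /var/log). Matching is case-insensitive; binary and
# unreadable files are skipped silently.
recent_driver_messages() {
    logdir=${1:-/var/log}
    count=${2:-20}
    grep -rIis -E 'nvidia|nvrm' "$logdir" | tail -n "$count"
}
```

Many files under /var/log are readable only by root or the adm group, so run this from a suitably privileged shell, for example `recent_driver_messages /var/log 50`.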
