MATLAB Answers

gpuDevice Crashing Matlab

Asked by Greg
on 13 Apr 2012
Latest activity Answered by Yair Carmon on 13 Aug 2015
I'm trying to get up and running with GPU computing through the Parallel Computing Toolbox, but I'm having trouble getting the toolbox to work. When I run "gpuDevice", "gpuDeviceCount", or "gpuArray", Matlab crashes instantly, leaving only a "6573 Floating point exception" error in my shell window (the number changes every time). The crash leaves behind a "matlab_crash_dump" file, but the file is empty. Has anyone had this problem before and been able to find the cause?
I'm on a Linux machine with a Quadro 4000 GPU and NVIDIA's 295.20 drivers. I've had this problem since I got the toolbox a few months ago, but at the time I assumed it was because I was using an old, unsupported set of drivers. Those have been updated now, but I still get the same crash.
Thanks


2 Answers

Answer by Jason Ross
on 13 Apr 2012

What distro? What version of MATLAB? 64 or 32 bit?
If you run "nvidia-smi --query", do you get usable output? How does the device show up in the nvidia-settings application?
Is the Quadro being used for display and compute, or is it compute only?
FWIW, when I've seen odd problems like this, the cause has come down to a defective card. The typical setup is just to install the driver and start MATLAB, and then it works.
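
The checks asked for above can be gathered in one go. This is a minimal shell sketch, not something from the original thread: the function name gpu_env_report is made up for illustration, and it assumes a Linux system where /etc/os-release and /proc/driver/nvidia/version may or may not be present. It only reads information and does not start MATLAB.

```shell
#!/bin/sh
# gpu_env_report is a hypothetical helper name, not part of any NVIDIA or MATLAB tool.
gpu_env_report() {
    # Machine architecture, e.g. x86_64 on a 64-bit install.
    echo "arch: $(uname -m)"

    # Distro name; /etc/os-release exists on most modern distros, but not all.
    if [ -r /etc/os-release ]; then
        distro=$(sed -n 's/^PRETTY_NAME=//p' /etc/os-release | tr -d '"')
    else
        distro="unknown"
    fi
    echo "distro: $distro"

    # Version of the loaded NVIDIA kernel module, if the driver exposes it.
    if [ -r /proc/driver/nvidia/version ]; then
        cat /proc/driver/nvidia/version
    else
        echo "driver: no /proc/driver/nvidia/version (module not loaded?)"
    fi

    # Full device report; a non-zero status is printed instead of hidden.
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query || echo "nvidia-smi exited with status $?"
    else
        echo "nvidia-smi: not found on PATH"
    fi
}
```

Calling gpu_env_report and pasting the output into a reply answers the distro, 32/64-bit, and nvidia-smi questions together.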

  4 Comments

Jason Ross
on 13 Apr 2012
Could you try running it with a single monitor only?
Greg
on 13 Apr 2012
Tried that and got the same crash.
Jason Ross
on 13 Apr 2012
Huh. I'm rapidly running short on ideas.
Do you by chance have the CUDA toolkit / SDK installed? There is an example in there called deviceQuery. I'm wondering if it would give you a response, or crash?
Also, do you have the capability to put this card in another machine and/or use it in Windows? It would be interesting to see if the crash would follow it.
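
One way to tell whether deviceQuery (or nvidia-smi) actually crashes, rather than just reporting a failure, is to look at its exit status. A rough sketch, assuming bash or dash, where a process killed by a signal is reported as 128 plus the signal number; a "Floating point exception" is SIGFPE, signal 8 on Linux, so it would show up as 136. The helper name run_and_report is hypothetical.

```shell
#!/bin/sh
# run_and_report is a hypothetical helper: run a command, then say how it ended.
run_and_report() {
    status=0
    # Run the command exactly as given; capture a non-zero status without aborting.
    "$@" || status=$?

    if [ "$status" -gt 128 ]; then
        # Assumption: bash/dash convention of 128 + signal number.
        # A program can also return such a value on its own, so treat this as a hint.
        echo "killed by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
}
```

For example, run_and_report nvidia-smi, or run_and_report /path/to/deviceQuery with wherever the SDK sample was built (that path is a placeholder).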



Answer by Yair Carmon on 13 Aug 2015

I had a similar issue on a remote server running Ubuntu 12.04, Matlab 2015a, CUDA 7.0, and a GeForce GTX 960. During a routine run of my application, the nvidia-smi utility (which I had running under "watch nvidia-smi" to monitor GPU utilization) suddenly printed "Error" in place of readings such as temperature and available memory. A complete system crash followed immediately, and the machine had to be power cycled before it responded to ping again.
When the system came back online I had the problems reported above: any attempt to run nvidia-smi or gpuDevice/gpuArray would result in a crash. It was not a problem with the card: we swapped GPUs and the issue persisted. Uninstalling and reinstalling the CUDA toolkit using apt-get did not help either. The problem was finally resolved by reinstalling the entire OS, Matlab, and CUDA 7.0, in that order. I suspect that using the CUDA 7.0 .run installer might have solved the problem without having to reinstall the OS. I hope to never have a chance to check that :).
