gpuDevice

Query or select a GPU device

Description

A GPUDevice object represents a graphics processing unit (GPU) in your computer. You can use the GPU to run MATLAB® code that supports gpuArray variables or to execute CUDA kernels using CUDAKernel objects.

You can use a GPUDevice object to inspect the properties of your GPU device, reset the GPU device, or wait for your GPU to finish executing a computation. To obtain a GPUDevice object, use the gpuDevice function. You can also select or deselect your GPU device using the gpuDevice function. If you have access to multiple GPUs, use the gpuDevice function to choose a specific GPU device on which to execute your code.

You do not need to use a GPUDevice object to run functions on a GPU. For more information on how to use GPU-enabled functions, see Run MATLAB Functions on a GPU.
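As a rough illustration of inspecting a device and waiting for it to finish, the following sketch times a matrix multiplication on the GPU. The matrix size and variable names are arbitrary assumptions, and the code is guarded so that it only touches the GPU when a supported device is available. For careful benchmarking, the gputimeit function is usually a better choice than tic and toc.

```matlab
% Sketch only: time one GPU computation using wait(D).
hasGPU = gpuDeviceCount("available") > 0;

if hasGPU
    D = gpuDevice;                  % currently selected (or default) device
    A = rand(2000,"gpuArray");      % hypothetical workload created on the GPU

    tic
    B = A*A;                        % GPU work can return before it finishes
    wait(D)                         % block until the GPU has finished
    elapsed = toc;

    fprintf("Matrix multiply took %.4f s on %s\n",elapsed,D.Name)
else
    disp("No supported GPU device is available.")
end
```

Without the call to wait, the measured time might cover only the launch of the computation rather than its completion.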

Creation

Description

gpuDevice displays the properties of the currently selected GPU device. If there is no currently selected device, gpuDevice selects the default device without clearing it. Use this syntax when you want to inspect the properties of your GPU device.

D = gpuDevice returns a GPUDevice object representing the currently selected device. If there is no currently selected device, gpuDevice selects the default device and returns a GPUDevice object representing that device without clearing it.

D = gpuDevice(indx) selects the GPU device specified by index indx. If the specified GPU device is not supported, an error occurs. This syntax resets the specified device and clears its memory, even if the device is already selected (equivalent to calling the reset function). Any workspace variables that represent gpuArray or CUDAKernel objects become invalid, and you must clear them from the workspace or redefine them.

gpuDevice([]), with an empty argument (as opposed to no argument), deselects the GPU device and clears its memory of gpuArray and CUDAKernel variables. This syntax leaves no GPU device selected as the current device.
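The difference between these syntaxes can be sketched as follows. This is an illustrative script, not output from a real session: the variable names are assumptions, and the code runs only when a supported GPU is available. It uses existsOnGPU to check whether a gpuArray still refers to data on the device and gpuDeviceTable to check the selection state without selecting a device.

```matlab
% Sketch only: effect of gpuDevice(indx) and gpuDevice([]) on GPU data.
hasGPU = gpuDeviceCount("available") > 0;

if hasGPU
    D = gpuDevice;                    % selects the default device if none is selected
    A = gpuArray(magic(4));           % data stored on the GPU
    validBefore = existsOnGPU(A);     % expected to be true at this point

    gpuDevice(D.Index);               % reselecting resets the device and clears its memory
    validAfter = existsOnGPU(A);      % expected to be false: A is now invalid
    clear A                           % remove the invalid variable from the workspace

    gpuDevice([]);                    % deselect the device
    tbl = gpuDeviceTable;             % query devices without selecting one
    anySelected = any(tbl.DeviceSelected);   % expected to be false
end
```

Because gpuDevice(indx) always resets the device, gather any results you need into host memory before calling it.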

Input Arguments

indx

Index of the GPU device, specified as an integer in the range 1 to gpuDeviceCount.

Example: gpuDevice(1);

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Properties

Name

This property is read-only.

Name of the GPU device, specified as a character array. The name assigned to the device is derived from the GPU device model.

Index

This property is read-only.

Index of the GPU device, specified as an integer in the range 1 to gpuDeviceCount. Use this index to select a particular GPU device.

ComputeCapability

This property is read-only.

Computational capability of the GPU device, specified as a character array. To use the selected GPU device in MATLAB, ComputeCapability must meet the required specification in GPU Computing Requirements.

SupportsDouble

This property is read-only.

Flag for support for double precision operations, specified as the logical values 0 for false or 1 for true.

GraphicsDriverVersion

Since R2023a

This property is read-only.

Graphics driver version currently in use by the GPU device, specified as a character array.

Download the latest graphics driver for your GPU at NVIDIA Driver Downloads.

DriverModel

Since R2023a

This property is read-only.

Operating model of the graphics driver, specified as one of these values:

  • 'WDDM' – Use the display operating model.

  • 'TCC' – Use the compute operating model. 'TCC' disables Windows® graphics and can improve the performance of large scale calculations.

  • 'N/A' – 'WDDM' and 'TCC' are only available on Windows. On other operating systems, the driver model is 'N/A'.

For more information about changing models and which GPU devices support 'TCC', see the NVIDIA® documentation.

ToolkitVersion

This property is read-only.

CUDA toolkit version used by the current release of MATLAB, specified as a scalar value.

MaxThreadsPerBlock

This property is read-only.

Maximum supported number of threads per block during CUDAKernel execution, specified as a scalar value.

Example: 1024

MaxShmemPerBlock

This property is read-only.

Maximum supported amount of shared memory that a thread block can use during CUDAKernel execution, specified as a scalar value.

Example: 49152

MaxThreadBlockSize

This property is read-only.

Maximum size in each dimension for a thread block, specified as a vector. Each dimension of a thread block must not exceed the corresponding value. Also, the product of the thread block dimensions must not exceed MaxThreadsPerBlock.
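The two constraints on a thread block can be checked with simple arithmetic. The following sketch uses the limits shown in the example device displays on this page as stand-in values. The candidate block sizes are hypothetical, and on a real system you would read the limits from the MaxThreadBlockSize and MaxThreadsPerBlock properties of your GPUDevice object instead.

```matlab
% Sketch only: validate candidate thread block sizes against device limits.
% Stand-in limits taken from the example displays on this page; on a real
% system use D.MaxThreadBlockSize and D.MaxThreadsPerBlock.
maxThreadBlockSize = [1024 1024 64];
maxThreadsPerBlock = 1024;

blockA = [32 16 2];    % hypothetical candidate: 32*16*2 = 1024 threads
blockB = [64 32 1];    % hypothetical candidate: 64*32*1 = 2048 threads

% A block is valid only if every dimension is within its limit AND the
% total number of threads does not exceed MaxThreadsPerBlock.
isValidA = all(blockA <= maxThreadBlockSize) && prod(blockA) <= maxThreadsPerBlock;
isValidB = all(blockB <= maxThreadBlockSize) && prod(blockB) <= maxThreadsPerBlock;

fprintf("blockA valid: %d, blockB valid: %d\n",isValidA,isValidB)
```

Here blockB fits within each per-dimension limit but fails the total-thread limit, which shows why both checks are needed.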

MaxGridSize

This property is read-only.

Maximum size of grid of thread blocks, specified as a vector.

SIMDWidth

This property is read-only.

Number of simultaneously executing threads, specified as a scalar value.

TotalMemory

This property is read-only.

Total memory (in bytes) on the device, specified as a scalar value.

AvailableMemory

This property is read-only.

Total memory (in bytes) available for data, specified as a scalar value. This property is available only for the currently selected device. This value can differ from the value reported by the NVIDIA System Management Interface due to memory caching.
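One common use of this property is a rough check that an array fits on the device before you create it. The sketch below assumes a hypothetical 10000-by-10000 double array and runs the GPU part only when a supported device is available. The estimate covers the array itself only; operations on the array typically need additional working memory.

```matlab
% Sketch only: compare a rough size estimate with AvailableMemory.
sz = [10000 10000];                 % hypothetical array size
bytesNeeded = prod(sz)*8;           % a double element occupies 8 bytes

hasGPU = gpuDeviceCount("available") > 0;

if hasGPU
    D = gpuDevice;                  % AvailableMemory applies to the selected device
    if bytesNeeded < D.AvailableMemory
        A = zeros(sz,"gpuArray");   % create the array directly on the GPU
    else
        warning("Array needs %g bytes, but only %g bytes are available.", ...
            bytesNeeded,D.AvailableMemory)
    end
end
```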

CachePolicy

Since R2023a

Caching policy of the GPU device, specified as 'balanced', 'minimum', or 'maximum'. The caching policy determines how much GPU memory can be cached to accelerate computation.

  • 'minimum' – The amount of memory that can be cached on the GPU device is minimal.

  • 'balanced' – The amount of memory that can be cached on the GPU device is balanced. This policy provides a balance between GPU memory usage and computational performance.

  • 'maximum' – The amount of memory that can be cached on the GPU device is limited only by the total memory of the device.

The default value is 'balanced' for devices in 'Default' or 'Prohibited' compute mode and 'maximum' for devices in 'Exclusive process' compute mode. For more information on the compute mode property, see ComputeMode.

Note

  • Resetting the device using reset, clearing the device using gpuDevice([]), or selecting another device using gpuDevice resets the caching policy to the default policy.

  • Saving and loading a MAT file containing a GPUDevice object does not preserve the caching policy.

  • You cannot set the caching policy of a device that is not selected. For example, after storing a first GPUDevice object in an array and selecting another device, you cannot set the caching policy of the first GPUDevice object.

MultiprocessorCount

This property is read-only.

The number of streaming multiprocessors present on the device, specified as a scalar value.

ClockRateKHz

This property is read-only.

Peak clock rate of the GPU in kHz, specified as a scalar value.

ComputeMode

This property is read-only.

The compute mode of the device, specified as one of the following values.

  • 'Default' – The device is not restricted, and multiple applications can use it simultaneously. MATLAB can share the device with other applications, including other MATLAB sessions or workers.

  • 'Exclusive process' – Only one application at a time can use the device. While the device is selected in MATLAB, other applications cannot use it, including other MATLAB sessions or workers.

  • 'Prohibited' – The device cannot be used.

For more information on changing the compute mode of your GPU device, consult the NVIDIA documentation.

GPUOverlapsTransfers

This property is read-only.

Flag for support for overlapped transfers, specified as the logical values 0 or 1.

KernelExecutionTimeout

This property is read-only.

Flag for timeout for long-running kernels, specified as the logical values 0 or 1. If 1, the operating system places an upper bound on the time allowed for the CUDA kernel to execute. After this time, the CUDA driver times out the kernel and returns an error.

CanMapHostMemory

This property is read-only.

Flag for support for mapping host memory into the CUDA address space, specified as the logical values 0 or 1.

DeviceSupported

This property is read-only.

Flag for supported device, specified by the logical values 0 or 1. Not all devices are supported; for example, devices with insufficient ComputeCapability.

DeviceAvailable

This property is read-only.

Flag for available device, specified by the logical values 0 or 1. This property indicates whether the device is available for use in the current MATLAB session. Unsupported devices with a DeviceSupported property of 0 are always unavailable. A device can also be unavailable if its ComputeMode property is set to 'Exclusive thread', 'Exclusive process', or 'Prohibited'.

DeviceSelected

This property is read-only.

Flag for currently selected device, specified by the logical values 0 or 1.

UUID

Since R2024a

This property is read-only.

Universally unique identifier, specified as a character array.

Each UUID typically starts with 'GPU-' followed by a 36-character hexadecimal sequence. You can use the UUID to distinguish otherwise identical GPUs.

The software does not display this property when you call gpuDevice. Query this property using dot notation or using gpuDeviceTable.

Data Types: char

Object Functions

reset – Reset GPU device and clear its memory
wait (GPUDevice) – Wait for GPU calculation to complete

The following functions are also available:

parallel.gpu.GPUDevice.isAvailable(indx) – Returns logical 1 (true) if the GPU specified by index indx is supported and capable of being selected. indx can be an integer or a vector of integers; the default index is the current device.
parallel.gpu.GPUDevice.getDevice(indx) – Returns a GPUDevice object without selecting it.
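These two functions can be combined to inspect every detected device without selecting or resetting any of them. The following is a minimal sketch; the loop variable and message text are assumptions, and the loop body does not run on a system with no detected GPUs.

```matlab
% Sketch only: inspect all detected devices without selecting them.
numDevices = gpuDeviceCount;        % number of detected GPU devices

for idx = 1:numDevices
    if parallel.gpu.GPUDevice.isAvailable(idx)
        % getDevice returns the object without selecting the device, so
        % existing gpuArray data on the current device is left intact.
        D = parallel.gpu.GPUDevice.getDevice(idx);
        fprintf("Device %d (%s) has compute capability %s\n", ...
            D.Index,D.Name,D.ComputeCapability)
    else
        fprintf("Device %d is not available\n",idx)
    end
end
```

By contrast, calling gpuDevice(idx) in such a loop selects each device in turn, which resets it and clears its memory.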

For a complete list of functions, use the methods function on the GPUDevice object:

methods('parallel.gpu.GPUDevice')

You can get help on any of the object functions with the following command:

help parallel.gpu.GPUDevice.functionname

where functionname is the name of the function. For example, to get help on isAvailable, type:

help parallel.gpu.GPUDevice.isAvailable

Examples

This example shows how to use gpuDevice to identify and select which device you want to use.

To determine how many GPU devices are available in your computer, use the gpuDeviceCount function.

gpuDeviceCount("available")
ans = 2

When there are multiple devices, the first is the default. You can examine its properties with the gpuDeviceTable function to determine if that is the one you want to use.

gpuDeviceTable
ans=2×5 table
    Index           Name           ComputeCapability    DeviceAvailable    DeviceSelected
    _____    __________________    _________________    _______________    ______________

      1      "NVIDIA RTX A5000"          "8.6"               true              false     
      2      "Quadro P620"               "6.1"               true              true      

If the first device is the device you want to use, you can proceed. To run computations on the GPU, use gpuArray-enabled functions. For more information, see Run MATLAB Functions on a GPU.

To use another device, call gpuDevice with the index of the other device.

gpuDevice(2)
ans = 
  CUDADevice with properties:

                      Name: 'Quadro P620'
                     Index: 2
         ComputeCapability: '6.1'
            SupportsDouble: 1
     GraphicsDriverVersion: '511.79'
               DriverModel: 'WDDM'
            ToolkitVersion: 11.2000
        MaxThreadsPerBlock: 1024
          MaxShmemPerBlock: 49152 (49.15 KB)
        MaxThreadBlockSize: [1024 1024 64]
               MaxGridSize: [2.1475e+09 65535 65535]
                 SIMDWidth: 32
               TotalMemory: 2147287040 (2.15 GB)
           AvailableMemory: 1615209678 (1.62 GB)
               CachePolicy: 'balanced'
       MultiprocessorCount: 4
              ClockRateKHz: 0
               ComputeMode: 'Default'
      GPUOverlapsTransfers: 1
    KernelExecutionTimeout: 1
          CanMapHostMemory: 1
           DeviceSupported: 1
           DeviceAvailable: 1
            DeviceSelected: 1

Alternatively, you can determine how many GPU devices are available, inspect some of their properties, and select a device to use from the MATLAB® desktop. On the Home tab, in the Environment area, select Parallel > Select GPU Environment.

Create an object representing the default GPU device and query its compute capability.

D = gpuDevice;
D.ComputeCapability
ans = 
'8.6'

Query the compute capabilities of all available GPU devices.

for idx = 1:gpuDeviceCount
    D = gpuDevice(idx);
    fprintf(1,"Device %i has ComputeCapability %s \n", ...
        D.Index,D.ComputeCapability)
end
Device 1 has ComputeCapability 8.6 
Device 2 has ComputeCapability 6.1 

Compare the compute capabilities and availability of the GPU devices in your system using gpuDeviceTable.

gpuDeviceTable
ans=2×5 table
    Index           Name           ComputeCapability    DeviceAvailable    DeviceSelected
    _____    __________________    _________________    _______________    ______________

      1      "NVIDIA RTX A5000"          "8.6"               true              false     
      2      "Quadro P620"               "6.1"               true              true      

Change the caching policy of your GPU.

Create an object representing the default GPU device.

D = gpuDevice
D = 
  CUDADevice with properties:

                      Name: 'NVIDIA RTX A5000'
                     Index: 1
         ComputeCapability: '8.6'
            SupportsDouble: 1
     GraphicsDriverVersion: '511.79'
               DriverModel: 'TCC'
            ToolkitVersion: 11.2000
        MaxThreadsPerBlock: 1024
          MaxShmemPerBlock: 49152 (49.15 KB)
        MaxThreadBlockSize: [1024 1024 64]
               MaxGridSize: [2.1475e+09 65535 65535]
                 SIMDWidth: 32
               TotalMemory: 25553076224 (25.55 GB)
           AvailableMemory: 25145376768 (25.15 GB)
               CachePolicy: 'balanced'
       MultiprocessorCount: 64
              ClockRateKHz: 0
               ComputeMode: 'Default'
      GPUOverlapsTransfers: 1
    KernelExecutionTimeout: 0
          CanMapHostMemory: 1
           DeviceSupported: 1
           DeviceAvailable: 1
            DeviceSelected: 1

Access the CachePolicy property of the GPU device.

D.CachePolicy
ans = 
'balanced'

Change the caching policy to allow the GPU to cache the maximum amount of memory for accelerating computation.

D.CachePolicy = "maximum";
D.CachePolicy
ans = 
'maximum'

Reset the caching policy to the default policy by setting the property to [].

D.CachePolicy = [];

Calling reset(D) or selecting another device with gpuDevice also resets the caching policy to its default value.

If you have access to several GPUs, you can perform your calculations on multiple GPUs in parallel using a parallel pool.

To determine the number of GPUs that are available for use in MATLAB, use the gpuDeviceCount function.

availableGPUs = gpuDeviceCount("available")
availableGPUs = 3

Start a parallel pool with as many workers as available GPUs. For best performance, MATLAB assigns a different GPU to each worker by default.

parpool("Processes",availableGPUs);
Starting parallel pool (parpool) using the 'Processes' profile ...
Connected to the parallel pool (number of workers: 3).

To identify which GPU each worker is using, call gpuDevice inside an spmd block. The spmd block runs gpuDevice on every worker.

spmd
    gpuDevice
end

Use parallel language features, such as parfor or parfeval, to distribute your computations to workers in the parallel pool. If you use gpuArray-enabled functions in your computations, these functions run on the GPU of the worker. For more information, see Run MATLAB Functions on a GPU. For an example, see Run MATLAB Functions on Multiple GPUs.

When you are done with your computations, shut down the parallel pool. You can use the gcp function to obtain the current parallel pool.

delete(gcp("nocreate"));

If you want to use a different choice of GPUs, then you can use gpuDevice to select a particular GPU on each worker, using the GPU device index. You can obtain the index of each GPU device in your system using the gpuDeviceCount function.

Suppose you have three GPUs available in your system, but you want to use only two for a computation. Obtain the indices of the devices.

[availableGPUs,gpuIndx] = gpuDeviceCount("available")
availableGPUs = 3
gpuIndx = 1×3

     1     2     3

Define the indices of the devices you want to use.

useGPUs = [1 3];

Start your parallel pool. Use an spmd block and gpuDevice to associate each worker with one of the GPUs you want to use, using the device index. The spmdIndex function identifies the index of each worker.

parpool("Processes",numel(useGPUs));
Starting parallel pool (parpool) using the 'Processes' profile ...
Connected to the parallel pool (number of workers: 2).
spmd
    gpuDevice(useGPUs(spmdIndex));
end

As a best practice, and for best performance, assign a different GPU to each worker.

When you are done with your computations, shut down the parallel pool.

delete(gcp("nocreate"));

Extended Capabilities

Version History

Introduced in R2010b
