
# classificationLayer

Classification output layer

## Syntax

`layer = classificationLayer`

`layer = classificationLayer(Name,Value)`

## Description

A classification layer computes the cross entropy loss for multi-class classification problems with mutually exclusive classes.

The layer infers the number of classes from the output size of the previous layer. For example, to specify the number of classes K of the network, include a fully connected layer with output size K and a softmax layer before the classification layer.

`layer = classificationLayer` creates a classification layer.


`layer = classificationLayer(Name,Value)` sets the optional `Name` and `Classes` properties using name-value pairs. For example, `classificationLayer('Name','output')` creates a classification layer with the name `'output'`. Enclose each property name in single quotes.

## Examples


Create a classification layer with the name `'output'`.

`layer = classificationLayer('Name','output')`
```
layer = 
  ClassificationOutputLayer with properties:

          Name: 'output'
       Classes: 'auto'
    OutputSize: 'auto'

   Hyperparameters
    LossFunction: 'crossentropyex'
```

Include a classification output layer in a `Layer` array.

```
layers = [ ...
    imageInputLayer([28 28 1])
    convolution2dLayer(5,20)
    reluLayer
    maxPooling2dLayer(2,'Stride',2)
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer]
```
```
layers = 
  7x1 Layer array with layers:

     1   ''   Image Input             28x28x1 images with 'zerocenter' normalization
     2   ''   Convolution             20 5x5 convolutions with stride [1 1] and padding [0 0 0 0]
     3   ''   ReLU                    ReLU
     4   ''   Max Pooling             2x2 max pooling with stride [2 2] and padding [0 0 0 0]
     5   ''   Fully Connected         10 fully connected layer
     6   ''   Softmax                 softmax
     7   ''   Classification Output   crossentropyex
```

## Input Arguments


### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: `classificationLayer('Name','output')` creates a classification layer with the name `'output'`

#### `Name` — Layer name

Layer name, specified as a character vector or a string scalar. To include a layer in a layer graph, you must specify a nonempty unique layer name. If you train a series network with the layer and `Name` is set to `''` (the default), then the software automatically assigns a name to the layer at training time.

Data Types: `char` | `string`

#### `Classes` — Classes of the output layer

Classes of the output layer, specified as a categorical vector, string array, cell array of character vectors, or `'auto'`. If `Classes` is `'auto'` (the default), then the software automatically sets the classes at training time. If you specify a string array or cell array of character vectors `str`, then the software sets the classes of the output layer to `categorical(str,str)`.

Data Types: `char` | `categorical` | `string` | `cell`
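For instance, the class order can be fixed up front rather than left as `'auto'` (a minimal sketch; the class names here are illustrative, not part of the reference page):

```matlab
% Specify the output classes explicitly instead of 'auto'.
% The class names "cat", "dog", and "bird" are illustrative.
layer = classificationLayer('Name','output','Classes',["cat" "dog" "bird"]);
layer.Classes   % categorical column vector: cat, dog, bird
```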

## Output Arguments


### `layer` — Classification layer

Classification layer, returned as a `ClassificationOutputLayer` object.

For information on concatenating layers to construct convolutional neural network architecture, see `Layer`.

## More About

### Classification Layer

A classification layer computes the cross entropy loss for multi-class classification problems with mutually exclusive classes.

For typical classification networks, the classification layer must follow the softmax layer. In the classification layer, `trainNetwork` takes the values from the softmax function and assigns each input to one of the K mutually exclusive classes using the cross entropy function for a 1-of-K coding scheme [1]:

$$\mathrm{loss} = -\sum_{i=1}^{N}\sum_{j=1}^{K} t_{ij} \ln y_{ij},$$

where $N$ is the number of samples, $K$ is the number of classes, $t_{ij}$ is the indicator that the $i$th sample belongs to the $j$th class, and $y_{ij}$ is the output for sample $i$ for class $j$, which in this case is the value from the softmax function. That is, it is the probability that the network associates the $i$th input with class $j$.
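The formula can be checked by hand on a toy batch; the following is a minimal sketch with $N = 2$ samples and $K = 3$ classes (the probabilities are illustrative):

```matlab
% Cross entropy for a 1-of-K coding scheme, computed directly.
Y = [0.7 0.2 0.1; ...   % softmax outputs y_ij, one row per sample
     0.1 0.3 0.6];
T = [1 0 0; ...         % one-hot targets t_ij
     0 0 1];
loss = -sum(T(:) .* log(Y(:)))   % -(ln 0.7 + ln 0.6), approximately 0.8675
```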

## References

[1] Bishop, C. M. *Pattern Recognition and Machine Learning*. New York, NY: Springer, 2006.