How to change activation function for fully connected layer in convolutional neural network?

I'm in the process of implementing a wavelet neural network (WNN) using the SeriesNetwork class of Neural Network Toolbox v7. While executing a simple network line by line, I can clearly see where the fully connected layer multiplies the inputs by the appropriate weights and adds the bias; however, as best I can tell, no additional calculation is performed for the activations of the fully connected layer. It was my general understanding that standard perceptrons always have an activation/transfer function, and I was fully expecting to see the familiar sigmoid. Instead, it appears that the fully connected layer, as implemented here, uses the identity operation as the transfer function (or, equivalently, no transfer function at all).
1) Do fully connected layers use an activation function, or are the outputs simply the weighted sums of the inputs plus the bias? My initial assumption is no, since I see activations greater than +1 (see the example code at the bottom).
2) If an activation function is used, does anyone have any suggestions on where I might find and/or alter the source? I have examined the FullyConnected class and definition files, as well as the FullyConnectedGPUStrategy and FullyConnectedHostStrategy classes; the latter contain the actual multiplication by the weights and addition of the bias.
3) If I want to use a custom activation function (in this case a wavelet), is it safe to simply apply that transfer function after the weighting and addition of the bias? For example, if I wanted to modify a FullyConnectedLayer to have a tanh activation function, could I simply alter the forward method as follows for the forward pass? (Obviously, changes to the backward pass and gradient computation would also be required for a full implementation.)
classdef FullyConnectedGPUStrategy < nnet.internal.cnn.layer.util.ExecutionStrategy
...
    function [Z, memory] = forward(~, X, weights, bias)
        Z = iForwardConvolveOrMultiply(X, weights);
        Z = Z + bias;
        Z = tanh(Z); % apply the activation after the affine map
        memory = [];
    end
Example code to illustrate the problem:
%Generate training data
[XTrain, YTrain] = digitTrain4DArrayData;
%Define layers
layers = [ ...
    imageInputLayer([28 28 1])
    fullyConnectedLayer(10)
    softmaxLayer()
    classificationLayer()];
%Train network using stochastic gradient descent with momentum
options = trainingOptions('sgdm');
net = trainNetwork(XTrain, YTrain, layers, options);
%View activations of fully connected layer
%Note: When testing this I see activations greater than +1 and
%less than 0, so it can't be using tanh or sigmoid
activations(net,XTrain(:,:,:,1),2)
Note: The reason I chose to use the Series Network class used for CNNs as opposed to the generic Neural Network class is because the output of the WNN will need to act as the input to a CNN which will then be trained together as one unit.
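One way to confirm the identity behaviour directly is to compare the reported activations against a hand-computed affine map. A minimal sketch, assuming the net trained above and R2017a-era property names (AverageImage holds the input layer's zero-center offset; later releases call it Mean, and the output shape of activations may vary by release):
x = double(XTrain(:,:,:,1));
x = x - net.Layers(1).AverageImage;       % undo the input layer's default zero-center normalization
a = activations(net, XTrain(:,:,:,1), 2); % toolbox output of the fully connected layer
W = net.Layers(2).Weights;                % 10-by-784 weight matrix
b = net.Layers(2).Bias;                   % 10-by-1 bias
manual = W * x(:) + b;                    % plain affine map, no nonlinearity
max(abs(a(:) - manual(:)))                % ~0 confirms no activation is applied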

1 Comment

Before reading your question, let me state:
1. I am an engineer, not a mathematician. So, my following statements may not be as precise as some would like. However, I believe it should be perfectly clear what I am stating:
The STANDARD UNIVERSAL APPROXIMATOR single hidden layer regression net has
1. A nonlinear hidden layer transfer function
2. A LINEAR output layer transfer function
I'm stating this because it is obvious that some believe that, for a universal approximator, the standard output transfer function has to be nonlinear.
Of course there are additional conditions on finiteness, etc., which I have omitted, but I think I have made my point.
Hope this helps,
Greg
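In symbols, the standard single-hidden-layer approximator Greg describes is
f(x) = W2 * sigma(W1*x + b1) + b2
with sigma nonlinear (e.g. tanh or a sigmoid) and a purely linear output layer, which is consistent with the identity-output fully connected layer observed in the question.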


Accepted Answer

Activations are added as a separate layer, and in R2017a there is only the ReLU layer (see reluLayer).
Custom layers have not been introduced yet, so you'd have to be hacking or masking the toolbox files, but that's fine. You could take a copy of the ReLU layer classes and modify them, or just edit your MATLAB install directly if you think that's safe.
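For illustration, the substitution amounts to replacing the rectification with tanh in both passes. A sketch of the two methods to edit in the copied files (the surrounding internal class scaffolding differs between releases, so treat the exact signatures as assumptions):
function [Z, memory] = forward(~, X)
    Z = tanh(X);    % elementwise tanh instead of max(0, X)
    memory = [];
end
function dLdX = backward(~, ~, Z, dLdZ, ~)
    % d/dx tanh(x) = 1 - tanh(x)^2, expressed via the stored output Z
    dLdX = (1 - Z.^2) .* dLdZ;
end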

10 Comments

That is exactly what I was looking for, thank you! I'm already at work modifying the necessary files, copied from other layers, for my own implementation.
Thanks again!
I attempted to do this by adding a tanh equivalent of all the ReLU files in the correct locations. However, I receive the error "Undefined function or variable 'tanhLayer'". Could MATLAB be blocking my custom files?
All good. If you add and run the files in an administrator instance of MATLAB, there is no refusal. This leads me to my next question: where does the ReLU function get squashed between [-1, 1]? I would like to change this for different activation functions.
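For reference, ReLU is not squashed to [-1, 1] anywhere: its forward pass is an elementwise clamp with range [0, Inf). A sketch of the one line that changes for a different activation:
Z = max(0, X);  % ReLU: clamps negatives to zero, range [0, Inf)
Z = tanh(X);    % tanh alternative: squashes smoothly into (-1, 1)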
Dear all, I am also working on a CNN that uses fully connected layers as output stages. Especially if you set up a regression network, in my opinion it doesn't make any sense to use several subsequent fully connected layers at the end if they are all linear. However, this is done in many examples. @Joss Knight: Would you mind providing such an adapted layer class?
Thanks, Stephan
Hi, it seems even R2018a doesn't have a sigmoid layer. When can we expect one?
I just found this: one can code a custom deep learning layer as shown in the example below. I am now making my first attempt at coding a SigmoidLayer. I hope you find this useful.
https://ww2.mathworks.cn/help/nnet/ug/define-custom-deep-learning-layer.html#mw_178590f1-d46a-4578-9180-959671d36505
Hi Ayomi, here is my self-coded sigmoid layer, following the instructions at your link. It does not seem to work as well as expected, though. Do you have your sigmoid code ready? Let's do some comparisons...
classdef sigmoidLayer < nnet.layer.Layer
    methods
        function layer = sigmoidLayer(name)
            % Set layer name
            if nargin == 2
                layer.Name = name;
            end
            % Set layer description
            layer.Description = 'sigmoidLayer';
        end
        function Z = predict(~,X)
            % Forward input data through the layer and output the result
            Z = exp(X)./(exp(X)+1);
        end
        function dLdX = backward(~,X, ~,dLdZ,~)
            % Backward propagate the derivative of the loss function through
            % the layer
            dLdX = X.*(1-X) .* dLdZ;
        end
    end
end
Forgot to say: since I do a lot of normalization within the layers and between the channels, the outputs of the neurons are in 0~1, which makes the sigmoid function not as good as ReLU. This might be the reason I did not get results as good as with ReLU, and why MathWorks has not developed a sigmoid layer.
@wenyi I think the backprop has to be: dLdX = Z.*(1-Z).*dLdZ;
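For context on that correction: if Z = 1/(1 + exp(-X)), then dZ/dX = Z*(1 - Z), so the chain rule gives dLdX = Z.*(1-Z).*dLdZ. Writing the gradient in terms of the input X, as in the code above, computes the wrong quantity.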
@wenyi Thank you for your work. However, it has a few mistakes in it.
I know that the topic is old, but I am sure this can help some people, so I am posting my code for the sigmoid layer, based on the versions from @wenyi and @Balakrishnan_Rajan. There was also a mistake with the "~".
classdef sigmoidLayer < nnet.layer.Layer
    methods
        function layer = sigmoidLayer(name)
            % Set layer name if one is given (nargin == 1 when a name is passed)
            if nargin == 1
                layer.Name = name;
            end
            % Set layer description
            layer.Description = 'sigmoidLayer';
        end
        function Z = predict(layer, X)
            % Forward input data through the layer and output the result.
            % 1./(1+exp(-X)) is the numerically safe form of the sigmoid:
            % it stays finite even when exp(X) would overflow.
            Z = 1./(1 + exp(-X));
        end
        function dLdX = backward(layer, X, Z, dLdZ, memory)
            % Backward propagate the derivative of the loss function
            % through the layer; sigmoid'(X) = Z.*(1-Z)
            dLdX = Z.*(1-Z) .* dLdZ;
        end
    end
end
This is accepted by checkLayer.
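As a usage sketch (checkLayer was introduced in R2018a; the input size and the ObservationDimension option here are examples, not requirements of the class):
layer = sigmoidLayer('sig1');
checkLayer(layer, [28 28 1], 'ObservationDimension', 4)  % validate the layer
layers = [ ...
    imageInputLayer([28 28 1])
    fullyConnectedLayer(10)
    sigmoidLayer('sig1')
    softmaxLayer()
    classificationLayer()];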

