Neural Network Toolbox - Backpropagation stopping criteria

Haider Ali on 21 Mar 2015
Commented: Greg Heath on 25 Apr 2015
I am using the Neural Network Toolbox to classify data of 12 alarms into 9 classes, with one hidden layer containing 8 neurons. I wanted to know:
  1. What equations does the training algorithm traingdm use to update the weights and biases? Are they the same as those below, where $\eta$ is the learning rate (0.7) and $\alpha$ is the momentum coefficient (0.9)?
$$\Delta w_{ji}(n) = \eta\,\delta_j x_i + \alpha\,\Delta w_{ji}(n-1)$$
where $\delta_j$ for the output layer is
$$\delta_j = (t_j - y_j)\,y_j(1 - y_j)$$
while for the hidden layer it is
$$\delta_j = y_j(1 - y_j)\sum_k \delta_k w_{kj}$$
These equations are taken directly from the attached paper (the standard delta-rule updates with momentum for logistic units).
2. What does the stopping criterion net.trainParam.goal mean? Which field should I update if I want my stopping criterion to be a mean squared error of 0.0001? Do I need to set net.trainParam.min_grad to 0.0001 for this?
3. How are the weights updated in traingdm? Is it batch updating (once per epoch, over the whole training set) or incremental updating after every input pattern within each epoch?
4. I have 41 training input patterns. How many of those are used for the training process and how many for the recall (validation/test) process? What if I want all 41 of them to be used only for training?
5. I have tried the following code but the outputs are not being classified accurately.
clear all; close all; clc;
p = [
1 0 0 0 0 0 0 0 0 0 0 0; ... %c1
1 0 1 0 0 0 0 0 0 0 0 0; ...
1 0 1 1 0 0 0 0 0 0 0 0; ...
1 0 1 0 1 0 0 0 0 0 0 0; ...
1 0 1 0 0 0 0 0 0 1 0 0; ...
1 0 1 1 1 0 0 0 0 0 0 0; ...
1 0 1 0 1 1 0 0 0 1 0 0; ...
1 0 1 0 1 0 0 0 0 1 0 0; ...
1 0 1 1 0 0 0 0 0 1 0 0; ...
1 0 1 0 1 1 1 0 0 0 0 0; ...
1 0 1 0 1 1 0 1 0 0 0 0; ...
1 0 1 1 1 0 0 0 0 1 0 0; ...
0 1 0 0 0 0 0 0 0 0 0 0; ... %c2
0 0 0 0 0 0 0 0 0 0 0 0; ...
0 0 0 1 0 0 0 0 0 0 0 0; ...
0 0 0 0 1 0 0 0 0 0 0 0; ...
0 0 0 0 0 0 0 0 0 1 0 0; ...
0 0 0 1 1 0 0 0 0 0 0 0; ...
0 0 0 0 1 1 0 0 0 1 0 0; ...
0 0 0 0 1 0 0 0 0 1 0 0; ...
0 0 0 1 0 0 0 0 0 1 0 0; ...
0 0 0 0 1 1 1 0 0 0 0 0; ...
0 0 0 0 1 1 0 1 0 0 0 0; ...
0 0 0 1 1 0 0 0 0 1 0 0; ...
0 0 0 1 0 0 0 0 0 0 0 0; ... %c3
0 0 0 0 1 0 0 0 0 0 0 0; ... %c4 or c5
0 0 0 0 1 1 0 0 0 0 0 0; ...
0 0 0 0 1 1 1 0 0 0 0 0; ...
0 0 0 0 1 1 0 1 0 0 0 0; ...
0 0 0 0 0 1 0 0 0 0 0 0; ... %c6
0 0 0 0 0 1 1 0 0 0 0 0; ...
0 0 0 0 0 1 0 1 0 0 0 0; ...
0 0 0 0 0 0 0 1 0 0 0 0; ... %c7
0 0 0 0 0 0 0 0 1 0 0 0; ... %c8
0 0 0 0 0 0 0 0 0 0 1 0; ...
0 0 0 0 0 0 0 0 1 1 0 0; ...
0 0 0 0 0 0 0 0 0 0 1 1; ...
0 0 0 0 0 0 0 0 1 0 1 0; ...
0 0 0 0 0 0 0 0 0 0 0 1; ... %c9
0 0 1 0 0 0 0 0 0 0 0 0; ... %c1 or c2
0 0 0 0 0 0 0 0 0 1 0 0; ... %c1 or c2 or c3
]';
t = [
1 0 0 0 0 0 0 0 0; ...
1 0 0 0 0 0 0 0 0; ...
1 0 0 0 0 0 0 0 0; ...
1 0 0 0 0 0 0 0 0; ...
1 0 0 0 0 0 0 0 0; ...
1 0 0 0 0 0 0 0 0; ...
1 0 0 0 0 0 0 0 0; ...
1 0 0 0 0 0 0 0 0; ...
1 0 0 0 0 0 0 0 0; ...
1 0 0 0 0 0 0 0 0;...
1 0 0 0 0 0 0 0 0; ...
1 0 0 0 0 0 0 0 0; ...
0 1 0 0 0 0 0 0 0; ... %c2
0 1 0 0 0 0 0 0 0; ...
0 1 0 0 0 0 0 0 0; ...
0 1 0 0 0 0 0 0 0; ...
0 1 0 0 0 0 0 0 0; ...
0 1 0 0 0 0 0 0 0; ...
0 1 0 0 0 0 0 0 0; ...
0 1 0 0 0 0 0 0 0; ...
0 1 0 0 0 0 0 0 0; ...
0 1 0 0 0 0 0 0 0; ...
0 1 0 0 0 0 0 0 0; ...
0 1 0 0 0 0 0 0 0; ...
0 0 1 0 0 0 0 0 0; ... %c3
0 0 0 1 1 0 0 0 0; ... %c4 or c5
0 0 0 1 1 0 0 0 0; ...
0 0 0 1 1 0 0 0 0; ...
0 0 0 1 1 0 0 0 0; ...
0 0 0 0 0 1 0 0 0; ... %c6
0 0 0 0 0 1 0 0 0; ...
0 0 0 0 0 1 0 0 0; ...
0 0 0 0 0 0 1 0 0; ... %c7
0 0 0 0 0 0 0 1 0; ... %c8
0 0 0 0 0 0 0 1 0; ...
0 0 0 0 0 0 0 1 0; ...
0 0 0 0 0 0 0 1 0; ...
0 0 0 0 0 0 0 1 0; ...
0 0 0 0 0 0 0 0 1; ... %c9
1 1 0 0 0 0 0 0 0; ... %c1 or c2
1 1 1 0 0 0 0 0 0; ... %c1 or c2 or c3
]';
net = feedforwardnet(8,'traingdm'); % one hidden layer with 8 neurons; gradient descent with momentum
net = configure(net,p,t);
net.layers{2}.transferFcn = 'logsig'; % sigmoid function in output layer
net.layers{1}.transferFcn = 'logsig'; % sigmoid function in hidden layer
net.performFcn = 'mse';
net = init(net);
net.trainParam.epochs = 100000; % the number of epochs is not my concern, hence a large value
net.trainParam.lr = 0.7; % obtained from the attached paper
net.trainParam.mc = 0.9; % obtained from the attached paper
net.trainParam.max_fail = 100000;
net.trainParam.min_grad = 0.00015; % is this stopping criterion the same as the mse goal?
net = train(net,p,t);
view(net);
Let me know if something else needs to be specified. Regards.
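P.S. For question 2, my current guess is the line below, but please correct me if this is wrong:
net.trainParam.goal = 0.0001; % my guess: training stops once the mean squared error falls to 0.0001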
  1 Comment
Greg Heath on 25 Apr 2015
% Target columns should sum to 1
% If targets are mutually exclusive there is only one "1"
% init(net) unnecessary because of configure
NO MITIGATION FOR OVERTRAINING AN OVERFIT NET:
1. max_epoch is HUGE
2. mse goal not specified ==> default of 0
3. no validation stopping
4. no regularization (trainbr)
Hope this helps.
Greg
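For example (illustrative values, not tuned for this data):
net.trainParam.epochs = 1000; % 1. a modest epoch budget
net.trainParam.goal = 0.001; % 2. an explicit mse goal instead of the default 0
net.trainParam.max_fail = 6; % 3. re-enable validation stopping (the default dividerand ratios keep a validation set)
% 4. or use Bayesian regularization instead: net = feedforwardnet(8,'trainbr');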


Accepted Answer

Greg Heath on 21 Mar 2015
If you are going to use MATLAB, I suggest using as many defaults as possible.
1. Use PATTERNNET for classification
2. To see the default settings, type into the command line WITHOUT AN ENDING SEMICOLON
net = patternnet % default H = 10
3. If a vector can belong to m of c classes,
a. The c-dimensional unit target vector should contain
i. m positive components that sum to 1
ii. c-m components of value 0
4. Typically, the only things that need to be varied are
a. H, the number of hidden nodes
b. The initial weights
5. The best way to do this is
a. Initialize the RNG
b. Use an outer loop over the number of hidden nodes
c. Use an inner loop over random weight initializations
d. For example (see the fleshed-out sketch at the end of this answer)
Ntrials = 10
rng('default')
j = 0
for h = Hmin:dH:Hmax
    j = j+1
    ...
    for i = 1:Ntrials
        ...
    end
end
6. See the NEWSGROUP and ANSWERS for examples. Search with
greg patternnet Ntrials
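A fleshed-out version of the 5d skeleton might look like the sketch below, where Hmin, dH, Hmax and the performance bookkeeping are placeholders to adapt:
Ntrials = 10;
rng('default') % reproducible weight initializations
Hmin = 2; dH = 2; Hmax = 10; % placeholder search range for H
j = 0;
for h = Hmin:dH:Hmax % outer loop over the number of hidden nodes
    j = j + 1;
    for i = 1:Ntrials % inner loop over random weight initializations
        net = patternnet(h);
        net = train(net,p,t);
        y = net(p);
        perf(j,i) = perform(net,t,y); % record performance for each (h, trial) pair
    end
end
Then keep the (h, trial) combination with the best recorded performance.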
Hope this helps
Thank you for formally accepting my answer
Greg
  8 Comments
Haider Ali on 30 Mar 2015
Hi Greg,
There is only one hidden layer containing 8 neurons. The author has not mentioned the train/validate/test ratio.
I am now using the Iris Data Set to train my NN using Back Propagation (just for my own understanding and testing). The code is below:
clear all;
close all;
clc;
p = [
5.1,3.5,1.4,0.2; %iris data set
4.9,3.0,1.4,0.2;
4.7,3.2,1.3,0.2;
4.6,3.1,1.5,0.2;
5.0,3.6,1.4,0.2;
5.4,3.9,1.7,0.4;
4.6,3.4,1.4,0.3;
5.0,3.4,1.5,0.2;
4.4,2.9,1.4,0.2;
4.9,3.1,1.5,0.1;
5.4,3.7,1.5,0.2;
4.8,3.4,1.6,0.2;
4.8,3.0,1.4,0.1;
4.3,3.0,1.1,0.1;
5.8,4.0,1.2,0.2;
5.7,4.4,1.5,0.4;
5.4,3.9,1.3,0.4;
5.1,3.5,1.4,0.3;
5.7,3.8,1.7,0.3;
5.1,3.8,1.5,0.3;
5.4,3.4,1.7,0.2;
5.1,3.7,1.5,0.4;
4.6,3.6,1.0,0.2;
5.1,3.3,1.7,0.5;
4.8,3.4,1.9,0.2;
5.0,3.0,1.6,0.2;
5.0,3.4,1.6,0.4;
5.2,3.5,1.5,0.2;
5.2,3.4,1.4,0.2;
4.7,3.2,1.6,0.2;
4.8,3.1,1.6,0.2;
5.4,3.4,1.5,0.4;
5.2,4.1,1.5,0.1;
5.5,4.2,1.4,0.2;
4.9,3.1,1.5,0.1;
5.0,3.2,1.2,0.2;
5.5,3.5,1.3,0.2;
4.9,3.1,1.5,0.1;
4.4,3.0,1.3,0.2;
5.1,3.4,1.5,0.2;
5.0,3.5,1.3,0.3;
4.5,2.3,1.3,0.3;
4.4,3.2,1.3,0.2;
5.0,3.5,1.6,0.6;
5.1,3.8,1.9,0.4;
4.8,3.0,1.4,0.3;
5.1,3.8,1.6,0.2;
4.6,3.2,1.4,0.2;
5.3,3.7,1.5,0.2;
5.0,3.3,1.4,0.2;
7.0,3.2,4.7,1.4;
6.4,3.2,4.5,1.5;
6.9,3.1,4.9,1.5;
5.5,2.3,4.0,1.3;
6.5,2.8,4.6,1.5;
5.7,2.8,4.5,1.3;
6.3,3.3,4.7,1.6;
4.9,2.4,3.3,1.0;
6.6,2.9,4.6,1.3;
5.2,2.7,3.9,1.4;
5.0,2.0,3.5,1.0;
5.9,3.0,4.2,1.5;
6.0,2.2,4.0,1.0;
6.1,2.9,4.7,1.4;
5.6,2.9,3.6,1.3;
6.7,3.1,4.4,1.4;
5.6,3.0,4.5,1.5;
5.8,2.7,4.1,1.0;
6.2,2.2,4.5,1.5;
5.6,2.5,3.9,1.1;
5.9,3.2,4.8,1.8;
6.1,2.8,4.0,1.3;
6.3,2.5,4.9,1.5;
6.1,2.8,4.7,1.2;
6.4,2.9,4.3,1.3;
6.6,3.0,4.4,1.4;
6.8,2.8,4.8,1.4;
6.7,3.0,5.0,1.7;
6.0,2.9,4.5,1.5;
5.7,2.6,3.5,1.0;
5.5,2.4,3.8,1.1;
5.5,2.4,3.7,1.0;
5.8,2.7,3.9,1.2;
6.0,2.7,5.1,1.6;
5.4,3.0,4.5,1.5;
6.0,3.4,4.5,1.6;
6.7,3.1,4.7,1.5;
6.3,2.3,4.4,1.3;
5.6,3.0,4.1,1.3;
5.5,2.5,4.0,1.3;
5.5,2.6,4.4,1.2;
6.1,3.0,4.6,1.4;
5.8,2.6,4.0,1.2;
5.0,2.3,3.3,1.0;
5.6,2.7,4.2,1.3;
5.7,3.0,4.2,1.2;
5.7,2.9,4.2,1.3;
6.2,2.9,4.3,1.3;
5.1,2.5,3.0,1.1;
5.7,2.8,4.1,1.3;
6.3,3.3,6.0,2.5;
5.8,2.7,5.1,1.9;
7.1,3.0,5.9,2.1;
6.3,2.9,5.6,1.8;
6.5,3.0,5.8,2.2;
7.6,3.0,6.6,2.1;
4.9,2.5,4.5,1.7;
7.3,2.9,6.3,1.8;
6.7,2.5,5.8,1.8;
7.2,3.6,6.1,2.5;
6.5,3.2,5.1,2.0;
6.4,2.7,5.3,1.9;
6.8,3.0,5.5,2.1;
5.7,2.5,5.0,2.0;
5.8,2.8,5.1,2.4;
6.4,3.2,5.3,2.3;
6.5,3.0,5.5,1.8;
7.7,3.8,6.7,2.2;
7.7,2.6,6.9,2.3;
6.0,2.2,5.0,1.5;
6.9,3.2,5.7,2.3;
5.6,2.8,4.9,2.0;
7.7,2.8,6.7,2.0;
6.3,2.7,4.9,1.8;
6.7,3.3,5.7,2.1;
7.2,3.2,6.0,1.8;
6.2,2.8,4.8,1.8;
6.1,3.0,4.9,1.8;
6.4,2.8,5.6,2.1;
7.2,3.0,5.8,1.6;
7.4,2.8,6.1,1.9;
7.9,3.8,6.4,2.0;
6.4,2.8,5.6,2.2;
6.3,2.8,5.1,1.5;
6.1,2.6,5.6,1.4;
7.7,3.0,6.1,2.3;
6.3,3.4,5.6,2.4;
6.4,3.1,5.5,1.8;
6.0,3.0,4.8,1.8;
6.9,3.1,5.4,2.1;
6.7,3.1,5.6,2.4;
6.9,3.1,5.1,2.3;
5.8,2.7,5.1,1.9;
6.8,3.2,5.9,2.3;
6.7,3.3,5.7,2.5;
6.7,3.0,5.2,2.3;
6.3,2.5,5.0,1.9;
6.5,3.0,5.2,2.0;
6.2,3.4,5.4,2.3;
5.9,3.0,5.1,1.8;
]';
t = [
0; %assign 0 to output neuron for Iris-setosa
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0;
0.5; %assign 0.5 to output neuron for Iris-versicolor
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;
1; %assign 1 to output neuron for Iris-virginica
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
1;
]';
net = feedforwardnet(3,'traingd'); % one hidden layer with 3 neurons; plain gradient descent
net = configure(net,p,t);
net.layers{2}.transferFcn = 'logsig'; % sigmoid function in output layer
net.layers{1}.transferFcn = 'logsig'; % sigmoid function in hidden layer
net.performFcn = 'mse';
net = init(net);
net.trainParam.epochs = 10000;
net.trainParam.lr = 0.7; %learning rate
net.trainParam.goal = 0.01; %mse
net = train(net,p,t);
view(net);
The problem is that I am not getting the desired output for the first class, for which the output should be close to zero. When I feed a vector from the first class into the trained net, the output is close to 0.5 instead.
This is the output for the first vector of the first class:
output = net([5.1,3.5,1.4,0.2]')
output =
0.5003
This output should be close to zero (because I have assigned 0 to the first class), but it comes out as 0.5. This is the case for all inputs of the first class. For the second and third classes, the outputs are fine, i.e. close to 0.5 for class 2 and close to 1.0 for class 3.
Can you please run this code and tell me what I am doing wrong?
(I think it might be an issue with the bias input, because all the outputs for class 1 are offset by 0.5.)
Regards.
Greg Heath on 25 Apr 2015
%GEH1: LOUSY TARGET CODING
%GEH2: traingd instead of traingdm
%GEH3: logsig output INVALID for default mapminmax [-1 1] scaling
Hope this helps
Greg
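A sketch of those three fixes (not tested on this data): use one-of-c targets and let the patternnet defaults handle the output scaling:
tc = [ones(1,50), 2*ones(1,50), 3*ones(1,50)]; % class indices for the 150 iris rows
t3 = full(ind2vec(tc)); % 3 x 150 one-of-c target matrix (GEH1)
net = patternnet(3,'traingdm'); % momentum training as in the original plan (GEH2)
net = train(net,p,t3); % patternnet defaults avoid the logsig/mapminmax clash (GEH3)
classes = vec2ind(net(p)); % predicted class index for each input column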


More Answers (0)
