Hello world! I have recently been studying neural networks, so I may be asking something obvious: I have found that when I replicate my inputs and outputs and then train the network for pattern recognition, it is far more accurate than with the original data. I did this in order to replicate some of the extreme values I have. Can that make my network overfit? Thank you, everyone.

Accepted Answer

Greg Heath
Greg Heath on 10 Apr 2016
Edited: Greg Heath on 19 Apr 2016

0 votes

1. I don't understand your question.
2. a. OVERFITTING means there are more unknown weights, Nw, than independent training equations, Ntrneq (i.e., Nw > Ntrneq).
b. OVERTRAINING an overfit net CAN LEAD to loss of performance on NONTRAINING data.
3. There are several remedies to prevent OVERTRAINING AN OVERFIT NET. So, in general, overfitting need not be disastrous.
4. Methods for preventing loss of generalization via overtraining an overfit net
a. Do not overfit: Nw < Ntrneq. Preferably, Ntrneq >> Nw, which yields design stability and robustness w.r.t. noise and measurement error.
For example:
i. Increase the number of training examples
ii. Reduce the number of hidden nodes
b. Use VALIDATION STOPPING to prevent overtraining
c. Use the BAYESIAN REGULARIZATION
training function TRAINBR with MSEREG
as a default.
d. Replace the default performance function
MSE with the regularized
modification MSEREG
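A minimal MATLAB sketch of options (b)-(d), assuming the classic Neural Network Toolbox API (fitnet and its defaults); the hidden-layer size of 10 and the msereg ratio are illustrative choices, not from the post:

```matlab
[x, t] = simplefit_dataset;        % built-in demo data, stands in for yours

% (b) Validation stopping: fitnet's default 'dividerand' split already
%     holds out a validation set that triggers early stopping.
net_b = fitnet(10);                % 10 hidden nodes (illustrative)
net_b.divideParam.valRatio = 0.15;

% (c) Bayesian regularization via the trainbr training function.
net_c = fitnet(10, 'trainbr');

% (d) Regularized performance function on an ordinary trainer.
net_d = fitnet(10);
net_d.performFcn = 'msereg';       % ratio*mse + (1-ratio)*msw
net_d.performParam.ratio = 0.9;    % illustrative weighting

net_b = train(net_b, x, t);        % each variant is trained the same way
```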
Hope this helps.
Thank you for formally accepting my answer
Greg

5 comments

Andreas
Andreas on 12 Apr 2016
Dear Dr. Heath, in order to increase the number of training examples I decided to duplicate my data, as I noticed that it increased the accuracy. Is this valid? Can it lead to overtraining? Should I randomly change the duplicated data so that it looks "new" but not totally different? I have followed your posts and your advice has been very helpful and productive. It is always an honour. Thank you.
If you are doing it correctly, duplicating the data should basically yield the same result.
If you ask "What is correctly?", I'll answer:
"Do duplicated training examples remain training examples? Similarly for the val and test examples."
I have the sneaking feeling that some of the data is being used as both training and nontraining data, i.e., the equivalent of using 'dividetrain'!
Obviously, you need to duplicate each of the division subsets separately.
Sometimes when I don't have enough original data, I use the following:
1. Duplicate each subset individually.
a. Loop over multiple trials of adding random noise to the duplicated data for training, validation and testing.
b. Rank the designs based on val performance.
c. Compare the test data results with the performance of the original data.
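Step 1 might look like the following sketch. All variable names, the copy count, and the noise level are my own illustrative choices, not from the post; xtrn/ttrn, xval/tval are assumed to be the already-divided subsets:

```matlab
Ncopies = 5;     % copies per subset (assumption)
sigma   = 0.05;  % jitter std; verify the resulting SNR is high enough

% Duplicate a subset and add noise to the copies only.
dupnoise = @(x) [ x, repmat(x,1,Ncopies) + ...
                  sigma*randn(size(x,1), Ncopies*size(x,2)) ];

xtrn2 = dupnoise(xtrn);   ttrn2 = repmat(ttrn, 1, Ncopies+1);
xval2 = dupnoise(xval);   tval2 = repmat(tval, 1, Ncopies+1);

% Keep the original (noise-free) test set so step 1c compares like with
% like, and use 'divideind' when training so the enlarged training set
% never leaks into validation or test.
```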
When I have enough data I use a double-loop approach (I have posted zillions of examples):
Ntrials = 10
Hmin = ...
dH   = ...
Hmax = ...
% Outer loop over the number of hidden nodes:
j = 0;
rng('default')
for h = Hmin:dH:Hmax
    j = j+1;
    net = ...
    blah, blah, blah
    % Inner loop over different random initial weights:
    for i = 1:Ntrials
        net = configure(...)
        blah, blah
    end
end
Hope this helps.
Greg
Andreas
Andreas on 18 Apr 2016
Edited: Andreas on 18 Apr 2016
Dear Dr. Heath,
so, to see if I got it right: I don't replicate the whole data set and then start training; I first separate the data sets and then replicate each of them individually, e.g.
valInd = repmat(train.val, [100,1]) % etc., etc.
and add some random noise, e.g.
for i = 1:5
    valInd(randi([1,10])) = randi([1,5]);
end
and then I start the training procedure.
Is that correct? Also, could data balancing procedures (e.g., oversampling through SMOTE) add the necessary noise?
Best regards,
Andreas Kampianakis
Greg Heath
Greg Heath on 19 Apr 2016
No.
You add the random noise to the replicated data. Just make sure that the resulting signal to noise ratio is sufficiently high.
Obviously, one good way to approach the problem is as a function of RNG state (which determines the initial random weights) and SNR, given the number of hidden nodes.
However, I have done the reverse, i.e., trained and validated with noisy duplicated data and then used the original data for testing. Again results are presented in terms of the parameter SNR.
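One way to set the noise level to a prescribed SNR (my own formulation; the post only requires that the SNR be "sufficiently high"):

```matlab
SNRdB = 20;                          % illustrative target SNR in dB
xdup  = repmat(x, 1, 5);             % duplicated inputs (illustrative)

Psig   = mean(xdup(:).^2);           % mean signal power
Pnoise = Psig / 10^(SNRdB/10);       % noise power yielding the target SNR
xnoisy = xdup + sqrt(Pnoise)*randn(size(xdup));

% Sanity check: the achieved SNR should be close to SNRdB.
achieved = 10*log10( Psig / mean((xnoisy(:)-xdup(:)).^2) );
```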
Hope this helps.
Greg
Andreas
Andreas on 19 Apr 2016
Dear Dr. Heath,
thank you very much for your time.
Best regards,
Andreas Kampianakis


More Answers (0)


Asked: on 7 Apr 2016
Commented: on 19 Apr 2016
