How to select the number of samples to train a Machine Learning algorithm?
I am working with a dataset of 12,000 samples covering about 5 years of an industrial process.
It is likely that the plant changed during this time (equipment, a gradual drop in performance, chemical products).
Is there a tool for identifying the best subset of this data? In my view, a temporal cut in the data could improve the quality of the resulting models.
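One way to test the temporal-cut idea is to hold out the most recent samples, train on progressively shorter histories, and compare the holdout error. The sketch below is only an illustration under stated assumptions: the data is synthetic (a hypothetical gain change at sample 8000), the model is a one-feature least-squares line rather than whatever model you actually use, and the names `best_temporal_cut`, `candidate_starts` and `n_holdout` are made up for this example.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single feature."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx > 0 else 0.0
    return a, my - a * mx

def mse(model, xs, ys):
    """Mean squared error of the fitted line on (xs, ys)."""
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def best_temporal_cut(xs, ys, candidate_starts, n_holdout):
    """Train on xs[start:split] for each start, score on the last n_holdout samples.

    Samples are assumed to be in chronological order.
    Returns (best_start, scores) where scores maps start -> holdout MSE.
    """
    split = len(xs) - n_holdout
    x_val, y_val = xs[split:], ys[split:]
    scores = {}
    for start in candidate_starts:
        model = fit_line(xs[start:split], ys[start:split])
        scores[start] = mse(model, x_val, y_val)
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic stand-in for the plant data (assumption): the process gain
# changes from 2.0 to 3.0 at sample 8000.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(12000)]
ys = [(2.0 if i < 8000 else 3.0) * x + rng.gauss(0, 0.5)
      for i, x in enumerate(xs)]

best_start, scores = best_temporal_cut(
    xs, ys, candidate_starts=[0, 2000, 4000, 6000, 8000, 10000], n_holdout=1000
)
print("best start index:", best_start)
for start in sorted(scores):
    print(start, round(scores[start], 3))
```

If the holdout error drops sharply once the training window starts after a certain date, that suggests the older data describes a different regime. If the error is flat, the cut is probably not buying anything and the extra samples are worth keeping.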
3 Comments
Greg Heath
on 4 Feb 2019
As a common-sense rule of thumb, I try to use at least 10 to 30 times as many training points as there are unknown parameters to be estimated.
In addition, I train from 10 to 20 different sets of random initial weights.
I assume, of course, that you have examined plots of the data to inform your common sense.
Hope this helps.
Greg
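The rule of thumb above can be turned into a quick calculation. This is a rough sketch that assumes a single-hidden-layer network with bias terms, so the parameter count is (I + 1)·H + (H + 1)·O. The 8 inputs and 1 output are hypothetical placeholders, not values from the question, and the function names are invented for this example.

```python
def mlp_param_count(n_inputs, n_hidden, n_outputs):
    """Weights plus biases of a single-hidden-layer network (assumed architecture)."""
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs

def min_training_samples(n_params, ratio=10):
    """Training points suggested by the 'ratio times as many points as parameters' rule."""
    return ratio * n_params

def max_hidden_units(n_train, n_inputs, n_outputs, ratio=10):
    """Largest hidden-layer size H such that ratio * params(H) <= n_train."""
    h = 0
    while mlp_param_count(n_inputs, h + 1, n_outputs) * ratio <= n_train:
        h += 1
    return h

# Hypothetical problem size: 8 inputs, 1 output, 10 hidden units.
n_params = mlp_param_count(8, 10, 1)
print(n_params)                                  # 101
print(min_training_samples(n_params, ratio=10))  # 1010
print(min_training_samples(n_params, ratio=30))  # 3030

# With all 12000 samples used for training:
print(max_hidden_units(12000, 8, 1, ratio=30))   # 39
print(max_hidden_units(12000, 8, 1, ratio=10))   # 119
```

Read the other way round, the same numbers show how far the data could be cut before the rule is violated: under these assumptions a 10-hidden-unit model would still have enough training points with roughly 1000 to 3000 samples.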
Answers (1)
BERGHOUT Tarek
on 3 Feb 2019
You can use deep belief networks; they are among the best choices for feature selection and mapping. Train your network on chunks of data by randomly choosing (input, target) pairs, and at the same time pay attention to your approximation function: you must make sure the error function settles at a local minimum. Deep belief nets rely on a stack of layers pretrained one at a time (classically restricted Boltzmann machines; stacked autoencoders are a closely related design), which makes it possible to tune all of the network's parameters with a small amount of training data.
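The "random chunks of (input, target) pairs" idea is essentially mini-batch training. The sketch below shows only that part, and it deliberately swaps in a plain linear model for the deep belief network to keep the example short. The data is synthetic, and `minibatch_sgd`, `batch_size`, `lr` and the other names are assumptions made for illustration.

```python
import random

def minibatch_sgd(xs, ys, batch_size=32, epochs=20, lr=0.1, seed=0):
    """Fit y = w*x + b by mini-batch gradient descent on squared error.

    Each epoch shuffles the sample indices and walks through them in
    chunks, so every chunk is a random set of (input, target) pairs.
    Returns (w, b, history), where history holds the full-data MSE
    after each epoch so convergence can be checked.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    idx = list(range(len(xs)))
    history = []
    for _ in range(epochs):
        rng.shuffle(idx)
        for s in range(0, len(idx), batch_size):
            batch = idx[s:s + batch_size]
            gw = gb = 0.0
            for i in batch:
                err = w * xs[i] + b - ys[i]
                gw += err * xs[i]
                gb += err
            w -= lr * gw / len(batch)
            b -= lr * gb / len(batch)
        history.append(
            sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
        )
    return w, b, history

# Synthetic data (assumption): y = 3x + 1 plus small noise.
rng = random.Random(1)
xs = [rng.uniform(-1, 1) for _ in range(1000)]
ys = [3.0 * x + 1.0 + rng.gauss(0, 0.1) for x in xs]

w, b, history = minibatch_sgd(xs, ys)
print(round(w, 2), round(b, 2))
print([round(h, 4) for h in history])
```

Watching `history` flatten out is the simple version of checking that the error has settled into a minimum. For a real network the error surface is not convex, which is why the earlier advice about trying several random initial weight sets still applies.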