Machine learning and data normalization - how data should(?) be normalized.
Hello, I have a general question about data normalization for classification algorithms: if I have a training set and a testing set, should I normalize them separately or join them for the normalization step? And what if I later want to use this classifier on a completely new portion of data? Should I keep the extreme values of each feature so I can use them for normalization?
Second question I have: Is normalization really necessary? Does SVM need it?
Thank you in advance for any help. Cheers, Michael
Answers (2)
Mostafa Nakhaei
on 18 Oct 2019
Please note that the best practice in machine learning is to keep the distributions of the training and testing data the same. So, if you want to normalize your data, it is good to normalize the whole dataset first and then split it. That way, your training and testing sets will have the same distribution. The common error is to split the data and then normalize each part individually.
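The approach described in this answer can be sketched roughly as follows. This is only an illustration in plain Python, not code from the thread: the helper names `zscore_columns` and `normalize_then_split`, the toy data, and the choice of z-score scaling are all assumptions.

```python
import random
import statistics

def zscore_columns(rows):
    """Standardize each column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def normalize_then_split(rows, test_fraction=0.25, seed=0):
    """Normalize the whole dataset first, then split it (hypothetical helper)."""
    scaled = zscore_columns(rows)
    indices = list(range(len(scaled)))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_test = int(round(len(scaled) * test_fraction))
    test = [scaled[i] for i in indices[:n_test]]
    train = [scaled[i] for i in indices[n_test:]]
    return train, test

# Toy data (assumed): two features on very different scales.
data = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0], [4.0, 800.0]]
train, test = normalize_then_split(data)
```

Note that with this ordering the scaling parameters are computed from all rows, including those that end up in the test set. The alternative is to estimate the parameters from the training rows only and reuse them for the test rows.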
BERGHOUT Tarek
on 3 Feb 2019
1- You can normalize them separately or together, but the best way is to perform the normalization inside the training function; if you add the normalization function inside the training function, you can use it for any dataset after that.
2- Yes, normalization is always necessary if and only if the activation functions of your training model are bounded; otherwise you don't have to normalize the data.
And for SVM, if the kernel function is bounded, you must normalize your data.
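One way to read point 1 is that the training function should compute and store the scaling parameters, so the same parameters can be applied to any later dataset. The sketch below illustrates that idea with min-max scaling, which also keeps the per-feature extreme values the question asks about. All names (`fit_min_max`, `apply_min_max`, `train_model`, `predict`) and the toy data are hypothetical, and the nearest-centroid rule is only a stand-in for a real classifier such as an SVM.

```python
def fit_min_max(rows):
    """Return the per-feature minimum and maximum of the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(rows, lows, highs):
    """Scale rows with previously stored extremes (constant features map to 0)."""
    return [
        [(x - lo) / (hi - lo) if hi > lo else 0.0
         for x, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def train_model(train_rows, train_labels):
    """Normalize inside training and keep the parameters with the model."""
    lows, highs = fit_min_max(train_rows)
    scaled = apply_min_max(train_rows, lows, highs)
    # Placeholder classifier (assumption): one centroid per class.
    centroids = {}
    for label in set(train_labels):
        members = [r for r, l in zip(scaled, train_labels) if l == label]
        centroids[label] = [sum(c) / len(c) for c in zip(*members)]
    return {"lows": lows, "highs": highs, "centroids": centroids}

def predict(model, rows):
    """Scale new rows with the stored training extremes, then classify."""
    scaled = apply_min_max(rows, model["lows"], model["highs"])

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(model["centroids"],
            key=lambda label: squared_distance(row, model["centroids"][label]))
        for row in scaled
    ]

# Toy data (assumed): two well-separated classes.
train_rows = [[1.0, 100.0], [2.0, 150.0], [8.0, 900.0], [9.0, 1000.0]]
train_labels = ["a", "a", "b", "b"]
model = train_model(train_rows, train_labels)

# Completely new data is scaled with the extremes stored at training time.
new_rows = [[1.5, 120.0], [8.5, 950.0]]
predictions = predict(model, new_rows)
```

Because the extremes come from the training rows, new values outside that range will scale to below 0 or above 1; whether to clip them is a separate design decision that this sketch leaves open.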