Choosing the Best Machine Learning Classification Model and Avoiding Overfitting

Chapter 2

Refining Your Model

The Classification Learner app in MATLAB® removes much of the hassle of figuring out which model works best. You can also use MATLAB to evaluate how well your model performs and to verify your results.

Using Classification Learner, you can perform common machine learning tasks such as interactively exploring your data, selecting features, specifying validation schemes, training multiple models in parallel, and assessing results.

With the app you can:

  • Assess classifier performance using confusion matrices, ROC curves, or scatter plots
  • Compare model accuracy using the misclassification rate on the validation set
  • Improve model accuracy with advanced options and feature selection
  • Export the best model to the workspace to make predictions on new data
  • Generate MATLAB code to train classifiers on new data
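The same assessment steps can also be run at the command line. The following is a minimal sketch, not the code the app generates: it assumes the Fisher iris sample data that ships with Statistics and Machine Learning Toolbox, a decision tree as the classifier, and 5-fold cross-validation as the validation scheme. The variable names and the `newFlower` measurement are illustrative assumptions.

```matlab
% Load Fisher's iris data: meas (150x4 double) and species (150x1 cell array)
load fisheriris

rng(1);   % fix the random seed so the cross-validation folds are reproducible

% Train a decision tree with 5-fold cross-validation
cvTree = fitctree(meas, species, 'KFold', 5);

% Misclassification rate measured on the held-out folds
valError = kfoldLoss(cvTree);
fprintf('Validation misclassification rate: %.3f\n', valError);

% Confusion matrix built from the out-of-fold predictions
predicted = kfoldPredict(cvTree);
[C, classOrder] = confusionmat(species, predicted);
disp(classOrder');   % class labels, in the row/column order of C
disp(C);             % rows are true classes, columns are predicted classes

% Train a final model on all the data and predict a new observation
finalTree = fitctree(meas, species);
newFlower = [5.9 3.0 5.1 1.8];   % hypothetical measurement, for illustration only
label = predict(finalTree, newFlower);
disp(label);
```

A model exported from the app is a structure rather than a classifier object. If it is exported under the default name `trainedModel`, predictions on a new table `T` with the same predictor variables are typically made with `yfit = trainedModel.predictFcn(T)`.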

Statistics and Machine Learning Toolbox Apps

The Classification Learner app in MATLAB shows different machine learning models fit to a data set. A scatter plot shows which points were classified correctly and which were classified incorrectly.

Find the Classification Learner app, along with the apps for your other installed products, in the Apps tab in the MATLAB Toolstrip.
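The app can also be opened from the command line. The sketch below assumes a recent release in which `classificationLearner` accepts a table and a response variable name; in older releases, calling `classificationLearner` with no arguments opens the app and the data is then chosen in the New Session dialog. The table and variable names are illustrative assumptions.

```matlab
% Build a table from the Fisher iris sample data
load fisheriris
irisTable = array2table(meas, 'VariableNames', ...
    {'SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth'});
irisTable.Species = species;   % response variable

% Open the app with the table and response pre-selected
% (assumes a release that supports this two-argument syntax)
classificationLearner(irisTable, 'Species')
```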

The Classification Learner app supports:

  • Discriminant analysis classifiers: linear, quadratic
  • Decision trees: fine tree, medium tree, coarse tree
  • Support vector machines: linear SVM, fine Gaussian SVM, medium Gaussian SVM, coarse Gaussian SVM, quadratic SVM, cubic SVM
  • Nearest neighbor classifiers: fine KNN, medium KNN, coarse KNN, cosine KNN, cubic KNN, weighted KNN
  • Naive Bayes classifiers: Gaussian NB, kernel NB
  • Ensemble classifiers: boosted trees (AdaBoost, RUSBoost), bagged trees, subspace KNN, subspace discriminant
  • Neural network classifiers: narrow, medium, wide, bilayered, trilayered
  • Kernel approximation classifiers: SVM kernel, logistic regression
  • Logistic regression classifier
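Most of these model families have a command-line fitting function in Statistics and Machine Learning Toolbox. The sketch below trains one model from several families on the same cross-validation partition and prints each validation misclassification rate. It is an illustration under stated assumptions, not a reproduction of the app's presets: the iris data, the choice of 5 folds, the hyperparameters, and the display names are all assumptions. Because `fitcsvm` handles only two classes, the three-class iris problem is wrapped in `fitcecoc` with SVM learners.

```matlab
load fisheriris
rng(1);                                    % reproducible folds and bagging
cvp = cvpartition(species, 'KFold', 5);    % same stratified folds for every model

names = {'Linear discriminant'; 'Decision tree'; 'Linear SVM (ECOC)'; ...
         'KNN (k = 10)'; 'Gaussian naive Bayes'; 'Bagged trees'};

cv = cell(numel(names), 1);
cv{1} = fitcdiscr(meas, species, 'CVPartition', cvp);
cv{2} = fitctree(meas, species, 'CVPartition', cvp);
svmTemplate = templateSVM('KernelFunction', 'linear');
cv{3} = fitcecoc(meas, species, 'Learners', svmTemplate, 'CVPartition', cvp);
cv{4} = fitcknn(meas, species, 'NumNeighbors', 10, 'CVPartition', cvp);
cv{5} = fitcnb(meas, species, 'CVPartition', cvp);
cv{6} = fitcensemble(meas, species, 'Method', 'Bag', 'CVPartition', cvp);

% Compare models by their misclassification rate on the held-out folds
losses = zeros(numel(cv), 1);
for i = 1:numel(cv)
    losses(i) = kfoldLoss(cv{i});
    fprintf('%-22s %.3f\n', names{i}, losses(i));
end
```

Using one shared `cvpartition` object keeps the comparison fair, since every model is validated on exactly the same held-out observations.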

Screenshot of Classification Learner app.

See how the Classification Learner app works