# CompactClassificationSVM

Compact support vector machine (SVM) for one-class and binary classification

## Description

`CompactClassificationSVM` is a compact version of the support vector machine (SVM) classifier. The compact classifier does not include the data used for training the SVM classifier. Therefore, you cannot perform some tasks, such as cross-validation, using the compact classifier. Use a compact SVM classifier for tasks such as predicting the labels of new data.

## Creation

Create a `CompactClassificationSVM` model from a full, trained `ClassificationSVM` classifier by using `compact`.

## Properties


### SVM Properties

#### Alpha

Trained classifier coefficients, specified as an s-by-1 numeric vector. s is the number of support vectors in the trained classifier, `sum(Mdl.IsSupportVector)`.

`Alpha` contains the trained classifier coefficients from the dual problem, that is, the estimated Lagrange multipliers. If you remove duplicates by using the `RemoveDuplicates` name-value pair argument of `fitcsvm`, then for a given set of duplicate observations that are support vectors, `Alpha` contains one coefficient corresponding to the entire set. That is, MATLAB® attributes a nonzero coefficient to one observation from the set of duplicates and a coefficient of `0` to all other duplicate observations in the set.

Data Types: `single` | `double`

#### Beta

Linear predictor coefficients, specified as a numeric vector. The length of `Beta` is equal to the number of predictors used to train the model.

MATLAB expands categorical variables in the predictor data using full dummy encoding. That is, MATLAB creates one dummy variable for each level of each categorical variable. `Beta` stores one value for each predictor variable, including the dummy variables. For example, if there are three predictors, one of which is a categorical variable with three levels, then `Beta` is a numeric vector containing five values.
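As a consequence, the length of `Beta` equals the number of expanded predictor columns. A minimal sketch of this check, assuming a trained linear-kernel SVM model `Mdl`:

```
% Beta stores one coefficient per expanded (dummy-encoded) predictor column.
numel(Mdl.Beta) == numel(Mdl.ExpandedPredictorNames)   % returns logical 1
```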

If `KernelParameters.Function` is `'linear'`, then the classification score for the observation x is

$f(x) = (x/s)'\beta + b.$

`Mdl` stores β, b, and s in the properties `Beta`, `Bias`, and `KernelParameters.Scale`, respectively.

To estimate classification scores manually, you must first apply any transformations to the predictor data that were applied during training. Specifically, if you specify `'Standardize',true` when using `fitcsvm`, then you must standardize the predictor data manually by using the mean `Mdl.Mu` and standard deviation `Mdl.Sigma`, and then divide the result by the kernel scale in `Mdl.KernelParameters.Scale`.
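For example, this sketch reproduces the linear-kernel score for a single row of predictor values `x`, assuming `Mdl` was trained with `'Standardize',true`:

```
% Manually compute f(x) = (x/s)'*beta + b for a linear-kernel model.
xs = (x - Mdl.Mu)./Mdl.Sigma;           % standardize using the training mean and sigma
xs = xs./Mdl.KernelParameters.Scale;    % divide by the kernel scale s
f  = xs*Mdl.Beta + Mdl.Bias;            % positive-class score
% f matches the second column of scores returned by [~,scores] = predict(Mdl,x).
```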

All SVM functions, such as `resubPredict` and `predict`, apply any required transformation before estimation.

If `KernelParameters.Function` is not `'linear'`, then `Beta` is empty (`[]`).

Data Types: `single` | `double`

#### Bias

Bias term, specified as a scalar.

Data Types: `single` | `double`

#### KernelParameters

Kernel parameters, specified as a structure array. The kernel parameters property contains the fields listed in this table.

| Field | Description |
| --- | --- |
| `Function` | Kernel function used to compute the elements of the Gram matrix. For details, see `'KernelFunction'`. |
| `Scale` | Kernel scale parameter used to scale all elements of the predictor data on which the model is trained. For details, see `'KernelScale'`. |

To display the values of `KernelParameters`, use dot notation. For example, `Mdl.KernelParameters.Scale` displays the kernel scale parameter value.

The software accepts `KernelParameters` as inputs and does not modify them.

Data Types: `struct`

#### SupportVectorLabels

Support vector class labels, specified as an s-by-1 numeric vector. s is the number of support vectors in the trained classifier, `sum(Mdl.IsSupportVector)`.

A value of `+1` in `SupportVectorLabels` indicates that the corresponding support vector is in the positive class (`ClassNames{2}`). A value of `–1` indicates that the corresponding support vector is in the negative class (`ClassNames{1}`).

If you remove duplicates by using the `RemoveDuplicates` name-value pair argument of `fitcsvm`, then for a given set of duplicate observations that are support vectors, `SupportVectorLabels` contains one unique support vector label.

Data Types: `single` | `double`

#### SupportVectors

Support vectors in the trained classifier, specified as an s-by-p numeric matrix. s is the number of support vectors in the trained classifier, `sum(Mdl.IsSupportVector)`, and p is the number of predictor variables in the predictor data.

`SupportVectors` contains rows of the predictor data `X` that MATLAB considers to be support vectors. If you specify `'Standardize',true` when training the SVM classifier using `fitcsvm`, then `SupportVectors` contains the standardized rows of `X`.

If you remove duplicates by using the `RemoveDuplicates` name-value pair argument of `fitcsvm`, then for a given set of duplicate observations that are support vectors, `SupportVectors` contains one unique support vector.
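If you need the support vectors on the original scale of the predictor data, you can undo the standardization. A minimal sketch, assuming `Mdl` was trained with `'Standardize',true`:

```
% Recover support vectors in the original predictor units by undoing (X - Mu)./Sigma.
SV = Mdl.SupportVectors.*Mdl.Sigma + Mdl.Mu;
```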

Data Types: `single` | `double`

### Other Classification Properties

#### CategoricalPredictors

Categorical predictor indices, specified as a vector of positive integers. `CategoricalPredictors` contains index values corresponding to the columns of the predictor data that contain categorical predictors. If none of the predictors are categorical, then this property is empty (`[]`).

Data Types: `double`

#### ClassNames

Unique class labels used in training, specified as a categorical or character array, logical or numeric vector, or cell array of character vectors. `ClassNames` has the same data type as the class labels `Y`. (The software treats string arrays as cell arrays of character vectors.) `ClassNames` also determines the class order.

Data Types: `single` | `double` | `logical` | `char` | `cell` | `categorical`

#### Cost

Misclassification cost, specified as a numeric square matrix, where `Cost(i,j)` is the cost of classifying a point into class `j` if its true class is `i`.

During training, the software updates the prior probabilities by incorporating the penalties described in the cost matrix.

• For two-class learning, `Cost` always has this form: `Cost(i,j) = 1` if `i ~= j`, and `Cost(i,j) = 0` if `i = j`. The rows correspond to the true class and the columns correspond to the predicted class. The order of the rows and columns of `Cost` corresponds to the order of the classes in `ClassNames`.

• For one-class learning, `Cost = 0`.

For more details, see Algorithms.
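For example, this sketch (using hypothetical predictor data `X` and class labels `Y`) shows that a custom cost matrix is folded into the prior probabilities rather than stored in `Cost`:

```
% Penalize misclassifying the first class ('b') twice as heavily.
Mdl = fitcsvm(X,Y,'ClassNames',{'b','g'},'Cost',[0 2; 1 0]);
Mdl.Cost    % stored as the default [0 1; 1 0]
Mdl.Prior   % prior probabilities adjusted to reflect the penalty
```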

Data Types: `double`

#### ExpandedPredictorNames

Expanded predictor names, specified as a cell array of character vectors.

If the model uses dummy variable encoding for categorical variables, then `ExpandedPredictorNames` includes the names that describe the expanded variables. Otherwise, `ExpandedPredictorNames` is the same as `PredictorNames`.
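A minimal sketch, assuming a hypothetical table `Tbl` that contains two numeric predictors, one three-level categorical predictor, and a response variable named `Y`:

```
% Dummy expansion adds one expanded name per categorical level.
Mdl = fitcsvm(Tbl,'Y');
Mdl.PredictorNames            % 3 names, one per original predictor
Mdl.ExpandedPredictorNames    % 5 names, one per column after dummy expansion
```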

Data Types: `cell`

#### Mu

Predictor means, specified as a numeric vector. If you specify `'Standardize',1` or `'Standardize',true` when you train an SVM classifier using `fitcsvm`, then the length of `Mu` is equal to the number of predictors.

MATLAB expands categorical variables in the predictor data using full dummy encoding. That is, MATLAB creates one dummy variable for each level of each categorical variable. `Mu` stores one value for each predictor variable, including the dummy variables. However, MATLAB does not standardize the columns that contain categorical variables.

If you set `'Standardize',false` when you train the SVM classifier using `fitcsvm`, then `Mu` is an empty vector (`[]`).

Data Types: `single` | `double`

#### PredictorNames

Predictor variable names, specified as a cell array of character vectors. The order of the elements of `PredictorNames` corresponds to the order in which the predictor names appear in the training data.

Data Types: `cell`

#### Prior

Prior probabilities for each class, specified as a numeric vector. The order of the elements of `Prior` corresponds to the elements of `Mdl.ClassNames`.

For two-class learning, if you specify a cost matrix, then the software updates the prior probabilities by incorporating the penalties described in the cost matrix.

For more details, see Algorithms.

Data Types: `single` | `double`

#### ScoreTransform

Score transformation, specified as a character vector or function handle. `ScoreTransform` represents a built-in transformation function or a function handle for transforming predicted classification scores.

To change the score transformation function to `function`, for example, use dot notation.

• For a built-in function, enter a character vector.

`Mdl.ScoreTransform = 'function';`

This table describes the available built-in functions.

| Value | Description |
| --- | --- |
| `'doublelogit'` | 1/(1 + e^(–2x)) |
| `'invlogit'` | log(x / (1 – x)) |
| `'ismax'` | Sets the score for the class with the largest score to 1, and sets the scores for all other classes to 0 |
| `'logit'` | 1/(1 + e^(–x)) |
| `'none'` or `'identity'` | x (no transformation) |
| `'sign'` | –1 for x < 0; 0 for x = 0; 1 for x > 0 |
| `'symmetric'` | 2x – 1 |
| `'symmetricismax'` | Sets the score for the class with the largest score to 1, and sets the scores for all other classes to –1 |
| `'symmetriclogit'` | 2/(1 + e^(–x)) – 1 |

• For a MATLAB function or a function that you define, enter its function handle.

`Mdl.ScoreTransform = @function;`

`function` must accept a matrix (the original scores) and return a matrix of the same size (the transformed scores).
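For example, this sketch sets a hypothetical logistic transformation and then predicts with the compact model, assuming new predictor data `XNew`:

```
% Apply a custom, element-wise score transformation before prediction.
Mdl.ScoreTransform = @(s)1./(1 + exp(-s));   % accepts a matrix, returns a matrix of the same size
[labels,scores] = predict(Mdl,XNew);         % scores pass through the handle
```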

Data Types: `char` | `function_handle`

#### Sigma

Predictor standard deviations, specified as a numeric vector.

If you specify `'Standardize',true` when you train the SVM classifier using `fitcsvm`, then the length of `Sigma` is equal to the number of predictor variables.

MATLAB expands categorical variables in the predictor data using full dummy encoding. That is, MATLAB creates one dummy variable for each level of each categorical variable. `Sigma` stores one value for each predictor variable, including the dummy variables. However, MATLAB does not standardize the columns that contain categorical variables.

If you set `'Standardize',false` when you train the SVM classifier using `fitcsvm`, then `Sigma` is an empty vector (`[]`).

Data Types: `single` | `double`

## Object Functions

| Function | Description |
| --- | --- |
| `compareHoldout` | Compare accuracies of two classification models using new data |
| `discardSupportVectors` | Discard support vectors for linear support vector machine (SVM) classifier |
| `edge` | Find classification edge for support vector machine (SVM) classifier |
| `fitPosterior` | Fit posterior probabilities for compact support vector machine (SVM) classifier |
| `incrementalLearner` | Convert binary classification support vector machine (SVM) model to incremental learner |
| `lime` | Local interpretable model-agnostic explanations (LIME) |
| `loss` | Find classification error for support vector machine (SVM) classifier |
| `margin` | Find classification margins for support vector machine (SVM) classifier |
| `partialDependence` | Compute partial dependence |
| `plotPartialDependence` | Create partial dependence plot (PDP) and individual conditional expectation (ICE) plots |
| `predict` | Classify observations using support vector machine (SVM) classifier |
| `shapley` | Shapley values |
| `update` | Update model parameters for code generation |

## Examples


### Reduce Size of SVM Classifiers

Reduce the size of a full support vector machine (SVM) classifier by removing the training data. Full SVM classifiers (that is, `ClassificationSVM` classifiers) hold the training data. To improve efficiency, use a smaller classifier.

Load the `ionosphere` data set.

`load ionosphere`

Train an SVM classifier. Standardize the predictor data and specify the order of the classes.

```
SVMModel = fitcsvm(X,Y,'Standardize',true,...
    'ClassNames',{'b','g'})
```
```
SVMModel = 

  ClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: 'none'
          NumObservations: 351
                    Alpha: [90x1 double]
                     Bias: -0.1343
         KernelParameters: [1x1 struct]
                       Mu: [1x34 double]
                    Sigma: [1x34 double]
           BoxConstraints: [351x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [351x1 logical]
                   Solver: 'SMO'

  Properties, Methods
```

`SVMModel` is a `ClassificationSVM` classifier.

Reduce the size of the SVM classifier.

`CompactSVMModel = compact(SVMModel)`
```
CompactSVMModel = 

  CompactClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: 'none'
                    Alpha: [90x1 double]
                     Bias: -0.1343
         KernelParameters: [1x1 struct]
                       Mu: [1x34 double]
                    Sigma: [1x34 double]
           SupportVectors: [90x34 double]
      SupportVectorLabels: [90x1 double]

  Properties, Methods
```

`CompactSVMModel` is a `CompactClassificationSVM` classifier.

Display the amount of memory used by each classifier.

`whos('SVMModel','CompactSVMModel')`
```
  Name                 Size            Bytes  Class                                                 Attributes

  CompactSVMModel      1x1             31058  classreg.learning.classif.CompactClassificationSVM
  SVMModel             1x1            141148  ClassificationSVM
```

The full SVM classifier (`SVMModel`) is more than four times larger than the compact SVM classifier (`CompactSVMModel`).

To label new observations efficiently, you can remove `SVMModel` from the MATLAB® Workspace, and then pass `CompactSVMModel` and new predictor values to `predict`.
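For example, with `XNew` a hypothetical matrix of new observations that has the same 34 predictors:

```
clear SVMModel                             % remove the full model from the workspace
labels = predict(CompactSVMModel,XNew);    % classify the new observations
```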

To further reduce the size of the compact SVM classifier, use the `discardSupportVectors` function to discard support vectors.
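Because `SVMModel` was trained with the default linear kernel, the compact model supports discarding support vectors. A minimal sketch:

```
% Prediction then relies on Beta and Bias instead of the stored support vectors.
CompactSVMModel = discardSupportVectors(CompactSVMModel);
isempty(CompactSVMModel.SupportVectors)   % returns logical 1
```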

### Train and Cross-Validate SVM Classifiers

Load the `ionosphere` data set.

`load ionosphere`

Train and cross-validate an SVM classifier. Standardize the predictor data and specify the order of the classes.

```
rng(1); % For reproducibility
CVSVMModel = fitcsvm(X,Y,'Standardize',true,...
    'ClassNames',{'b','g'},'CrossVal','on')
```
```
CVSVMModel = 

  ClassificationPartitionedModel
    CrossValidatedModel: 'SVM'
         PredictorNames: {1x34 cell}
           ResponseName: 'Y'
        NumObservations: 351
                  KFold: 10
              Partition: [1x1 cvpartition]
             ClassNames: {'b'  'g'}
         ScoreTransform: 'none'

  Properties, Methods
```

`CVSVMModel` is a `ClassificationPartitionedModel` cross-validated SVM classifier. By default, the software implements 10-fold cross-validation.

Alternatively, you can cross-validate a trained `ClassificationSVM` classifier by passing it to `crossval`.
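A minimal sketch, assuming a trained full classifier such as `SVMModel` from the previous example:

```
CVSVMModel2 = crossval(SVMModel);   % 10-fold cross-validation by default
```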

Inspect one of the trained folds using dot notation.

`CVSVMModel.Trained{1}`
```
ans = 

  CompactClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: 'none'
                    Alpha: [78x1 double]
                     Bias: -0.2209
         KernelParameters: [1x1 struct]
                       Mu: [1x34 double]
                    Sigma: [1x34 double]
           SupportVectors: [78x34 double]
      SupportVectorLabels: [78x1 double]

  Properties, Methods
```

Each fold is a `CompactClassificationSVM` classifier trained on 90% of the data.

Estimate the generalization error.

`genError = kfoldLoss(CVSVMModel)`
```genError = 0.1168 ```

On average, the generalization error is approximately 12%.
