ClassificationPartitionedKernelECOC

Cross-validated kernel error-correcting output codes (ECOC) model for multiclass classification

Description

ClassificationPartitionedKernelECOC is an error-correcting output codes (ECOC) model composed of kernel classification models, trained on cross-validated folds. Estimate the quality of the classification by cross-validation using one or more “kfold” functions: kfoldPredict, kfoldLoss, kfoldMargin, and kfoldEdge.

Every “kfold” method uses models trained on training-fold (in-fold) observations to predict the response for validation-fold (out-of-fold) observations. For example, suppose that you cross-validate using five folds. In this case, the software randomly assigns each observation to one of five groups of roughly equal size. Each training fold contains four of the groups (roughly 4/5 of the data), and the corresponding validation fold contains the remaining group (roughly 1/5 of the data). Cross-validation then proceeds as follows:

The software trains the first model (stored in CVMdl.Trained{1}) by using the observations in the last four groups and reserves the observations in the first group for validation.
The software trains the second model (stored in CVMdl.Trained{2}) using the observations in the first group and the last three groups. The software reserves the observations in the second group for validation.
The software proceeds in a similar fashion for the third, fourth, and fifth models.

If you validate by using kfoldPredict, the software computes predictions for the observations in group i by using the ith model. In short, the software estimates a response for every observation by using the model trained without that observation.
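
The fold-by-fold procedure above can be sketched by hand. The following is a minimal illustration, assuming the Fisher iris data and five folds; the variable names `manualLabels` and `agreement` are hypothetical. It predicts each validation group with the compact model trained without that group, and then compares the result with kfoldPredict. The two sets of labels are expected to agree, but the sketch does not rely on that.

```matlab
load fisheriris
rng(1); % For reproducibility
CVMdl = fitcecoc(meas,species,'Learners','kernel','KFold',5);

% Predict each validation fold with the model trained on the other folds.
manualLabels = cell(CVMdl.NumObservations,1);
for i = 1:CVMdl.KFold
    idx = test(CVMdl.Partition,i);   % logical index of group i (validation fold)
    manualLabels(idx) = predict(CVMdl.Trained{i},meas(idx,:));
end

% Compare with the labels that kfoldPredict returns.
labels = kfoldPredict(CVMdl);
agreement = mean(strcmp(manualLabels,labels))
```

Because the object does not store the predictor data, the manual loop has to index into the original matrix `meas`.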

Note

ClassificationPartitionedKernelECOC model objects do not store the predictor data set.

Creation

You can create a ClassificationPartitionedKernelECOC model by training an ECOC model using fitcecoc and specifying these name-value pair arguments:

'Learners' – Set the value to 'kernel', a template object returned by templateKernel, or a cell array of such template objects.
One of the arguments 'CrossVal', 'CVPartition', 'Holdout', 'KFold', or 'Leaveout'.

For more details, see fitcecoc.
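
As a minimal sketch of the two requirements above, the following code combines 'Learners','kernel' with one cross-validation argument, here 'CVPartition'. The iris data and the choice of five folds are assumptions made for illustration.

```matlab
load fisheriris
rng(1); % For reproducibility

% Define a stratified 5-fold partition, then pass it to fitcecoc.
c = cvpartition(species,'KFold',5);
CVMdl = fitcecoc(meas,species,'Learners','kernel','CVPartition',c);

modelName = CVMdl.CrossValidatedModel
numFolds = CVMdl.KFold
```

Passing 'KFold',5 instead of 'CVPartition',c requests the same number of folds without creating the partition explicitly.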

Properties

Cross-Validation Properties

`CrossValidatedModel` — Cross-validated model name
character vector

This property is read-only.

Cross-validated model name, specified as a character vector.

For example, 'KernelECOC' specifies a cross-validated kernel ECOC model.

Data Types: char

`KFold` — Number of cross-validated folds
positive integer scalar

This property is read-only.

Number of cross-validated folds, specified as a positive integer scalar.

Data Types: double

`ModelParameters` — Cross-validation parameter values
object

This property is read-only.

Cross-validation parameter values, specified as an object. The parameter values correspond to the name-value pair argument values used to cross-validate the ECOC classifier. ModelParameters does not contain estimated parameters.

You can access the properties of ModelParameters using dot notation.

`NumObservations` — Number of observations
positive numeric scalar

This property is read-only.

Number of observations in the training data, specified as a positive numeric scalar.

Data Types: double

`Partition` — Data partition
`cvpartition` model

This property is read-only.

Data partition indicating how the software splits the data into cross-validation folds, specified as a cvpartition model.
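
For illustration, the partition can be inspected with the usual cvpartition properties and functions. This sketch assumes the iris data and the default 10-fold cross-validation; the variable names are hypothetical.

```matlab
load fisheriris
rng(1); % For reproducibility
CVMdl = fitcecoc(meas,species,'Learners','kernel','CrossVal','on');

c = CVMdl.Partition;
numFolds = c.NumTestSets      % number of validation folds
foldSizes = c.TestSize        % number of observations in each validation fold

% Logical indices of the training-fold and validation-fold observations for fold 1.
trainIdx = training(c,1);
testIdx = test(c,1);
overlap = any(trainIdx & testIdx)   % the two index sets should not overlap
```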

`Trained` — Compact classifiers trained on cross-validation folds
cell array of `CompactClassificationECOC` models

This property is read-only.

Compact classifiers trained on cross-validation folds, specified as a cell array of CompactClassificationECOC models. Trained has k cells, where k is the number of folds.
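
A short sketch of how to reach the per-fold models, assuming the iris data and default settings. The names `firstFold` and `learners` are hypothetical.

```matlab
load fisheriris
rng(1); % For reproducibility
CVMdl = fitcecoc(meas,species,'Learners','kernel','CrossVal','on');

numModels = numel(CVMdl.Trained)      % one compact ECOC model per fold
firstFold = CVMdl.Trained{1};         % model trained without the first validation fold
learners = firstFold.BinaryLearners;  % binary learners inside that ECOC model
learnerClass = class(learners{1})
```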

Data Types: cell

`W` — Observation weights
numeric vector

This property is read-only.

Observation weights used to cross-validate the model, specified as a numeric vector. W has NumObservations elements.

The software normalizes the weights used for training so that sum(W,'omitnan') is 1.

Data Types: single | double

`Y` — Observed class labels
categorical array | character array | logical vector | numeric vector | cell array of character vectors

This property is read-only.

Observed class labels used to cross-validate the model, specified as a categorical or character array, logical or numeric vector, or cell array of character vectors. Y has NumObservations elements and has the same data type as the input argument Y that you pass to fitcecoc to cross-validate the model. (The software treats string arrays as cell arrays of character vectors.)

Each row of Y represents the observed classification of the corresponding row of the predictor data.

ECOC Properties

`BinaryLoss` — Binary learner loss function
`'hinge'` | `'quadratic'`

This property is read-only.

Binary learner loss function, specified as a character vector representing the loss function name.

By default, if all binary learners are kernel classification models using SVM, then BinaryLoss is 'hinge'. If all binary learners are kernel classification models using logistic regression, then BinaryLoss is 'quadratic'. To potentially increase accuracy, specify a binary loss function other than the default during a prediction or loss computation by using the BinaryLoss name-value argument of kfoldPredict or kfoldLoss.

For the list of supported binary loss functions, see Binary Loss.
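
As an illustration of overriding the default, the following sketch computes the cross-validated loss twice: once with the stored BinaryLoss value and once with 'logit'. The iris data and the choice of 'logit' are assumptions; which binary loss gives the lower error depends on the data.

```matlab
load fisheriris
rng(1); % For reproducibility
CVMdl = fitcecoc(meas,species,'Learners','kernel','CrossVal','on');

defaultLossName = CVMdl.BinaryLoss                  % 'hinge' is expected for SVM learners
lossDefault = kfoldLoss(CVMdl)                      % uses the stored BinaryLoss
lossLogit = kfoldLoss(CVMdl,'BinaryLoss','logit')   % overrides it for this call only
```

The override applies only to the call in which it appears. The BinaryLoss property of CVMdl is read-only and remains unchanged.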

Data Types: char

`BinaryY` — Binary learner class labels
numeric matrix | `[]`

This property is read-only.

Binary learner class labels, specified as a numeric matrix or [].

If the coding matrix is the same across all folds, then BinaryY is a NumObservations-by-L matrix, where L is the number of binary learners (size(CodingMatrix,2)).

The elements of BinaryY are -1, 0, and 1, and the values correspond to dichotomous class assignments. This table describes how learner j assigns observation k to a dichotomous class corresponding to the value of BinaryY(k,j).

Value	Dichotomous Class Assignment
`-1`	Learner `j` assigns observation `k` to a negative class.
`0`	Before training, learner `j` removes observation `k` from the data set.
`1`	Learner `j` assigns observation `k` to a positive class.

If the coding matrix varies across folds, then BinaryY is empty ([]).

Data Types: double

`CodingMatrix` — Codes specifying class assignments
numeric matrix | `[]`

This property is read-only.

Codes specifying class assignments for the binary learners, specified as a numeric matrix or [].

If the coding matrix is the same across all folds, then CodingMatrix is a K-by-L matrix, where K is the number of classes and L is the number of binary learners.

The elements of CodingMatrix are -1, 0, and 1, and the values correspond to dichotomous class assignments. This table describes how learner j assigns observations in class i to a dichotomous class corresponding to the value of CodingMatrix(i,j).

Value	Dichotomous Class Assignment
`-1`	Learner `j` assigns observations in class `i` to a negative class.
`0`	Before training, learner `j` removes observations in class `i` from the data set.
`1`	Learner `j` assigns observations in class `i` to a positive class.

If the coding matrix varies across folds, then CodingMatrix is empty ([]). You can obtain the coding matrix for each fold by using the Trained property. For example, CVMdl.Trained{1}.CodingMatrix is the coding matrix in the first fold of the cross-validated ECOC model CVMdl.
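
The relationship between CodingMatrix and BinaryY can be sketched as a row lookup: the binary labels of observation k are the coding-matrix row of its class. The following illustration assumes the iris data and that the coding matrix is the same across folds, so that neither property is empty. The names `classIdx` and `expectedBinaryY` are hypothetical.

```matlab
load fisheriris
rng(1); % For reproducibility
CVMdl = fitcecoc(meas,species,'Learners','kernel','CrossVal','on');

codes = CVMdl.CodingMatrix                       % K-by-L matrix of -1, 0, and 1

% Map each observed label to its row in ClassNames, then look up the codes.
[~,classIdx] = ismember(CVMdl.Y,CVMdl.ClassNames);
expectedBinaryY = codes(classIdx,:);             % NumObservations-by-L

% Compare with the stored property (assumes BinaryY is not empty).
sameAsStored = isequal(expectedBinaryY,CVMdl.BinaryY)
```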

Data Types: double | single | int8 | int16 | int32 | int64

Other Classification Properties

`CategoricalPredictors` — Categorical predictor indices
vector of positive integers | `[]`

This property is read-only.

Categorical predictor indices, specified as a vector of positive integers. CategoricalPredictors contains index values indicating that the corresponding predictors are categorical. The index values are between 1 and p, where p is the number of predictors used to train the model. If none of the predictors are categorical, then this property is empty ([]).

Data Types: single | double

`ClassNames` — Unique class labels
categorical array | character array | logical vector | numeric vector | cell array of character vectors

This property is read-only.

Unique class labels used in training, specified as a categorical or character array, logical or numeric vector, or cell array of character vectors. ClassNames has the same data type as the observed class labels property Y and determines the class order.

`Cost` — Misclassification costs
square numeric matrix

This property is read-only.

Misclassification costs, specified as a square numeric matrix. Cost has K rows and columns, where K is the number of classes.

Cost(i,j) is the cost of classifying a point into class j if its true class is i. The order of the rows and columns of Cost corresponds to the order of the classes in ClassNames.
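
Because the rows and columns of Cost follow the order of ClassNames, a cost can be looked up by class name. This is a minimal sketch that assumes the iris data; the class pair and the variable names are chosen only for illustration.

```matlab
load fisheriris
rng(1); % For reproducibility
CVMdl = fitcecoc(meas,species,'Learners','kernel','CrossVal','on');

trueClass = 'versicolor';
predictedClass = 'virginica';

% Row index i is the true class; column index j is the predicted class.
i = find(strcmp(CVMdl.ClassNames,trueClass));
j = find(strcmp(CVMdl.ClassNames,predictedClass));
c = CVMdl.Cost(i,j)
```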

Data Types: double

`PredictorNames` — Predictor names
cell array of character vectors

This property is read-only.

Predictor names in order of their appearance in the predictor data, specified as a cell array of character vectors. The length of PredictorNames is equal to the number of columns used as predictor variables in the training data X or Tbl.

Data Types: cell

`Prior` — Prior class probabilities
numeric vector

This property is read-only.

Prior class probabilities, specified as a numeric vector. Prior has as many elements as there are classes in ClassNames, and the order of the elements corresponds to the elements of ClassNames.

Data Types: double

`ResponseName` — Response variable name
character vector

This property is read-only.

Response variable name, specified as a character vector.

Data Types: char

`ScoreTransform` — Score transformation function to apply to predicted scores
`'none'`

This property is read-only.

Score transformation function to apply to the predicted scores, specified as 'none'. An ECOC model does not support score transformation.

Object Functions

`kfoldEdge`	Classification edge for cross-validated kernel ECOC model
`kfoldLoss`	Classification loss for cross-validated kernel ECOC model
`kfoldMargin`	Classification margins for cross-validated kernel ECOC model
`kfoldPredict`	Classify observations in cross-validated kernel ECOC model
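
For orientation, this sketch calls kfoldMargin and kfoldEdge on the same cross-validated model. It assumes the iris data and default settings; the variable names `m` and `e` are hypothetical.

```matlab
load fisheriris
rng(1); % For reproducibility
CVMdl = fitcecoc(meas,species,'Learners','kernel','CrossVal','on');

m = kfoldMargin(CVMdl);   % one out-of-fold margin per observation
e = kfoldEdge(CVMdl)      % scalar summary of the margins

% A negative margin indicates a misclassified out-of-fold observation.
numNegative = sum(m < 0)
```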

Examples

Cross-Validate Multiclass Kernel Classification Model

Create a cross-validated, multiclass kernel ECOC classification model using fitcecoc.

Load Fisher's iris data set. X contains flower measurements, and Y contains the names of flower species.

load fisheriris
X = meas;
Y = species;

Cross-validate a multiclass kernel ECOC classification model that can identify the species of a flower based on the flower's measurements.

rng(1); % For reproducibility
CVMdl = fitcecoc(X,Y,'Learners','kernel','CrossVal','on')

CVMdl = 
  ClassificationPartitionedKernelECOC
    CrossValidatedModel: 'KernelECOC'
           ResponseName: 'Y'
        NumObservations: 150
                  KFold: 10
              Partition: [1x1 cvpartition]
             ClassNames: {'setosa'  'versicolor'  'virginica'}
         ScoreTransform: 'none'

CVMdl is a ClassificationPartitionedKernelECOC cross-validated model. fitcecoc implements 10-fold cross-validation by default. Therefore, CVMdl.Trained is a 10-by-1 cell array of CompactClassificationECOC models, one for each fold. Each compact ECOC model is composed of binary kernel classification models.

Estimate the classification error by passing CVMdl to kfoldLoss.

err = kfoldLoss(CVMdl) % Named err to avoid shadowing the built-in error function

err = 0.0333

The estimated classification error is about 3%; that is, the model misclassifies roughly 3% of the out-of-fold observations.
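
As a cross-check, the misclassification rate can also be computed directly from the out-of-fold labels. This sketch assumes the iris data with equal observation weights, in which case the manual rate is expected to match the value that kfoldLoss reports. The variable names are hypothetical.

```matlab
load fisheriris
rng(1); % For reproducibility
CVMdl = fitcecoc(meas,species,'Learners','kernel','CrossVal','on');

labels = kfoldPredict(CVMdl);                 % out-of-fold predicted labels
manualErr = mean(~strcmp(labels,CVMdl.Y))     % fraction of misclassified observations
lossErr = kfoldLoss(CVMdl)                    % cross-validated classification error
```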

To change default options when training ECOC models composed of kernel classification models, create a kernel classification model template using templateKernel, and then pass the template to fitcecoc.
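
A minimal sketch of this workflow, assuming logistic regression learners and 256 expansion dimensions; both choices are illustrative and are not recommendations.

```matlab
load fisheriris
rng(1); % For reproducibility

% Kernel classification template with nondefault options.
t = templateKernel('Learner','logistic','NumExpansionDimensions',256);

CVMdl = fitcecoc(meas,species,'Learners',t,'CrossVal','on');
binaryLossName = CVMdl.BinaryLoss   % 'quadratic' is the documented default for logistic learners
err = kfoldLoss(CVMdl)
```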

Version History

Introduced in R2018b
