sequentialfs

Sequential feature selection using custom criterion

Syntax

tf = sequentialfs(fun,X,y)

tf = sequentialfs(fun,X1,...,XN)

tf = sequentialfs(___,Name,Value)

[tf,history] = sequentialfs(___)

Description

tf = sequentialfs(fun,X,y) selects a subset of features in X that are important for predicting y. The function defines a random nonstratified partition for 10-fold cross-validation using X and y, and then sequentially selects features based on the cross-validated prediction criterion values computed by the fun function. The initial feature set includes no features. sequentialfs adds one feature to the set at each iteration, until adding a feature does not decrease the criterion value by more than the termination tolerance value. The output tf is a logical vector that indicates the selected features. For more details, see Algorithms.

tf = sequentialfs(fun,X1,...,XN) selects a subset of features in X1 by cross-validating the criterion value on the partition defined for X1,...,XN.

tf = sequentialfs(___,Name,Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in the previous syntaxes. For example, specify "Direction","backward" to perform recursive feature elimination (RFE). In that case, the initial feature set includes all features, and sequentialfs removes one feature from the set at each iteration until removing a feature no longer decreases the criterion value.

[tf,history] = sequentialfs(___) also returns information about the feature selection process.

Examples

Forward Feature Selection

Find important features by performing forward sequential feature selection using the wrapper type.

Load the fisheriris data set.

load fisheriris

Display the variables in the data set.

whos

  Name           Size            Bytes  Class     Attributes

  meas         150x4              4800  double              
  species      150x1             19300  cell

The matrix meas contains four measurements from three species of iris flowers for 150 different flowers. The variable species lists the species for each flower.

Specify the predictor data X and the response data y. Define X to include the four measurements and six random variables. Place the measurement variables in columns 1, 3, 5, and 7.

rng("default") % For reproducibility
X = randn(150,10);
X(:,[1 3 5 7])= meas;
y = species;

Define the function handle myfun for an anonymous function that takes four inputs: training data (XTrain and yTrain) and test data (XTest and yTest). The anonymous function trains a classification model by using the training data, and returns a loss value on the test data for the trained model.

myfun = @(XTrain,yTrain,XTest,yTest) ...
  size(XTest,1)*loss(fitcecoc(XTrain,yTrain),XTest,yTest);

The loss function of a classification model object returns an average loss value, but sequentialfs also divides the sum of the criterion values returned by myfun by the total number of test observations. Therefore, the anonymous function must return the loss value multiplied by the number of test observations.

Create a random partition for stratified 10-fold cross-validation.

cv = cvpartition(y,"KFold",10);

Use the sequentialfs function to sequentially select important features in X based on the criterion value returned by myfun. Specify the stratified partition cv by using the CV name-value argument, and set the Display option to "iter" to display information about the feature selection process at each iteration.

opts = statset("Display","iter");
tf = sequentialfs(myfun,X,y,"CV",cv,"Options",opts);

Start forward sequential feature selection:
Initial columns included:  none
Columns that can not be included:  none
Step 1, added column 7, criterion value 0.04
Step 2, added column 5, criterion value 0.0333333
Step 3, added column 1, criterion value 0.0266667
Step 4, added column 3, criterion value 0.0133333
Final columns included:  1 3 5 7

sequentialfs correctly finds the important predictors in columns 1, 3, 5, and 7.

Backward Feature Selection

Find important features by performing backward sequential feature selection, or recursive feature elimination (RFE), using the wrapper type.

Load the hald data set, which measures the effect of cement composition on its hardening heat.

load hald

This data set includes the variables ingredients and heat. The matrix ingredients contains the percent composition of four chemicals present in the cement. The vector heat contains the values for the heat hardening after 180 days for each cement sample.

Use the sequentialfs function to perform backward sequential feature selection based on the criterion value returned by myfun. The code for the helper function myfun appears at the end of this example. Specify the Direction name-value argument as "backward" to include all features in the initial feature set and then sequentially exclude one feature at each iteration. Set the Display option to "iter" to display information about the feature selection process at each iteration.

rng("default") % For reproducibility
opts = statset("Display","iter");
tf = sequentialfs(@myfun,ingredients,heat, ...
    "Direction","backward","Options",opts);

Start backward sequential feature selection:
Initial columns included:  all
Columns that must be included:  none
Step 1, used initial columns, criterion value 12.4989
Step 2, removed column 3, criterion value 6.25866
Final columns included:  1 2 4

sequentialfs excludes the third variable from the features in ingredients.

Helper Function

The myfun function takes four inputs: training data (XTrain and yTrain) and test data (XTest and yTest). The function trains a regression model by using the training data, and returns the sum of squared errors on the test data for the trained model.

function criterion = myfun(XTrain,yTrain,XTest,yTest)
    mdl = fitrlinear(XTrain,yTrain);
    predictedYTest = predict(mdl,XTest);
    e = yTest - predictedYTest;
    criterion = e'*e;
end

Filter Type Feature Selection

Perform filter type feature selection based on the correlation coefficients for the features.

Load the carsmall data set.

load carsmall

Create the feature matrix X containing six variables.

X = [Acceleration Cylinders Displacement ...
    Horsepower Model_Year Weight];

Compute the matrix of the pairwise linear correlation coefficients between each pair of features in X by using the corr function. Specify the Rows name-value argument as "pairwise" to omit any rows containing NaN on a pairwise basis for each two-column correlation coefficient calculation.

corr(X,"Rows","pairwise")

ans = 6×6

    1.0000   -0.6473   -0.6947   -0.6968    0.4843   -0.4879
   -0.6473    1.0000    0.9512    0.8622   -0.6053    0.8844
   -0.6947    0.9512    1.0000    0.9134   -0.5779    0.8895
   -0.6968    0.8622    0.9134    1.0000   -0.6082    0.8733
    0.4843   -0.6053   -0.5779   -0.6082    1.0000   -0.4964
   -0.4879    0.8844    0.8895    0.8733   -0.4964    1.0000

X contains highly correlated features. For example, the correlation between the second and third features (Cylinders and Displacement) is 0.9512.

Use the sequentialfs function to rank the features in X based on the correlation values. Specify these options when you call the sequentialfs function:

- Use the helper function mycorr, which returns the maximum absolute value of the off-diagonal elements in the matrix of correlation coefficients. The code for this helper function appears at the end of this example.
- Specify "Direction","backward" and "NullModel",true so that sequentialfs starts from the initial feature set containing all features and then excludes all features from the set, one feature at a time.
- Specify "CV","none" to perform feature selection without cross-validation.
- Set the Display option to "iter" to display information about the feature selection process at each iteration.

opts = statset("Display","iter");
[~,history] = sequentialfs(@mycorr,X, ...
    "Direction","backward","NullModel",true, ...
    "CV","none","Options",opts);

Start backward sequential feature selection:
Initial columns included:  all
Columns that must be included:  none
Step 1, used initial columns, criterion value 0.951167
Step 2, removed column 3, criterion value 0.884401
Step 3, removed column 6, criterion value 0.862164
Step 4, removed column 4, criterion value 0.647346
Step 5, removed column 2, criterion value 0.484253
Step 6, removed column 1, criterion value 0
Step 7, removed column 5, criterion value 0
Final columns included:  none

sequentialfs returns the structure array history with two fields (In and Crit) containing information about the feature selection process. The In field contains a logical matrix where row i indicates the features selected at iteration i. A true (logical 1) entry in a row indicates that the corresponding feature is in the feature set after the iteration.

history.In

ans = 7×6 logical array

   1   1   1   1   1   1
   1   1   0   1   1   1
   1   1   0   1   1   0
   1   1   0   0   1   0
   1   0   0   0   1   0
   0   0   0   0   1   0
   0   0   0   0   0   0

The Crit field contains the criterion values computed at each iteration.

history.Crit

ans = 1×7

    0.9512    0.8844    0.8622    0.6473    0.4843         0         0

The last two criterion values are zero because the mycorr function returns 0 if the input contains fewer than two features.

Extract the indices of the excluded features from the matrix in the In field.

p = size(X,2);
idx = NaN(1,p);
for i = 1 : p
    idx(i) = find(history.In(i,:)~=history.In(i+1,:));
end
idx

idx = 1×6

     3     6     4     2     1     5

Find the set of features that remain in the feature set at the first iteration whose criterion value is less than 0.8.

threshold = 0.8;
iter_last_exclude = find(history.Crit(2:end)<threshold,1);
idx_selected = idx(iter_last_exclude+1:end)

idx_selected = 1×3

     2     1     5

Compute the correlation coefficient matrix for the selected features.

corr(X(:,idx_selected),"Rows","pairwise")

ans = 3×3

    1.0000   -0.6473   -0.6053
   -0.6473    1.0000    0.4843
   -0.6053    0.4843    1.0000

The absolute values of the off-diagonal elements are less than the threshold value 0.8.

Helper Function

The mycorr function takes a matrix that contains features in columns, and returns the maximum absolute value of the off-diagonal elements in the matrix of correlation coefficients. The off-diagonal elements are the correlations between two distinct features in the input data. Therefore, mycorr returns zero if the input data does not have at least two distinct features.

function criterion = mycorr(X)
    if size(X,2) < 2
        criterion = 0;
    else
        p = size(X,2);
        R = corr(X,"Rows","pairwise");
        R(logical(eye(p))) = NaN;
        criterion = max(abs(R),[],"all");
    end
end

Select Features in Table

Convert a table that contains both numeric and categorical variables to an array by using the onehotencode and table2array functions. Then, select important features in the array by using the sequentialfs function.

Load the carbig data set.

load carbig

This data set contains variables that describe several aspects of cars, such as miles per gallon (MPG), country of origin (Origin), and number of cylinders (Cylinders). You can create a regression model of MPG using the other variables.

Specify the predictor data tblX in a table, and specify the response data y.

tblX = table(Acceleration,Cylinders,Displacement, ...
    Horsepower,Model_Year,Weight,Origin);
y = MPG;

All variables in tblX are numeric except the Origin variable.

One-hot encode the Origin variable by using the onehotencode function.

tblOrigin = table(categorical(string(Origin)));
tblOrigin = onehotencode(tblOrigin);

Remove the Origin variable from tblX, and add the encoded values to tblX.

tblX.Origin = [];
tblX = [tblX tblOrigin];

Convert the table tblX to an array.

X = table2array(tblX);

Define the function handle myfun for an anonymous function that takes four inputs: training data (XTrain and yTrain) and test data (XTest and yTest). The anonymous function trains a regression model by using the training data, and returns a loss value on the test data for the trained model.

myfun = @(XTrain,yTrain,XTest,yTest) ...
  size(XTest,1)*loss(fitrtree(XTrain,yTrain),XTest,yTest);

The loss function of a regression model object returns the mean squared error (MSE), but sequentialfs also divides the sum of the criterion values returned by myfun by the total number of test observations. Therefore, the anonymous function must return the loss value multiplied by the number of test observations.

Use the sequentialfs function to sequentially select important features in X based on the criterion value returned by myfun.

rng("default") % For reproducibility
tf = sequentialfs(myfun,X,y);

Display the variable names of the selected features.

tblX.Properties.VariableNames(tf)'

ans = 6×1 cell
    {'Cylinders'   }
    {'Displacement'}
    {'Model_Year'  }
    {'Weight'      }
    {'Germany'     }
    {'Italy'       }

Input Arguments

`fun` — Function to compute feature selection criterion
function handle

Function to compute the feature selection criterion, specified as a function handle.

For each candidate feature set, sequentialfs computes the cross-validated criterion value by repeatedly calling the fun function as follows:

For each fold (a group of training and test data sets) defined by the CV name-value argument, sequentialfs calls the fun function to get the criterion value for the fold.
sequentialfs divides the sum of the criterion values by the total number of test observations.

If you specify X and y, then the fun function must have this form:

criterion = fun(XTrain,yTrain,XTest,yTest)

The fun function accepts the training data (XTrain and yTrain) and test data (XTest and yTest).
XTrain and XTest contain a subset of the columns of X that corresponds to the current candidate feature set.
The fun function returns a scalar value criterion.
Typically, fun trains a model by using the training data (XTrain, yTrain), predicts response values for XTest, and returns a loss of the predicted values compared to yTest. Common loss measures include the sum of squared errors for regression models and the number of misclassified observations for classification models.
For example, you can define the myFun function as follows, and then specify fun as @myFun.
```
function criterion = myFun(XTrain,yTrain,XTest,yTest)
  mdl = fitcsvm(XTrain,yTrain);
  predictedYTest = predict(mdl,XTest);
  criterion = sum(~strcmp(yTest,predictedYTest));
end
```
Alternatively, you can define the function handle myFunHandle for an anonymous function as follows, and then specify fun as myFunHandle.
```
myFunHandle = @(XTrain,yTrain,XTest,yTest) ...
  loss(fitcsvm(XTrain,yTrain),XTest,yTest)*size(XTest,1);
```
sequentialfs divides the sum of the criterion values returned by fun by the total number of test observations. So, fun must not divide the loss value by the number of test observations. The loss function of a classification or regression object returns an averaged loss value. Therefore, fun must return the loss value multiplied by the number of test observations. If you define the fun function to return the sum of squared errors or the number of misclassified observations, then the cross-validated criterion value is the mean squared error or the misclassification rate, respectively.
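The division described above can be reproduced by hand for a single candidate feature set. The following is a minimal sketch, not the internal implementation of sequentialfs. The candidate columns cols, the variable names, and the choice of fitcdiscr as the classifier are assumptions made for illustration.

```matlab
% Sketch: cross-validated criterion for one candidate feature set.
load fisheriris
rng("default")                         % For reproducibility
cv = cvpartition(species,"KFold",10);  % Stratified 10-fold partition
cols = [3 4];                          % Hypothetical candidate feature set

% fun returns the NUMBER of misclassified test observations (not a rate).
fun = @(XTrain,yTrain,XTest,yTest) ...
    sum(~strcmp(yTest,predict(fitcdiscr(XTrain,yTrain),XTest)));

total = 0;
for k = 1:cv.NumTestSets
    tr = training(cv,k);               % Logical index of training rows
    te = test(cv,k);                   % Logical index of test rows
    total = total + fun(meas(tr,cols),species(tr),meas(te,cols),species(te));
end

% Sum over folds divided by the total number of test observations,
% which yields the misclassification rate.
criterion = total/sum(cv.TestSize)
```

Because fun returns a count, the final division produces a rate between 0 and 1. If fun returned an averaged loss instead, the result would be divided by the number of test observations a second time.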

If you specify X1,...,XN, sequentialfs selects features from X1 only, but otherwise imposes no interpretation on X1,...,XN. In this case, the fun function must have this form:

criterion = fun(X1Train,...,XNTrain,X1Test,...,XNTest)

The fun function accepts the training data (X1Train,...,XNTrain) and test data (X1Test,...,XNTest).
X1Train and X1Test contain a subset of the columns of X1 that corresponds to the current candidate feature set.
The fun function returns a scalar value criterion.
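As an illustration of the X1,...,XN form, the sketch below passes three inputs: a feature matrix, a response vector, and a vector of observation weights. The synthetic data, the weights w, and the handle name wfun are hypothetical. Features are selected from the first input only, and the other two inputs are split along the same partition.

```matlab
% Sketch: three data inputs (features, response, observation weights).
rng("default")                         % For reproducibility
X = randn(100,5);
y = 2*X(:,2) - 3*X(:,4) + 0.1*randn(100,1);
w = rand(100,1) + 0.5;                 % Hypothetical observation weights

% Training inputs come first, then the test inputs, in the same order.
% The criterion is the weighted sum of squared errors on the test data.
wfun = @(XTrain,yTrain,wTrain,XTest,yTest,wTest) ...
    sum(wTest.*(yTest - predict(fitlm(XTrain,yTrain,"Weights",wTrain),XTest)).^2);

tf = sequentialfs(wfun,X,y,w)
```

The output tf has one element per column of X. The response y and the weights w are never candidates for selection.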

Data Types: function_handle

`X` — Feature data
numeric matrix

Feature data, specified as a numeric matrix. The rows of X correspond to observations, and the columns of X correspond to features. X and y must have the same number of rows.

The custom function defined by the fun argument must accept a group of training and test data sets defined by splitting X. For details, see the fun argument and CV name-value argument.

Data Types: single | double

`y` — Responses (labels)
column vector

Responses (labels), specified as a column vector. X and y must have the same number of rows.

The custom function defined by the fun argument must accept a group of training and test data sets defined by splitting y. For details, see the fun argument and CV name-value argument.

`X1,...,XN` — Input data
matrices

Input data, specified as matrices. The matrices must have the same number of rows.

sequentialfs selects features from X1 only, but otherwise imposes no interpretation on X1,...,XN.

The custom function defined by the fun argument must accept a group of training and test data sets defined by splitting X1,...,XN. For details, see the fun argument and CV name-value argument.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: KeepIn=[1 0 0 0],KeepOut=[0 0 0 1] always includes the first feature and excludes the last feature.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: "KeepIn",[1 0 0 0],"KeepOut",[0 0 0 1]

`CV` — Cross-validation option
10 (default) | positive integer | `cvpartition` object | `"resubstitution"` | `"none"`

Cross-validation option to compute the criterion for each candidate feature subset, specified as a positive integer, cvpartition object, "resubstitution", or "none".

For each candidate feature subset, sequentialfs uses the partition specified by this argument to cross-validate the criterion value returned by the fun function.

Positive integer k — sequentialfs uses a random nonstratified partition for k-fold cross-validation.
cvpartition object — sequentialfs uses a partition specified in the cvpartition object. You can specify a stratified partition, a partition for holdout validation, or a partition for leave-one-out cross-validation. For details, see cvpartition.
"resubstitution" — sequentialfs does not partition the input data. Both the training set and the test set contain all of the original observations. For example, if you specify X and y, then sequentialfs calls fun as criterion = fun(X,y,X,y).
"none" — sequentialfs does not validate the criterion value and calls fun as criterion = fun(X,y), without separating the training and test sets.

Example: "CV","none"
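The following sketch shows three of the CV settings side by side. The criterion function and the use of the fisheriris data are assumptions made for illustration. The "none" setting is omitted because it requires a function of the form fun(X,y).

```matlab
% Sketch: different cross-validation settings for the same criterion.
load fisheriris
rng("default")                         % For reproducibility
fun = @(XTrain,yTrain,XTest,yTest) ...
    sum(~strcmp(yTest,predict(fitcdiscr(XTrain,yTrain),XTest)));

% Positive integer: random nonstratified 5-fold partition.
tfKFold = sequentialfs(fun,meas,species,"CV",5);

% cvpartition object: stratified holdout with 30% of the data for testing.
cvHold = cvpartition(species,"HoldOut",0.3);
tfHold = sequentialfs(fun,meas,species,"CV",cvHold);

% "resubstitution": training and test sets both contain all observations.
tfResub = sequentialfs(fun,meas,species,"CV","resubstitution");
```

The three calls can select different feature subsets, because each one estimates the criterion from a different split of the data.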

`MCReps` — Number of Monte Carlo repetitions for cross-validation
`1` (default) | positive integer

Number of Monte Carlo repetitions for cross-validation, specified as a positive integer.

If you specify a positive integer greater than 1, sequentialfs repeats the cross-validation computation for the specified number of repetitions for each candidate feature subset.

If CV is "none", "resubstitution", a cvpartition object of type "resubstitution", a cvpartition object of type "leaveout", or a custom cvpartition object (with the IsCustom property set to 1), then the software sets the MCReps value to 1.

Example: "MCReps",10

Data Types: single | double

`Direction` — Direction of sequential search
`"forward"` (default) | `"backward"`

Direction of the sequential search, specified as "forward" or "backward".

"forward" — The initial feature set includes no features, and the sequentialfs function sequentially adds features to the set.
"backward" — The initial feature set includes all features, and the sequentialfs function sequentially removes features from the set. That is, the sequentialfs function performs recursive feature elimination (RFE).

Example: "Direction","backward"

Data Types: char | string

`KeepIn` — Features to include
`[]` (default) | logical vector | vector of positive integers

Features to include, specified as [], a logical vector, or a vector of positive integers.

By default, sequentialfs examines all features for the feature selection process. If you specify features to include using this argument, sequentialfs always includes the features in the candidate feature sets. A true entry in a logical vector or an index value in a vector of positive integers indicates that the output argument tf must include the corresponding feature.

Example: "KeepIn",[1 0 0 0]

Data Types: logical

`KeepOut` — Features to exclude
`[]` (default) | logical vector | vector of positive integers

Features to exclude, specified as [], a logical vector, or a vector of positive integers.

By default, sequentialfs examines all features for the feature selection process. If you specify features to exclude using this argument, sequentialfs excludes the features from the candidate feature sets. A true entry in a logical vector or an index value in a vector of positive integers indicates that the output argument tf must exclude the corresponding feature.

Example: "KeepOut",[0 0 0 1]

Data Types: logical

`NFeatures` — Number of features to select
`[]` (default) | positive integer

Number of features to select, specified as [] or a positive integer.

By default, sequentialfs stops iterations when the function satisfies one of the stopping criteria (MaxIter or TolFun) specified by the Options name-value argument. If you specify the NFeatures name-value argument as a positive integer, sequentialfs stops iterations after selecting the specified number of features. This argument overrides other iteration options.

Example: "NFeatures",2

Data Types: single | double
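The KeepIn, KeepOut, and NFeatures arguments can be combined. The sketch below is an illustration under assumed data and an assumed criterion function: it forces the first feature in, keeps the last feature out, and stops when the set contains two features.

```matlab
% Sketch: constrain the search with KeepIn, KeepOut, and NFeatures.
load fisheriris
rng("default")                         % For reproducibility
fun = @(XTrain,yTrain,XTest,yTest) ...
    sum(~strcmp(yTest,predict(fitcdiscr(XTrain,yTrain),XTest)));

tf = sequentialfs(fun,meas,species, ...
    "KeepIn",[true false false false], ...   % Always include feature 1
    "KeepOut",[false false false true], ...  % Never include feature 4
    "NFeatures",2)                            % Stop at two features
```

With these settings, the only open decision is which of features 2 and 3 joins feature 1 in the selected set.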

`NullModel` — Flag to include null model
`false` or `0` (default) | `true` or `1`

Flag to include the null model (model containing no features), specified as a logical 1 (true) or 0 (false).

If you specify true, the sequentialfs function includes the null model as a valid option for the output tf and computes the criterion value for the empty input data. Therefore, the fun function must be able to accept empty matrices as input argument values.

Example: "NullModel",true

Data Types: logical
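One way to make fun tolerate empty feature matrices is to fit a least-squares model with an explicit intercept column, so that the null model reduces to predicting the mean response. This is a sketch under assumptions: the synthetic data and the helper handles addIntercept and sseFun are hypothetical names.

```matlab
% Sketch: a criterion function that also works when no features are passed.
rng("default")                         % For reproducibility
X = randn(80,4);
y = 3*X(:,1) + 0.2*randn(80,1);

% Prepend a column of ones. When A is empty, the result is the
% intercept column alone.
addIntercept = @(A,n) [ones(n,1) A];

% Sum of squared errors of a least-squares fit with an intercept.
sseFun = @(XTrain,yTrain,XTest,yTest) sum((yTest - ...
    addIntercept(XTest,numel(yTest)) * ...
    (addIntercept(XTrain,numel(yTrain)) \ yTrain)).^2);

[tf,history] = sequentialfs(sseFun,X,y,"NullModel",true);
```

Model-fitting functions that reject empty predictor data would need an explicit check for an empty input instead, as the mycorr helper function does in the Filter Type Feature Selection example.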

`Options` — Options for iterations and parallel computation
`statset("sequentialfs")` (default) | structure returned by `statset`

Options for the iterations and parallel computation, specified as a structure returned by statset.

This table lists the option fields and their values.

Field Name	Field Value	Default Value
`Display`	Level of display, specified as `"off"`, `"final"`, or `"iter"`. `"off"` — Display no information. `"final"` — Display the final information. `"iter"` — Display information at each iteration.	`"off"`
`MaxIter`	Maximum number of iterations allowed, specified as a positive integer	`Inf`
`TolFun`	Termination tolerance on the criterion value, specified as a positive scalar	`1e-6` if `Direction` is `"forward"`; `0` if `Direction` is `"backward"`
`TolTypeFun`	Type of the termination tolerance for the criterion value, specified as `"abs"` (absolute tolerance) or `"rel"` (relative tolerance)	`"rel"`
`UseParallel`	Flag to run in parallel, specified as logical `1` (`true`) or `0` (`false`)	`false`
`UseSubstreams`	Flag to run computations in a reproducible manner, specified as logical `1` (`true`) or `0` (`false`). To compute reproducibly, set `Streams` to a type that allows substreams: `"mlfg6331_64"` or `"mrg32k3a"`.	`false`
`Streams`	Random number streams, specified as a `RandStream` object or cell array of such objects. Use a single object except when the `UseParallel` value is `true` and the `UseSubstreams` value is `false`. In that case, use a cell array that has the same size as the parallel pool.	MATLAB® default random number stream

To compute in parallel, you need Parallel Computing Toolbox™.

Example: "Options",statset("Display","iter")

Data Types: struct
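As a sketch of how the fields in the table fit together, the following code starts from the default options for sequentialfs and changes the stopping criteria. The specific values are arbitrary and chosen only for illustration.

```matlab
% Sketch: customize the stopping criteria and display level.
load fisheriris
rng("default")                         % For reproducibility
fun = @(XTrain,yTrain,XTest,yTest) ...
    sum(~strcmp(yTest,predict(fitcdiscr(XTrain,yTrain),XTest)));

opts = statset("sequentialfs");        % Default options for sequentialfs
opts = statset(opts, ...
    "TolFun",1e-2, ...                 % Require an improvement larger than 0.01
    "TolTypeFun","abs", ...            % Interpret TolFun as an absolute tolerance
    "MaxIter",3, ...                   % Allow at most three iterations
    "Display","final");                % Display only the final result

tf = sequentialfs(fun,meas,species,"Options",opts);
```

A larger TolFun value stops the search earlier and therefore tends to produce smaller feature sets.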

Output Arguments

`tf` — Selected features
logical vector

Selected features, returned as a logical vector. A true (logical 1) entry indicates that the corresponding feature is selected.

`history` — History of feature selection process
structure

History of the feature selection process, returned as a structure array including the In and Crit fields.

In is a logical matrix in which row i indicates the features selected at iteration i.
Crit is a vector containing the criterion values computed at each iteration.
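For a forward search with default settings, the order in which features enter the set can be recovered by comparing consecutive rows of In. The sketch below assumes that each row differs from the previous row by exactly one feature, which holds when KeepIn and NullModel are left at their defaults. The variable names are illustrative.

```matlab
% Sketch: recover the order in which a forward search added features.
load fisheriris
rng("default")                         % For reproducibility
fun = @(XTrain,yTrain,XTest,yTest) ...
    sum(~strcmp(yTest,predict(fitcdiscr(XTrain,yTrain),XTest)));
[tf,history] = sequentialfs(fun,meas,species);

nSteps = size(history.In,1);
added = zeros(1,nSteps);
prev = false(1,size(meas,2));          % The search starts from the empty set
for i = 1:nSteps
    added(i) = find(history.In(i,:) & ~prev);  % Feature that is new in row i
    prev = history.In(i,:);
end
added                                  % Feature indices in order of addition
history.Crit                           % Criterion value after each addition
```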

More About

Feature Selection

Feature selection reduces the dimensionality of data by selecting only a subset of measured features (predictor variables) to create a model. Feature selection algorithms search for a subset of predictors that optimally models measured responses, subject to constraints such as required or excluded features and the size of the subset.

You can categorize feature selection algorithms into three types:

Filter type — The filter type feature selection algorithm measures feature importance based on the characteristics of the features, such as feature variance and feature relevance to the response. You select important features as part of a data preprocessing step and then train a model using the selected features. Therefore, filter type feature selection is independent of the training algorithm.
Wrapper type — The wrapper type feature selection algorithm starts training using a subset of features and then adds or removes a feature using a selection criterion. The selection criterion directly measures the change in model performance that results from adding or removing a feature. The algorithm repeats training and improving a model until its stopping criteria are satisfied.
Embedded type — The embedded type feature selection algorithm learns feature importance as part of the model learning process. Once you train a model, you obtain the importance of the features in the trained model. This type of algorithm selects features that work well with a particular learning process.

For more details, see Introduction to Feature Selection.

Algorithms

sequentialfs sequentially selects features in X by performing these steps:

1. Define a random nonstratified partition for 10-fold cross-validation on n observations, where n is the number of observations in X.
2. Initialize the selected feature set S as an empty set.
3. For each feature x_i in X, compute the cross-validated criterion value using the fun function.
4. Add the feature with the smallest criterion value to S.
5. For each feature x_i in X\S, define a candidate feature set C_i as S∪{x_i}. Compute the cross-validated criterion value using fun for C_i.
6. Among the candidate sets C_i, select the set that reduces the criterion value the most, compared to the criterion value for S. Add the feature corresponding to the selected candidate set to S.
7. Repeat steps 5 and 6 until adding a feature does not decrease the criterion value by more than the termination tolerance value.
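The steps above can be sketched as a short greedy loop. This is a simplified illustration, not the source code of sequentialfs: the relative tolerance test, the variable names, and the choice of data and classifier are assumptions, and the exact stopping rule in the function can differ.

```matlab
% Sketch: greedy forward selection with a cross-validated criterion.
load fisheriris
rng("default")                         % For reproducibility
X = meas;
y = species;
cv = cvpartition(size(X,1),"KFold",10);   % Nonstratified 10-fold partition
fun = @(XTrain,yTrain,XTest,yTest) ...
    sum(~strcmp(yTest,predict(fitcdiscr(XTrain,yTrain),XTest)));
tol = 1e-6;                            % Assumed relative tolerance

S = false(1,size(X,2));                % Selected feature set, initially empty
bestCrit = Inf;
while ~all(S)
    candidates = find(~S);
    crit = zeros(size(candidates));
    for j = 1:numel(candidates)
        C = S;
        C(candidates(j)) = true;       % Candidate set: S plus one feature
        total = 0;
        for k = 1:cv.NumTestSets
            tr = training(cv,k);
            te = test(cv,k);
            total = total + fun(X(tr,C),y(tr),X(te,C),y(te));
        end
        crit(j) = total/sum(cv.TestSize);   % Cross-validated criterion
    end
    [newCrit,jBest] = min(crit);
    % Stop when the best candidate does not improve the criterion by
    % more than the tolerance. The first feature is always added.
    if isfinite(bestCrit) && bestCrit - newCrit <= tol*abs(bestCrit)
        break
    end
    S(candidates(jBest)) = true;
    bestCrit = newCrit;
end
S
```

The loop evaluates every remaining feature at each iteration, so the cost grows with the number of features, the number of folds, and the cost of one call to fun.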

To customize the feature selection process, use the name-value arguments of sequentialfs.

You can specify cross-validation options by using the CV and MCReps name-value arguments.
- For wrapper type feature selection, specify the arguments to cross-validate the criterion value for each candidate feature set. You can define the fun function to train a model and return a criterion value for the trained model. For an example, see Forward Feature Selection.
- For filter type feature selection, which does not involve cross-validation, specify CV as "none" and use the fun function to measure characteristics of the input data, such as correlation. For an example, see Filter Type Feature Selection.
To perform backward feature selection, or recursive feature elimination (RFE), specify the Direction name-value argument as "backward". sequentialfs initializes the selected feature set S as a set with all features, and then removes one feature at a time from the set.
You can specify which features to always include or exclude, the number of features in the final selected feature set, and whether to consider a model with no features as a valid option. For details, see the KeepIn, KeepOut, NFeatures, and NullModel name-value arguments.
Use the Options name-value argument to specify options for the iterations and parallel computation. For example, "Options",statset("TolFun",1e-2) sets the iteration termination tolerance on the criterion value to 1e-2.

Extended Capabilities

Automatic Parallel Support
Accelerate code by automatically running computation in parallel using Parallel Computing Toolbox™.

To run in parallel, specify the Options name-value argument in the call to this function and set the UseParallel field of the options structure to true using statset:

Options=statset(UseParallel=true)

For more information about parallel computing, see Run MATLAB Functions with Automatic Parallel Support (Parallel Computing Toolbox).
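A parallel run can also be made reproducible by combining UseParallel with UseSubstreams and a stream type that supports substreams. The following sketch assumes that Parallel Computing Toolbox is installed. The criterion function and data are chosen only for illustration.

```matlab
% Sketch: reproducible parallel feature selection.
% Assumes Parallel Computing Toolbox is available.
load fisheriris
fun = @(XTrain,yTrain,XTest,yTest) ...
    sum(~strcmp(yTest,predict(fitcdiscr(XTrain,yTrain),XTest)));

s = RandStream("mlfg6331_64");         % Stream type that allows substreams
opts = statset("UseParallel",true, ...
    "UseSubstreams",true,"Streams",s);

tf = sequentialfs(fun,meas,species,"Options",opts);
```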

Version History

Introduced in R2008a

R2026a: Compute the criterion value for each candidate feature set in parallel (requires Parallel Computing Toolbox)

Compute the criterion value for each candidate feature set in parallel when no cross-validation is performed. In previous releases, sequentialfs ran only the cross-validation computations in parallel. That is, the function ran computations in parallel only when both of the following were true:

The UseParallel field of the options structure (Options) is set to true.
The cross-validation option (CV) uses more than one test set, or the number of Monte Carlo repetitions (MCReps) is greater than 1.

Starting with this release, provided that UseParallel=true in the options structure, the function computes the criterion value for each candidate feature set in parallel when no cross-validation is requested or when cross-validation is requested with one Monte Carlo repetition.
