ExhaustiveSearcher
Create exhaustive nearest neighbor searcher
Description
ExhaustiveSearcher
model objects store the training data,
distance metric, and parameter values of the distance metric for an exhaustive nearest
neighbor search. The exhaustive search algorithm finds the distance from each query
observation to all n observations in the training data, which is an
n-by-K numeric matrix.
Once you create an ExhaustiveSearcher
model object, find
neighboring points in the training data to the query data by performing a nearest
neighbor search using knnsearch
or a radius search using
rangesearch
. The exhaustive search
algorithm is more efficient than the Kd-tree algorithm when
K is large (that is, K > 10), and it is more
flexible than the Kd-tree algorithm with respect to distance metric
choices. The ExhaustiveSearcher
model object also supports sparse
data.
Creation
Use either the createns
function or the
ExhaustiveSearcher
function (described here) to create an
ExhaustiveSearcher
object. Both functions use the same syntax
except that the createns
function has the 'NSMethod'
name-value pair
argument, which you use to choose the nearest neighbor search method. The
createns
function also creates a KDTreeSearcher
object. Specify 'NSMethod','exhaustive'
to create an ExhaustiveSearcher
object. The default is
'exhaustive'
if K > 10, the training data is
sparse, or the distance metric is not the Euclidean, city block, Chebychev, or
Minkowski.
Description
creates an exhaustive nearest neighbor searcher object (Mdl
= ExhaustiveSearcher(X
)Mdl
)
using the n-by-K numeric matrix of
training data (X
).
specifies additional options using one or more name-value pair arguments. You
can specify the distance metric and set the distance metric parameter (Mdl
= ExhaustiveSearcher(X
,Name,Value
)DistParameter
)
property. For example,
ExhaustiveSearcher(X,'Distance','chebychev')
creates an
exhaustive nearest neighbor searcher object that uses the Chebychev distance. To
specify DistParameter
, use the Cov
,
P
, or Scale
name-value pair
argument.
Input Arguments
X
— Training data
numeric matrix
Training data that prepares the exhaustive searcher algorithm,
specified as a numeric matrix. X
has
n rows, each corresponding to an observation
(that is, an instance or example), and K columns,
each corresponding to a predictor (that is, a feature).
Data Types: single
| double
Specify optional pairs of arguments as
Name1=Value1,...,NameN=ValueN
, where Name
is
the argument name and Value
is the corresponding value.
Name-value arguments must appear after other arguments, but the order of the
pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose
Name
in quotes.
Example: 'Distance','mahalanobis','Cov',eye(3)
specifies to
use the Mahalanobis distance when searching for nearest neighbors and a 3-by-3
identity matrix for the covariance matrix in the Mahalanobis distance
metric.
Distance
— Distance metric
'euclidean'
(default) | character vector | string scalar | custom distance function
Distance metric used when you call knnsearch
or rangesearch
to find
nearest neighbors for future query points, specified as the
comma-separated pair consisting of 'Distance'
and
a character vector, string scalar, or function handle.
This table describes the supported distance metrics specified as character vectors or string scalars.
Value | Description |
---|---|
'chebychev' | Chebychev distance (maximum coordinate difference). |
'cityblock' | City block distance. |
'correlation' | One minus the sample linear correlation between observations (treated as sequences of values). |
'cosine' | One minus the cosine of the included angle between observations (treated as row vectors). |
'euclidean' | Euclidean distance. |
'hamming' | Hamming distance, which is the percentage of coordinates that differ. |
'jaccard' | One minus the Jaccard coefficient, which is the percentage of nonzero coordinates that differ. |
'minkowski' | Minkowski distance. The default exponent is 2. To
specify a different exponent, use the
'P' name-value pair
argument. |
'mahalanobis' | Mahalanobis distance, computed using a positive
definite covariance matrix. To change the value of
the covariance matrix, use the
'Cov' name-value pair
argument. |
'seuclidean' | Standardized Euclidean distance. Each coordinate
difference between rows in Mdl.X
and the query matrix is scaled by dividing by the
corresponding element of the standard deviation
computed from Mdl.X . To specify
another scaling, use the
'Scale' name-value pair
argument. |
'spearman' | One minus the sample Spearman's rank correlation between observations (treated as sequences of values). |
For more details, see Distance Metrics.
You can specify a function handle for a custom distance metric by
using @
(for example,
@distfun
). A custom distance function must:
Have the form
function D2 = distfun(ZI, ZJ)
.Take as arguments:
A 1-by-K vector
ZI
containing a single row fromX
or from the query pointsY
, where K is the number of columns inX
.An m-by-K matrix
ZJ
containing multiple rows ofX
orY
, where m is a positive integer.
Return an m-by-1 vector of distance
D2
, whereD2(
is the distance between the observationsj
)ZI
andZJ(
.j
,:)
The software does not use the distance metric for creating an
ExhaustiveSearcher
model object, so you can
alter the distance metric by using dot notation after creating the
object.
Example: 'Distance','mahalanobis'
Cov
— Covariance matrix for Mahalanobis distance metric
cov(X,'omitrows')
(default) | positive definite matrix
Covariance matrix for the Mahalanobis distance metric, specified as the comma-separated pair
consisting of 'Cov'
and a
K-by-K positive definite matrix, where
K is the number of columns in X
. This
argument is valid only if 'Distance'
is
'mahalanobis'
.
Example: 'Cov',eye(3)
Data Types: single
| double
P
— Exponent for Minkowski distance metric
2
(default) | positive scalar
Exponent for the Minkowski distance metric, specified as the comma-separated pair consisting
of 'P'
and a positive scalar. This argument is valid only if
'Distance'
is 'minkowski'
.
Example: 'P',3
Data Types: single
| double
Scale
— Scale parameter value for standardized Euclidean distance metric
std(X,'omitnan')
(default) | nonnegative numeric vector
Scale parameter value for the standardized Euclidean distance metric, specified as the
comma-separated pair consisting of 'Scale'
and a nonnegative numeric
vector of length K, where K is the number of
columns in X
. The software scales each difference between the
training and query data using the corresponding element of Scale
.
This argument is valid only if 'Distance'
is
'seuclidean'
.
Example: 'Scale',quantile(X,0.75) - quantile(X,0.25)
Data Types: single
| double
Properties
X
— Training data
numeric matrix
This property is read-only.
Training data that prepares the exhaustive searcher algorithm, specified
as a numeric matrix. X
has n rows,
each corresponding to an observation (that is, an instance or example), and
K columns, each corresponding to a predictor (that
is, a feature).
The input argument X
of createns
or
ExhaustiveSearcher
sets this property.
Data Types: single
| double
Distance
— Distance metric
character vector | string scalar | custom distance function
Distance metric used when you call knnsearch
or rangesearch
to find nearest
neighbors for future query points, specified as a character vector or string
scalar ('chebychev'
, 'cityblock'
,
'correlation'
, 'cosine'
,
'euclidean'
, 'hamming'
,
'jaccard'
, 'minkowski'
,
'mahalanobis'
, 'seuclidean'
, or
'spearman'
), or a function handle.
The 'Distance'
name-value pair argument of createns
or
ExhaustiveSearcher
sets this property.
The software does not use the distance metric for creating an
ExhaustiveSearcher
model object, so you can alter it
by using dot notation.
DistParameter
— Distance metric parameter values
[]
| positive scalar
Distance metric parameter values, specified as empty
([]
) or a positive scalar.
This table describes the distance parameters of the supported distance metrics.
Distance Metric | Parameter Description |
---|---|
'mahalanobis' | A positive definite matrix representing the
covariance matrix used for computing the Mahalanobis
distance. By default, the software sets the covariance
using
The
You can alter
|
'minkowski' | A positive scalar indicating the exponent of the
Minkowski distance. By default, the exponent is
The
You can alter
|
'seuclidean' | A positive numeric vector indicating the values used by the software to scale the predictors when computing the standardized Euclidean distance. By default, the software:
The
You can alter
|
If Mdl.Distance
is not one of the parameters listed
in this table, then Mdl.DistParameter
is
[]
, which means that the specified distance metric
formula has no parameters.
Data Types: single
| double
Object Functions
knnsearch | Find k-nearest neighbors using searcher object |
rangesearch | Find all neighbors within specified distance using searcher object |
Examples
Train Default Exhaustive Nearest Neighbor Searcher
Load Fisher's iris data set.
load fisheriris
X = meas;
[n,k] = size(X)
n = 150
k = 4
X
has 150 observations and 4 predictors.
Prepare an exhaustive nearest neighbor searcher using the entire data set as training data.
Mdl1 = ExhaustiveSearcher(X)
Mdl1 = ExhaustiveSearcher with properties: Distance: 'euclidean' DistParameter: [] X: [150x4 double]
Mdl1
is an ExhaustiveSearcher
model object, and its properties appear in the Command Window. The object contains information about the trained algorithm, such as the distance metric. You can alter property values using dot notation.
Alternatively, you can prepare an exhaustive nearest neighbor searcher by using createns
and specifying 'exhaustive'
as the search method.
Mdl2 = createns(X,'NSMethod','exhaustive')
Mdl2 = ExhaustiveSearcher with properties: Distance: 'euclidean' DistParameter: [] X: [150x4 double]
Mdl2
is also an ExhaustiveSearcher
model object, and it is equivalent to Mdl1
.
To search X
for the nearest neighbors to a batch of query data, pass the ExhaustiveSearcher
model object and the query data to knnsearch
or rangesearch
.
Specify the Mahalanobis Distance for Nearest Neighbor Search
Load Fisher's iris data set. Focus on the petal dimensions.
load fisheriris X = meas(:,[3 4]); % Predictors
Prepare an exhaustive nearest neighbor searcher. Specify the Mahalanobis distance metric.
Mdl = createns(X,'Distance','mahalanobis')
Mdl = ExhaustiveSearcher with properties: Distance: 'mahalanobis' DistParameter: [2x2 double] X: [150x2 double]
Because the distance metric is Mahalanobis, createns
creates an ExhaustiveSearcher
model object by default.
Access properties of Mdl
by using dot notation. For example, use Mdl.DistParameter
to access the Mahalanobis covariance parameter.
Mdl.DistParameter
ans = 2×2
3.1163 1.2956
1.2956 0.5810
You can pass query data and Mdl
to:
knnsearch
to find indices and distances of nearest neighborsrangesearch
to find indices of all nearest neighbors within a distance that you specify
Alter Properties of ExhaustiveSearcher
Model
Create an ExhaustiveSearcher
model object and alter the Distance
property by using dot notation.
Load Fisher's iris data set.
load fisheriris
X = meas;
Train a default exhaustive searcher algorithm using the entire data set as training data.
Mdl = ExhaustiveSearcher(X)
Mdl = ExhaustiveSearcher with properties: Distance: 'euclidean' DistParameter: [] X: [150x4 double]
Specify that the neighbor searcher use the Mahalanobis metric to compute the distances between the training and query data.
Mdl.Distance = 'mahalanobis'
Mdl = ExhaustiveSearcher with properties: Distance: 'mahalanobis' DistParameter: [4x4 double] X: [150x4 double]
You can pass Mdl
and the query data to either knnsearch
or rangesearch
to find the nearest neighbors to the points in the query data based on the Mahalanobis distance.
Search for Nearest Neighbors of Query Data Using Mahalanobis Distance
Create an exhaustive searcher object by using the createns
function. Pass the object and query data to the knnsearch
function to find k-nearest neighbors.
Load Fisher's iris data set.
load fisheriris
Remove five irises randomly from the predictor data to use as a query set.
rng('default'); % For reproducibility n = size(meas,1); % Sample size qIdx = randsample(n,5); % Indices of query data X = meas(~ismember(1:n,qIdx),:); Y = meas(qIdx,:);
Prepare an exhaustive nearest neighbor searcher using the training data. Specify the Mahalanobis distance for finding nearest neighbors.
Mdl = createns(X,'Distance','mahalanobis')
Mdl = ExhaustiveSearcher with properties: Distance: 'mahalanobis' DistParameter: [4x4 double] X: [145x4 double]
Because the distance metric is Mahalanobis, createns
creates an ExhaustiveSearcher
model object by default.
The software uses the covariance matrix of the predictors (columns) in the training data for computing the Mahalanobis distance. To display this value, use Mdl.DistParameter
.
Mdl.DistParameter
ans = 4×4
0.6547 -0.0368 1.2320 0.5026
-0.0368 0.1914 -0.3227 -0.1193
1.2320 -0.3227 3.0671 1.2842
0.5026 -0.1193 1.2842 0.5800
Find the indices of the training data (Mdl.X
) that are the two nearest neighbors of each point in the query data (Y
).
IdxNN = knnsearch(Mdl,Y,'K',2)
IdxNN = 5×2
5 6
98 95
104 128
135 65
102 115
Each row of IdxNN
corresponds to a query data observation. The column order corresponds to the order of the nearest neighbors with respect to ascending distance. For example, based on the Mahalanobis metric, the second nearest neighbor of Y(3,:)
is X(128,:)
.
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
The
knnsearch
andrangesearch
functions support code generation.When you train an
ExhaustiveSearcher
model object, the value of the'Distance'
name-value pair argument cannot be a custom distance function.
For more information, see Introduction to Code Generation and Code Generation for Nearest Neighbor Searcher.
Version History
Introduced in R2010a
Beispiel öffnen
Sie haben eine geänderte Version dieses Beispiels. Möchten Sie dieses Beispiel mit Ihren Änderungen öffnen?
MATLAB-Befehl
Sie haben auf einen Link geklickt, der diesem MATLAB-Befehl entspricht:
Führen Sie den Befehl durch Eingabe in das MATLAB-Befehlsfenster aus. Webbrowser unterstützen keine MATLAB-Befehle.
Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list:
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)