checkGradients
Check first derivative function against finite-difference approximation
Since R2023b
Syntax
Description
compares the value of the supplied first derivative function in valid
= checkGradients(fun
,x0
)fun
at a
point near x0
against a finite-difference approximation. By default, the
comparison assumes that the function is an objective function. To check constraint
functions, set the IsConstraint
name-value argument to
true
.
specifies additional options using one or more name-value arguments, in addition to any of
the input argument combinations in the previous syntaxes. For example, you can set the
tolerance for the comparison, or specify that the comparison is for nonlinear constraint
functions.valid
= checkGradients(___,Name=Value
)
Examples
Check Objective Gradient
The rosen
function at the end of this example computes the Rosenbrock objective function and its gradient for a 2-D variable x
.
Check that the computed gradient in rosen
matches a finite-difference approximation near the point [2,4].
x0 = [2,4]; valid = checkGradients(@rosen,x0)
valid = logical
1
function [f,g] = rosen(x) f = 100*(x(1) - x(2)^2)^2 + (1 - x(2))^2; if nargout > 1 g(1) = 200*(x(1) - x(2)^2); g(2) = -400*x(2)*(x(1) - x(2)^2) - 2*(1 - x(2)); end end
Check Least-Squares Objective Gradient
The vecrosen
function at the end of this example computes the Rosenbrock objective function in least-squares form and its Jacobian (gradient).
Check that the computed gradient in vecrosen
matches a finite-difference approximation near the point [2,4].
x0 = [2,4]; valid = checkGradients(@vecrosen,x0)
valid = logical
1
function [f,g] = vecrosen(x) f = [10*(x(1) - x(2)^2),1-x(1)]; if nargout > 1 g = zeros(2); % Allocate g g(1,1) = 10; % df(1)/dx(1) g(1,2) = -20*x(2); % df(1)/dx(2) g(2,1) = -1; % df(2)/dx(1) g(2,2) = 0; % df(2)/dx(2) end end
Modify Finite-Difference Options
The rosen
function at the end of this example computes the Rosenbrock objective function and its gradient for a 2-D variable x
.
For some initial points, the default forward finite differences cause checkGradients
to mistakenly indicate that the rosen
function has incorrect gradients. To see result details, set the Display
option to "on"
.
x0 = [0,0];
valid = checkGradients(@rosen,x0,Display="on")
____________________________________________________________ Objective function derivatives: Maximum relative difference between supplied and finite-difference derivatives = 1.48826e-06. Supplied derivative element (1,1): -0.126021 Finite-difference derivative element (1,1): -0.126023 checkGradients failed. Supplied derivative and finite-difference approximation are not within 'Tolerance' (1e-06). ____________________________________________________________
valid = logical
0
checkGradients
reports a mismatch, with a difference of just over 1 in the sixth decimal place. Use central finite differences and check again.
opts = optimoptions("fmincon",FiniteDifferenceType="central"); valid = checkGradients(@rosen,x0,opts,Display="on")
____________________________________________________________ Objective function derivatives: Maximum relative difference between supplied and finite-difference derivatives = 1.29339e-11. checkGradients successfully passed. ____________________________________________________________
valid = logical
1
Central finite differences are generally more accurate. checkGradients
reports that the gradient and central finite-difference approximation match to about 11 decimal places.
function [f,g] = rosen(x) f = 100*(x(1) - x(2)^2)^2 + (1 - x(2))^2; if nargout > 1 g(1) = 200*(x(1) - x(2)^2); g(2) = -400*x(2)*(x(1) - x(2)^2) - 2*(1 - x(2)); end end
Check Gradients of Nonlinear Constraints
The tiltellipse
function at the end of this example imposes the constraint that the 2-D variable x
is confined to the interior of the tilted ellipse
.
Visualize the ellipse.
f = @(x,y) x.*y/2+(x+2).^2+(y-2).^2/2-2; fcontour(f,LevelList=0) axis([-6 0 -1 7])
Check the gradient of this nonlinear inequality constraint function.
x0 = [-2,6]; valid = checkGradients(@tiltellipse,x0,IsConstraint=true)
valid = 1x2 logical array
1 1
function [c,ceq,gc,gceq] = tiltellipse(x) c = x(1)*x(2)/2 + (x(1) + 2)^2 + (x(2)- 2)^2/2 - 2; ceq = []; if nargout > 2 gc = [x(2)/2 + 2*(x(1) + 2); x(1)/2 + x(2) - 2]; gceq = []; end end
Examine Differences Between Gradients and Finite-Difference Approximations
The fungrad
function at the end of this example correctly calculates the gradient of some components of the least-squares objective, and incorrectly calculates others.
Examine the second output of checkGradients
to see which components do not match well at the point [2,4]. To see result details, set the Display
option to "on"
.
x0 = [2,4];
[valid,err] = checkGradients(@fungrad,x0,Display="on")
____________________________________________________________ Objective function derivatives: Maximum relative difference between supplied and finite-difference derivatives = 0.749797. Supplied derivative element (3,2): 19.9838 Finite-difference derivative element (3,2): 5 checkGradients failed. Supplied derivative and finite-difference approximation are not within 'Tolerance' (1e-06). ____________________________________________________________
valid = logical
0
err = struct with fields:
Objective: [3x2 double]
The output shows that element [3,2] is incorrect. But is that the only problem? Examine err.Objective
and look for entries that are far from 0.
err.Objective
ans = 3×2
0.0000 0.0000
0.0000 0
0.5000 0.7498
Both the [3,1] and [3,2] elements of the derivative are incorrect. The fungrad2
function at the end of this example corrects the errors.
[valid,err] = checkGradients(@fungrad2,x0,Display="on")
____________________________________________________________ Objective function derivatives: Maximum relative difference between supplied and finite-difference derivatives = 2.2338e-08. checkGradients successfully passed. ____________________________________________________________
valid = logical
1
err = struct with fields:
Objective: [3x2 double]
err.Objective
ans = 3×2
10-7 ×
0.2234 0.0509
0.0003 0
0.0981 0.0042
All the differences between the gradient and finite-difference approximations are less than 1e-7 in magnitude.
This code creates the fungrad
helper function.
function [f,g] = fungrad(x) f = [10*(x(1) - x(2)^2),1 - x(1),5*(x(2) - x(1)^2)]; if nargout > 1 g = zeros(3,2); g(1,1) = 10; g(1,2) = -20*x(2); g(2,1) = -1; g(3,1) = -20*x(1); g(3,2) = 5*x(2); end end
This code creates the fungrad2
helper function.
function [f,g] = fungrad2(x) f = [10*(x(1) - x(2)^2),1 - x(1),5*(x(2) - x(1)^2)]; if nargout > 1 g = zeros(3,2); g(1,1) = 10; g(1,2) = -20*x(2); g(2,1) = -1; g(3,1) = -10*x(1); g(3,2) = 5; end end
Input Arguments
fun
— Function to check
function handle
Function to check, specified as a function handle.
If
fun
represents an objective function, thenfun
must have the following signature.[fval,grad] = fun(x)
checkGradients
compares the value ofgrad(x)
to a finite-difference approximation for a pointx
nearx0
. The comparison iswhere
grad
represents the value of the gradient function, andgrad_fd
represents the value of the finite-difference approximation.checkGradients
performs this division component-wise.If
fun
represents a least-squares objective, thenfun
is a vector, andgrad(x)
is a matrix representing the Jacobian offun
.If
fun
returns an array ofm
components andx
hasn
elements, wheren
is the number of elements ofx0
, the JacobianJ
is anm
-by-n
matrix whereJ(i,j)
is the partial derivative ofF(i)
with respect tox(j)
. (The JacobianJ
is the transpose of the gradient ofF
.)If
fun
represents a nonlinear constraint, thenfun
must have the following signature.[c,ceq,gc,gceq] = fun(x)
c
represents the nonlinear inequality constraints. Solvers attempt to achievec <= 0
. Thec
output can be a vector of any length.ceq
represents the nonlinear equality constraints. Solvers attempt to achieveceq = 0
. Theceq
output can be a vector of any length.gc
represents the gradient of the nonlinear inequality constraints.gceq
represents the gradient of the nonlinear equality constraints.
Data Types: function_handle
x0
— Location to check gradient
double array | 1-by-2 cell array
Location at which to check the gradient, specified as a double array for all solvers
except lsqcurvefit
. For lsqcurvefit
,
x0
is a 1-by-2 cell array
{x0array,xdata}
.
checkGradients
checks the gradient at a point near the specified
x0
. The function adds a small random direction to
x0
, no more than 1e-3
in absolute value. This
perturbation attempts to protect the check against a point where an incorrect gradient
function might pass because of cancellations.
Example: randn(5,1)
Data Types: double
Complex Number Support: Yes
options
— Finite differencing options
output of optimoptions
Finite differencing options, specified as the output of
optimoptions
. The following options affect finite
differencing.
Option | Description |
---|---|
FiniteDifferenceStepSize |
Scalar or vector step size factor for finite differences. When
you set
sign′(x) = sign(x) except sign′(0) = 1 .
Central finite differences are
FiniteDifferenceStepSize expands to a vector. The default
is sqrt(eps) for forward finite differences, and eps^(1/3)
for central finite differences.
|
FiniteDifferenceType | Finite differences used to estimate gradients
are either |
TypicalX | Typical |
DiffMaxChange (discouraged) | Maximum change in variables for
finite-difference gradients (a positive scalar). The default is
|
DiffMinChange (discouraged) | Minimum change in variables for
finite-difference gradients (a nonnegative scalar). The default is
|
Example: optimoptions("fmincon",FiniteDifferenceStepSize=1e-4)
Name-Value Arguments
Specify optional pairs of arguments as
Name1=Value1,...,NameN=ValueN
, where Name
is
the argument name and Value
is the corresponding value.
Name-value arguments must appear after other arguments, but the order of the
pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose
Name
in quotes.
Example: IsConstraint=true,Tolerance=5e-4
Display
— Flag to display results at command line
"off"
(default) | "on"
Flag to display results at command line, specified as "off"
(do
not display the results) or "on"
(display the results).
Example: "off"
Data Types: char
| string
IsConstraint
— Flag to check nonlinear constraint gradients
false
(default) | true
Flag to check nonlinear constraint gradients, specified as
false
(the function is an objective function) or
true
(the function is a nonlinear constraint function).
Example: true
Data Types: logical
Tolerance
— Tolerance for gradient approximation
1e-6
(default) | nonnegative scalar
Tolerance for the gradient approximation, specified as a nonnegative scalar. The
returned value valid
is true
for each
component where the absolute relative difference between the gradient of
fun
and its finite-difference approximation is less than or
equal to Tolerance
.
Example: 1e-3
Data Types: double
Output Arguments
valid
— Indication that finite-difference approximation matches gradient
logical scalar | two-element logical vector
Indication that the finite-difference approximation matches the gradient, returned
as a logical scalar for objective functions or a two-element logical vector for
nonlinear constraint functions [c,ceq]
. The returned value
valid
is true
when the absolute relative
difference between the gradient of fun
and its finite-difference
approximation is less than or equal to Tolerance
for all components
of the gradient. Otherwise, valid
is false
.
When a nonlinear constraint c
or ceq
is empty,
the returned value of valid
for that constraint is
true
.
err
— Relative differences between gradients and finite-difference approximations
structure
Relative differences between the gradients and finite-difference approximations,
returned as a structure. For objective functions, the field name is
Objective
. For nonlinear constraint functions, the field names are
Inequality
(corresponding to c
) and
Equality
(corresponding to ceq
). Each component
of err
has the same shape as the supplied derivatives from
fun
.
More About
CheckGradients Failed
Gradients or Jacobians estimated near the initial point did not match the supplied
derivatives to within a default tolerance of 1e-6
or, for the
checkGradients
function, the specified Tolerance
value.
Usually, this failure means that your objective or nonlinear constraint functions have an incorrect derivative calculation. Double-check the indicated derivative.
Occasionally, the finite difference approximations to the derivatives are inaccurate enough to cause the failure. This inaccuracy can occur when the second derivative of a function (objective or nonlinear constraint) has a large magnitude. It can also occur when using the default
'forward'
finite differences, which are less accurate but faster than'central'
finite differences. If you think that the derivative functions are correct, try one or both of the following to see ifCheckGradients
passes:Set the
FiniteDifferenceType
option to'central'
.Set the
FiniteDifferenceStepSize
option to a small value such as1e-10
.
Derivatives are checked at a random point near the initial point. Therefore, the gradient check can pass or fail randomly when differing nearby points have differing check results.
For details, see Checking Validity of Gradients or Jacobians.
Version History
Introduced in R2023b
MATLAB Command
You clicked a link that corresponds to this MATLAB command:
Run the command by entering it in the MATLAB Command Window. Web browsers do not support MATLAB commands.
Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)
Asia Pacific
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)