fminsearch for a function f(x,y) = x^4 - 3xy + 2y^2

Can someone please explain what happens when I use fminsearch with a starting value of (0,0)?

Answers (4)

David Hill
David Hill on 13 Oct 2022
Edited: David Hill on 13 Oct 2022
It always depends on the starting point and on the steps taken during the iterations:
f=@(x)x(1).^4 -3*x(1).*x(2)+2*x(2).^2;
fminsearch(f,[0,0])
ans = 1×2
1.0e-04 * 0.9375 0.6250
fminsearch(f,[.5,.5])
ans = 1×2
0.7500 0.5625
  4 Comments
David Hill
David Hill on 13 Oct 2022
Very flat, so it is hitting the stopping criteria. Change TolFun and TolX for higher precision, and [0 0] falls into one of the two holes.
f = @(x,y) x.^4 -3*x.*y + 2*y.^2;
options = optimset('TolFun',1e-15,'TolX',1e-15);
[B1,fval1] = fminsearch(@(b)f(b(1),b(2)), [0 0],options)
B1 = 1×2
0.7500 0.5625
fval1 = -0.3164
[B1,fval1] = fminsearch(@(b)f(b(1),b(2)), [-.0001 -.0001],options)
B1 = 1×2
-0.7500 -0.5625
fval1 = -0.3164
[B1,fval1] = fminsearch(@(b)f(b(1),b(2)), [.0001 .0001],options)
B1 = 1×2
0.7500 0.5625
fval1 = -0.3164



Star Strider
Star Strider on 13 Oct 2022
It depends on how you set up the function, and on whether you use the function by itself or the norm of the function.
The easiest way to find out is to do the experiment —
f = @(x,y) x.^4 -3*x.*y + 2*y.^2;
[B1,fval1] = fminsearch(@(b)f(b(1),b(2)), [0 0])
B1 = 1×2
1.0e-04 * 0.9375 0.6250
fval1 = -9.7656e-09
[B2,fval2] = fminsearch(@(b)norm(f(b(1),b(2))), [0 0])
B2 = 1×2
0 0
fval2 = 0
[B3,fval3] = fminsearch(@(b)norm(f(b(1),b(2))), [1 1]*100)
B3 = 1×2
-0.2599 -0.0059
fval3 = 9.6109e-11
[B4,fval4] = fminsearch(@(b)norm(f(b(1),b(2))), [1 1]*1E+12)
B4 = 1×2
0.1054 0.1577
fval4 = 1.6795e-11
Doing a surf plot of it, or a contour or fimplicit plot, could also be informative.
figure
fsurf(f, [-1 1 -1 1]*2)
grid on
% axis('equal')
figure
fcontour(f, [-1 1 -1 1]*2.5)
grid on
axis('equal')
  2 Comments
Star Strider
Star Strider on 13 Oct 2022
The asterisk indicates a common multiplier applied to the matrix (or vector).
If you are referring to this —
f = @(x,y) x.^4 -3*x.*y + 2*y.^2;
[B1,fval1] = fminsearch(@(b)f(b(1),b(2)), [0 0])
B1 = 1×2
1.0e-04 * 0.9375 0.6250
fval1 = -9.7656e-09
it simply means that each displayed element of ‘B1’ is multiplied by 0.0001.
Changing the format —
format longG
f = @(x,y) x.^4 -3*x.*y + 2*y.^2;
[B1,fval1] = fminsearch(@(b)f(b(1),b(2)), [0 0])
B1 = 1×2
1.0e+00 * 9.375e-05 6.25e-05
fval1 =
-9.76562492275239e-09
shows them as they actually appear.



John D'Errico
John D'Errico on 13 Oct 2022
Edited: John D'Errico on 13 Oct 2022
This MUST be a homework question. But by now you have multiple answers that I think don't explain in any depth what you asked. (Star kind of discussed some of it, but there is no need at all to use norm here, since your function returns a scalar.)
What is the shape of that function near the origin? First, learn to write it using operators! MATLAB does not have implicit multiplication.
fun = @(x,y) x.^4 -3*x.*y+2*y.^2
fun = function_handle with value:
@(x,y)x.^4-3*x.*y+2*y.^2
fcontour(fun,[-2,2,-2,2])
grid on
axis equal
Your function has a point of interest at the origin, but it is not a min or a max. As you can see, the zero contours cross at that point. There are two local minima away from the origin, but the origin is a saddle point. We can see that here:
fsurf(fun,[-.1,.1,-.1,.1])
So there is a valley in one direction, where the function actually goes through a local max at the origin. But in the other direction, the function is minimized at the origin. This is called a saddle point. You can identify it from the contours themselves, if you recognize that characteristic shape, where the zero contour wants to cross itself.
(To go into way more depth than this problem merits, I would point out that the Hessian matrix of this function at that point, [0 -3; -3 4], is indefinite: it has one positive and one negative eigenvalue, which is the algebraic signature of a saddle point.)
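A quick numeric check of that claim (my own sketch; the Hessian of x^4 - 3xy + 2y^2 is [12x^2, -3; -3, 4], evaluated here at the origin):

```matlab
H = [0 -3; -3 4];    % Hessian of x^4 - 3*x*y + 2*y^2 at (0,0)
lambda = eig(H)      % eigenvalues are 2 - sqrt(13) and 2 + sqrt(13)
% one negative, one positive: the quadratic form is indefinite,
% confirming that (0,0) is a saddle point
```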
syms X Y
Gradxy = gradient(fun(X,Y))
Gradxy =
 4*X^3 - 3*Y
 4*Y - 3*X
xystat = solve(Gradxy == 0)
xystat = struct with fields:
    X: [3×1 sym]
    Y: [3×1 sym]
[xystat.X,xystat.Y]
ans =
(displayed symbolic output: the stationary points (0, 0), (3/4, 9/16), and (-3/4, -9/16))
So there are three stationary points, one of them at the origin, and the other two live in the first and third quadrants.
Now, try using fminsearch on this function of two variables.
opts.Display = 'iter';
funxy = @(xy) fun(xy(1),xy(2)); % make it a function of a vector of length 2
[xysol,fval] = fminsearch(funxy,[0 0],opts)
Iteration   Func-count     min f(x)         Procedure
    0            1                0
    1            3                0         initial simplex
    2            5                0         contract inside
    3            7     -9.76562e-09         contract inside
    4            9     -9.76562e-09         contract inside

Optimization terminated:
 the current x satisfies the termination criteria using OPTIONS.TolX of 1.000000e-04
 and F(X) satisfies the convergence criteria using OPTIONS.TolFun of 1.000000e-04
xysol = 1×2
1.0e-04 * 0.9375 0.6250
fval = -9.7656e-09
As you can see, it returns a solution that is not a local min. But at the same time, it does not return the point [0,0]. So what happened?
fminsearch starts out by choosing a small initial simplex, with one vertex at the start point. But in this case, all of the function values on that initial simplex were small, and close to each other. fminsearch then decided to look INSIDE the simplex, rather than look further afield, where it might have found one of the global minima. After only a few such iterations, fminsearch decided that where you started it out was actually a solution. As you can see, it reports that the point it has does satisfy all the convergence criteria.
When instead, you start the solver out somewhere else, it can then converge to a valid solution.
[xysol,fval] = fminsearch(funxy,[-2,3],opts)
Iteration   Func-count     min f(x)         Procedure
    0            1               52
    1            3               52         initial simplex
    2            5          48.7139         expand
    3            7          42.2962         expand
    4            9           38.203         expand
    5           11          28.8155         expand
    6           12          28.8155         reflect
    7           14          15.5092         expand
    8           15          15.5092         reflect
    9           17          12.7947         reflect
   10           19          12.6648         contract inside
   11           21          11.8766         reflect
   12           23          10.1094         reflect
   13           24          10.1094         reflect
   14           26           7.8445         reflect
   15           28          6.99602         contract inside
   16           30           5.6339         reflect
   17           32           5.6339         contract inside
   18           34          4.11817         expand
   19           36          3.04437         expand
   20           38         0.109241         expand
   21           39         0.109241         reflect
   22           40         0.109241         reflect
   23           42        -0.202736         reflect
   24           44        -0.204595         contract inside
   25           46        -0.268854         contract inside
   26           48         -0.29262         contract inside
   27           50        -0.303488         contract inside
   28           52        -0.311096         contract inside
   29           54        -0.311096         contract inside
   30           56        -0.314523         contract inside
   31           58        -0.314557         contract inside
   32           60        -0.315478         contract inside
   33           62         -0.31577         contract inside
   34           64        -0.316226         reflect
   35           66        -0.316226         contract outside
   36           68        -0.316253         contract inside
   37           70        -0.316358         contract inside
   38           72        -0.316402         contract inside
   39           74        -0.316402         contract outside
   40           76        -0.316402         contract inside
   41           78        -0.316404         contract outside
   42           80        -0.316405         contract inside
   43           82        -0.316406         contract inside
   44           84        -0.316406         contract outside
   45           86        -0.316406         contract inside
   46           88        -0.316406         contract inside
   47           90        -0.316406         contract outside
   48           92        -0.316406         contract inside
   49           94        -0.316406         contract inside
   50           96        -0.316406         contract inside
   51           98        -0.316406         contract outside
   52          100        -0.316406         contract inside

Optimization terminated:
 the current x satisfies the termination criteria using OPTIONS.TolX of 1.000000e-04
 and F(X) satisfies the convergence criteria using OPTIONS.TolFun of 1.000000e-04
xysol = 1×2
0.7500 0.5625
fval = -0.3164
So here it does manage to escape the stationary point at [0,0].
Essentially, fminsearch has a problem because the point [0,0] is a stationary point on the surface, where the gradient is exactly zero. So your function is nearly flat there, and fminsearch becomes confused, because the initial simplex is small enough that the function appears close enough to constant that it gives up.
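To make that concrete: in the shipping fminsearch.m, the initial simplex is built by perturbing each nonzero coordinate of the start point by 5%, and each zero coordinate by 0.00025 (an implementation detail; check the fminsearch.m source in your release). For a [0 0] start, all three vertices then give tiny function values (a sketch):

```matlab
f = @(x,y) x.^4 - 3*x.*y + 2*y.^2;
f(0, 0)            % 0, the starting vertex
f(0.00025, 0)      % about 3.9e-15, perturbing x
f(0, 0.00025)      % 1.25e-07, perturbing y
% the spread of these values is already far below the default TolFun of 1e-4,
% so the simplex contracts and fminsearch stops almost immediately
```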
We can test that claim by changing the parameters for fminsearch.
optimset('fminsearch')
ans = struct with fields:
                   Display: 'notify'
               MaxFunEvals: '200*numberofvariables'
                   MaxIter: '200*numberofvariables'
                    TolFun: 1.0000e-04
                      TolX: 1.0000e-04
               FunValCheck: 'off'
                 OutputFcn: []
                  PlotFcns: []
           ActiveConstrTol: []
                 Algorithm: []
    AlwaysHonorConstraints: []
           DerivativeCheck: []
               Diagnostics: []
             DiffMaxChange: []
             DiffMinChange: []
            FinDiffRelStep: []
               FinDiffType: []
         GoalsExactAchieve: []
                GradConstr: []
                   GradObj: []
                   HessFcn: []
                   Hessian: []
                  HessMult: []
               HessPattern: []
                HessUpdate: []
          InitBarrierParam: []
     InitTrustRegionRadius: []
                  Jacobian: []
                 JacobMult: []
              JacobPattern: []
                LargeScale: []
                  MaxNodes: []
                MaxPCGIter: []
             MaxProjCGIter: []
                MaxSQPIter: []
                   MaxTime: []
             MeritFunction: []
                 MinAbsMax: []
        NoStopIfFlatInfeas: []
            ObjectiveLimit: []
      PhaseOneTotalScaling: []
            Preconditioner: []
          PrecondBandWidth: []
            RelLineSrchBnd: []
    RelLineSrchBndDuration: []
              ScaleProblem: []
       SubproblemAlgorithm: []
                    TolCon: []
                 TolConSQP: []
                TolGradCon: []
                    TolPCG: []
                 TolProjCG: []
              TolProjCGAbs: []
                  TypicalX: []
               UseParallel: []
Do you see that the default tolerances for fminsearch are actually pretty large in this context? I'll change TolFun, so that fminsearch does not think it can terminate so easily.
opts.TolFun = 1e-12;
[xysol,fval] = fminsearch(funxy,[0 0],opts)
Iteration   Func-count     min f(x)         Procedure
    0            1                0
    1            3                0         initial simplex
    2            5                0         contract inside
    3            7     -9.76562e-09         contract inside
    4            9     -9.76562e-09         contract inside
    5           11      -1.0025e-07         expand
    6           12      -1.0025e-07         reflect
    7           14     -5.54321e-07         expand
    8           15     -5.54321e-07         reflect
    9           17     -2.50625e-06         expand
   10           19      -4.0937e-06         expand
   11           21     -1.54151e-05         expand
   12           23     -3.30708e-05         expand
   13           25     -0.000109659         expand
   14           27     -0.000271442         expand
   15           29     -0.000835767         expand
   16           31      -0.00221716         expand
   17           33      -0.00652699         expand
   18           35       -0.0176627         expand
   19           37       -0.0486566         expand
   20           39        -0.116327         expand
   21           41        -0.166701         expand
   22           43        -0.166701         contract inside
   23           44        -0.166701         reflect
   24           46        -0.166701         contract inside
   25           48        -0.172698         contract inside
   26           50        -0.174456         contract inside
   27           52        -0.174463         contract outside
   28           54        -0.174691         contract inside
   29           56        -0.174931         reflect
   30           58        -0.174931         contract inside
   31           60        -0.174998         reflect
   32           62        -0.175176         expand
   33           64        -0.175294         reflect
   34           66        -0.175417         reflect
   35           68        -0.175587         reflect
   36           70        -0.175653         reflect
   37           72        -0.175894         expand
   38           74        -0.176437         expand
   39           76        -0.176641         reflect
   40           78        -0.177382         expand
   41           80        -0.178972         expand
   42           82        -0.179331         reflect
   43           84         -0.18251         expand
   44           86        -0.185288         expand
   45           88        -0.192027         expand
   46           90        -0.202334         expand
   47           92        -0.215636         expand
   48           94        -0.245401         expand
   49           96        -0.257113         expand
   50           98        -0.309877         expand
   51           99        -0.309877         reflect
   52          100        -0.309877         reflect
   53          101        -0.309877         reflect
   54          103        -0.309877         contract inside
   55          105         -0.31513         contract inside
   56          107         -0.31513         contract inside
   57          109        -0.315234         contract inside
   58          111        -0.315862         reflect
   59          113        -0.316307         contract inside
   60          115        -0.316307         contract inside
   61          117        -0.316307         contract inside
   62          119        -0.316383         contract inside
   63          121        -0.316383         contract outside
   64          123        -0.316383         contract outside
   65          125        -0.316404         contract inside
   66          126        -0.316404         reflect
   67          128        -0.316404         contract inside
   68          130        -0.316405         contract inside
   69          132        -0.316406         contract inside
   70          134        -0.316406         contract inside
   71          136        -0.316406         contract outside
   72          138        -0.316406         contract inside
   73          140        -0.316406         contract inside
   74          142        -0.316406         contract outside
   75          144        -0.316406         contract inside
   76          146        -0.316406         contract inside
   77          148        -0.316406         contract inside
   78          150        -0.316406         contract outside
   79          152        -0.316406         contract inside
   80          154        -0.316406         reflect
   81          156        -0.316406         contract outside
   82          158        -0.316406         contract inside
   83          160        -0.316406         contract inside
   84          162        -0.316406         contract inside
   85          164        -0.316406         contract inside
   86          166        -0.316406         contract inside
   87          168        -0.316406         contract inside
   88          170        -0.316406         contract inside
   89          172        -0.316406         contract inside
   90          174        -0.316406         contract outside
   91          176        -0.316406         contract inside

Optimization terminated:
 the current x satisfies the termination criteria using OPTIONS.TolX of 1.000000e-04
 and F(X) satisfies the convergence criteria using OPTIONS.TolFun of 1.000000e-12
xysol = 1×2
0.7500 0.5625
fval = -0.3164
Indeed, fminsearch now finds one of the global solutions. Understanding the general algorithm behind fminsearch (Nelder-Mead) is a valuable thing, as it can help you to see what fminsearch is doing, and then to diagnose strange behavior like this.

David Goodmanson
David Goodmanson on 13 Oct 2022
Edited: David Goodmanson on 14 Oct 2022
Hi Ibraheem,
While it's useful to find out how to get fminsearch to cooperate, that's hardly necessary here. For the minima, you can calculate the gradient and set it to zero:
f(x,y) = x^4 - 3xy + 2y^2
grad f(x,y) = (4x^3 - 3y) ux + (4y - 3x) uy
where ux and uy are unit vectors along x and y respectively.
Setting the y component to zero gives
grad_y = 0 --> y = (3/4)x
Plug that into the x component:
grad_x = 0 --> 4x^3 - (9/4)x = 0 --> x^2 = 9/16 (also x = 0)
so
x = 3/4, y = 9/16
x = -3/4, y = -9/16
as is verified by the posted plots.
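A quick numeric check of these hand-calculated points (my own sketch, with variable names of my choosing):

```matlab
f  = @(x,y) x.^4 - 3*x.*y + 2*y.^2;
gx = @(x,y) 4*x.^3 - 3*y;            % df/dx
gy = @(x,y) 4*y - 3*x;               % df/dy
[gx( 3/4,  9/16), gy( 3/4,  9/16)]   % [0 0]: the gradient vanishes
[gx(-3/4, -9/16), gy(-3/4, -9/16)]   % [0 0]: the gradient vanishes
f(3/4, 9/16)                         % -81/256, about -0.3164, matching fminsearch
```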
