Optimization by adding a penalization term

Heborvi on 3 Oct 2016
Commented: Matt J on 4 Oct 2016

Hello, I wrote the following function, which minimizes an objective function by adding a penalization term:

function [theta_opt, fval] = optimization_P0_RMDF_Penalise_without_laguerreValue(X, D, a, eta, sigma)
[n, p] = size(X);

    function f = objectif_function(theta)
        % Smooth data-fit part of the objective
        fun = 0;
        lambda = zeros(n, n);
        for i = 1:n
            for j = i+1:n
                % Squared distance between the displaced points i and j
                S = 0;
                for k = 1:p
                    S = S + (X(i,k) + theta(i,k) - X(j,k) - theta(j,k))^2;
                end
                lambda(i,j) = S;
                fun = fun + D(i,j)^2 + 2*sigma^2*a^2*p + a^2*lambda(i,j) ...
                    - 2*sqrt(pi)*a*sigma*D(i,j)*laguerreL(1/2, p/2, -(lambda(i,j)/(4*sigma^2)));
            end
        end
        %%%%%%%% Penalization term %%%%%%%%%%%%
        % Count the nonzero entries of theta (an l0-type penalty)
        nb_disp = 0;
        for i = 1:n
            for k = 1:p
                if theta(i,k) ~= 0
                    nb_disp = nb_disp + 1;
                end
            end
        end
        f = fun + eta*nb_disp;
    end

theta0 = zeros(n, p);
options = optimset('Display', 'iter', 'Algorithm', 'active-set');
[theta_opt, fval] = fminunc(@objectif_function, theta0, options);
end

My problem is that the optimization is carried out without taking the penalization term into account. The result of the optimization for any value of eta is the same as the one obtained with eta = 0, which means that the penalization term is not considered during the optimization.

Can someone help me fix this problem?

Thanks in advance,

Heborvi

Answers (3)

Matt J on 3 Oct 2016
Your penalty term is a discrete function of theta, and hence not differentiable. This violates the assumptions of fminunc. Moreover, because the penalty function is piecewise flat, it has zero gradient almost everywhere, which would likely explain why it is hard to get it to move for small values of eta.
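For intuition, a small sketch (not from the original answer; the sizes and penalty weight are arbitrary) showing that the finite-difference gradient fminunc would form for this counting penalty is zero around almost every theta:

% Sketch: the counting penalty eta*nnz(theta) has a zero finite-difference
% gradient at almost every theta, so a derivative-based solver sees no
% incentive to change theta because of it.
eta   = 10;                     % any penalty weight
theta = randn(5, 3);            % a generic (almost surely all-nonzero) point
pen   = @(t) eta*nnz(t);        % l0-style penalty, as in the question

h = 1e-6;                       % finite-difference step
g = zeros(size(theta));
for idx = 1:numel(theta)
    tp      = theta;
    tp(idx) = tp(idx) + h;      % perturbing a nonzero entry leaves nnz unchanged
    g(idx)  = (pen(tp) - pen(theta)) / h;
end
disp(g)                         % all zeros: the penalty is locally flat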
  2 Comments
Heborvi on 3 Oct 2016
Edited: Heborvi on 3 Oct 2016
Thank you for your answer.
So do I need to choose a large value of eta? I have tested different values of eta and I still obtain the same result!
Matt J on 3 Oct 2016
Edited: Matt J on 3 Oct 2016
Because the gradient and Hessian of the penalty are zero almost everywhere, the search directions calculated by fminunc from function derivatives are, almost everywhere, the same as when eta = 0.



John D'Errico on 3 Oct 2016
Edited: John D'Errico on 3 Oct 2016
A major part of your problem is that you have created a discontinuous objective by adding discrete integer amounts to the objective as a penalty. Then you want to use fminunc to optimize the problem.
You clearly do not appreciate that fminunc REQUIRES a differentiable objective. That you call it a penalty term, as opposed to part of the objective, is irrelevant. fminunc sees a discontinuous objective function. That will cause it to do unpredictable things.
In fact, this is a bad thing to do to virtually ANY optimizer, so I am not sure who suggested the idea to you, or where you got it from.
My guess is that your penalty, if it is your goal that this term be close to zero, should be a simple function of the distance away from the goal. Quadratic penalties might make sense to you, maybe even exponential in some form. I can't say. Note that a LINEAR penalty function would again be wrong, since then your objective is again non-differentiable.
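As an illustration of that suggestion, a minimal sketch (the data term, sizes, and value of eta below are placeholders, not the asker's model) in which a smooth quadratic penalty replaces the nonzero count, so that fminunc can actually respond to eta:

% Sketch: eta*sum(theta(:).^2) is differentiable everywhere, unlike eta*nnz(theta).
% The data term is a hypothetical stand-in for the smooth part of objectif_function.
eta      = 0.5;
dataTerm = @(theta) sum((theta(:) - 1).^2);          % placeholder data-fit term
obj      = @(theta) dataTerm(theta) + eta*sum(theta(:).^2);

theta0   = zeros(4, 2);
opts     = optimoptions('fminunc', 'Display', 'off');
thetaHat = fminunc(obj, theta0, opts);
% With eta = 0 the minimizer is all ones; increasing eta shrinks it toward zero,
% i.e. the penalty now influences the solution (though it shrinks theta rather
% than making it exactly sparse).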
  2 Comments
Heborvi on 3 Oct 2016
Edited: Heborvi on 3 Oct 2016
Thank you for your answer.
In fact, my penalty term is related to the parsimonious choice of the vectors theta, so I use the l0-norm for this. So, in my case, I can't use fminunc for my optimization?
John D'Errico on 3 Oct 2016
Fminunc is absolutely out of the question. And since a smoothly increasing penalty is not an option, you cannot really use any optimizer that assumes differentiability.
I think you need to be looking for a mixed integer programming tool, that can handle nonlinear objectives.



Matt J on 4 Oct 2016
Edited: Matt J on 4 Oct 2016
In fact my penalty term is related to the parsimonious choice of vectors theta, therefore I use the l0-norm to do this.
Often people compromise by using the l1-norm instead. This can be formulated differentiably by minimizing
min. fun(theta)+eta*sum(r)
s.t. r(i) >= theta(i) ,
r(i) >= -theta(i)
where we have introduced additional unknown variables, r(i). This will require fmincon, as opposed to fminunc.
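A minimal fmincon sketch of this reformulation (the data term, sizes, and variable names below are placeholders for illustration, not the asker's model):

% Sketch of the l1 reformulation: unknowns packed as z = [theta(:); r(:)],
% with linear constraints enforcing r(i) >= |theta(i)|.
n = 4; p = 2; eta = 0.5;
m = n*p;                                    % number of theta entries
dataTerm = @(theta) sum((theta(:) - 1).^2); % hypothetical smooth data-fit term

obj = @(z) dataTerm(reshape(z(1:m), n, p)) + eta*sum(z(m+1:end));

% Linear constraints:  theta - r <= 0  and  -theta - r <= 0
A = [ eye(m), -eye(m);
     -eye(m), -eye(m)];
b = zeros(2*m, 1);

z0   = zeros(2*m, 1);
opts = optimoptions('fmincon', 'Display', 'off');
zHat = fmincon(obj, z0, A, b, [], [], [], [], [], opts);

theta_opt = reshape(zHat(1:m), n, p);       % shrunken/sparse-ish estimate
r_opt     = zHat(m+1:end);                  % equals abs(theta_opt) at a solution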
  6 Comments
Heborvi on 4 Oct 2016
And is the sparsity of theta taken into account by this formulation?
Matt J on 4 Oct 2016
Insofar as norm(theta,1) is an approximation of norm(theta,0), yes.

