Copyright Carl Edward Rasmussen, 2006-04-06.

Minimize

The MATLAB function minimize.m finds a (local) minimum of a (nonlinear) multivariate function. The user must supply a function which returns the value and the partial derivatives with respect to all variables. The minimizer uses conjugate gradients and approximate linesearches based on polynomial interpolation with Wolfe-Powell conditions.

This is a slightly updated version of minimize.m. The most important change is that minimize.m now (mostly) recovers gracefully if the function cannot be evaluated for some parameters (e.g. for numerical reasons). In addition, the code is now more readable, and a few other tiny modifications and bug fixes have been made. The old version (from 2002) is obsolete (but still available here).

How to use the minimize function

This is an example of how to use the minimize function. First you need to supply a function which returns the function value and a vector of partial derivatives of the function. As an example, we will use the Rosenbrock function; see rosenbrock.m. (If your function requires other arguments, you can pass them as additional arguments to minimize.m; they will simply be passed on to your function.)
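For concreteness, here is a sketch of what a function in the required form might look like, using the two-dimensional Rosenbrock function (the name rosen2 and the restriction to two dimensions are illustrative only; the rosenbrock.m used below handles the general case):

```matlab
function [f, df] = rosen2(x)
% Two-dimensional Rosenbrock function, f(x) = 100*(x2 - x1^2)^2 + (1 - x1)^2,
% returning the value f and the 2-by-1 vector of partial derivatives df.
f  = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
df = [-400*x(1)*(x(2) - x(1)^2) - 2*(1 - x(1));  % df/dx1
       200*(x(2) - x(1)^2)];                     % df/dx2
```

The minimum is at x = [1 1]', where f = 0 and both partial derivatives vanish.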

To start from [0 0]' and allow a maximum of 25 linesearches to minimize the function, do:
>> [x fx c] = minimize([0 0]', 'rosenbrock', 25)
Linesearch     18;  Value 4.930381e-32
x =
   1.00000000000000
   1.00000000000000
fx =
   1.00000000000000
   0.77160942667725
   0.58224024884105
   0.40492742502160
   0.32466327341368
   0.28960411147824
   0.07623420070067
   0.06786211944378
   0.03378423679313
   0.00108990808914
   0.00108795243321
   0.00008974308332
   0.00000012183819
   0.00000000675602
   0.00000000000000
   0.00000000000000
   0.00000000000000
   0.00000000000000
   0.00000000000000
c =
    20
This shows that the minimum (at x = [1 1]') was found after c = 20 iterations; fx contains the function value after each iteration. If instead we want to limit the search to 25 function evaluations:
>> [x fx c] = minimize([0 0]', 'rosenbrock', -25)
Function evaluation     22;  Value 6.786212e-02
x =
   0.79090360290176
   0.62236410995852
fx =
   1.00000000000000
   0.77160942667725
   0.58224024884105
   0.40492742502160
   0.32466327341368
   0.28960411147824
   0.07623420070067
   0.06786211944378
c =
    25
where the function value has been reduced to 0.068. Note that in this example each iteration (linesearch) used an average of around 3 function evaluations.
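Summarizing the calling convention seen in the examples above (a sketch based on this page; consult the header comment of minimize.m for the authoritative description):

```matlab
% X0 : column vector with the starting point
% p  : if p > 0, maximum number of linesearches;
%      if p < 0, maximum of abs(p) function evaluations
% any further arguments are passed on to the objective function
[x, fx, c] = minimize(X0, 'rosenbrock', p);
% x  : the solution found
% fx : vector of function values, one per iteration
% c  : number of linesearches or function evaluations used
```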

How efficient is minimize?

This question is very difficult to answer in general. However, for more challenging problems many people have found minimize to be very efficient compared to other routines. For example, using the MATLAB Optimization Toolbox (Version 2.0 (R11), 09-Oct-1998) on a 100 dimensional Rosenbrock problem:
>> x0 = zeros(100, 1);
>> a = fminunc('rosenbrock', x0, optimset('GradObj', 'on', 'MaxFunEval', 20));
>> disp([rosenbrock(x0) rosenbrock(a)])
  99.00000000000000  88.38645133447895
reduces the function from 99 to around 88 using 2121 function and gradient evaluations (don't ask me why it calls the function 2121 times when MaxFunEval is set to 20). Using minimize:
>> x0 = zeros(100, 1);
>> a = minimize(x0, 'rosenbrock', -2121);
Function evaluation   2119;  Value 1.851149e-25
>> disp([rosenbrock(x0) rosenbrock(a)])
  99.00000000000000   0.00000000000000
showing considerably more progress in the same number of function evaluations. For larger problems the difference is even more pronounced: on the 1000 dimensional Rosenbrock problem, minimize converges in 10177 function evaluations to a value of 8.296869e-27, whereas fminunc reduces the function value from 999 (at zero) to only 980 in 21021 function evaluations. (If you want to replicate this experiment, be aware that for some reason MATLAB's fminunc is computationally extremely slow on the 1000 dimensional problem.)

Make sure that your partial derivatives are computed correctly!

Once you have written a function to minimize, it is wise to check that the function values and partial derivatives are consistent. If they are not, minimize will probably claim that no more progress can be made after only a few iterations. You can use the checkgrad function (checkgrad.m) to compare the partial derivatives with finite difference approximations. E.g., we could try:

>> randn('seed', 21);
>> checkgrad('rosenbrock', randn(3,1), 1e-5)
   1.0e+02 *
   9.91926964244770   9.91926964286449
  -3.76300059587095  -3.76300059616597
   0.36091809447947   0.36091809448635
ans =
     2.405447213023161e-11
to verify that the partial derivatives and the finite difference approximations agree closely at some random point.
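The idea behind such a check can be sketched with central differences (illustrative only; see checkgrad.m itself for the actual implementation):

```matlab
e = 1e-5;                                 % size of the perturbation
x = randn(3,1);                           % a random test point
[f, df] = rosenbrock(x);                  % analytic partial derivatives
dh = zeros(size(x));                      % finite difference estimates
for i = 1:length(x)
  dx = zeros(size(x)); dx(i) = e;
  dh(i) = (rosenbrock(x+dx) - rosenbrock(x-dx)) / (2*e);
end
disp([df dh]);                            % the two columns should agree
norm(dh-df)/norm(dh+df)                   % relative error; small if correct
```

When the gradient code is correct, the relative error on the last line is tiny (of the order 1e-11 in the run above); a large value points to a bug in the derivatives.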