Copyright Carl Edward Rasmussen, 2006-04-06.

The matlab function minimize.m finds a (local) minimum of a (nonlinear) multivariate function. The user must supply a function which returns the value and partial derivatives with respect to all variables. The function uses conjugate gradients and approximate linesearches based on polynomial interpolation with Wolfe-Powell conditions.
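As a rough illustration of the acceptance test used in such linesearches, here is a sketch (in Python, for concreteness) of the standard strong Wolfe conditions. The constants c1 and c2 below are the conventional textbook choices, not necessarily the values minimize.m itself uses.

```python
def wolfe_ok(f0, d0, ft, dt, t, c1=1e-4, c2=0.9):
    """Strong Wolfe conditions for a step of size t along a search direction.

    f0, d0: function value and directional derivative at the start point
    ft, dt: function value and directional derivative at the trial point
    c1, c2: conventional textbook constants (minimize.m uses its own values)
    """
    # sufficient decrease (Armijo) condition
    sufficient_decrease = ft <= f0 + c1 * t * d0
    # strong curvature condition: the slope must have flattened out
    curvature = abs(dt) <= c2 * abs(d0)
    return sufficient_decrease and curvature
```

A linesearch based on polynomial interpolation proposes trial step sizes and accepts the first one for which both conditions hold.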

This is a slightly updated version of minimize.m. The most important change is that minimize.m now (mostly) gracefully recovers if the function cannot be evaluated for some parameters (e.g. for numerical reasons). In addition, the code is now more readable, and a few other tiny modifications and bug fixes have been made. The old version (from 2002) is obsolete (but still available here).

This is an example of how to use the minimize function. First you need to supply a function which returns function values and a vector of partial derivatives of the function. As an example, we will use the Rosenbrock function, see rosenbrock.m. (If the function requires other arguments, you can pass them as additional arguments to minimize.m - they will simply be passed on to the function.)
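For readers without MATLAB, the kind of function minimize.m expects can be sketched in Python: a single function returning both the value and the gradient of the (n-dimensional) Rosenbrock function. This port is an illustration, not part of the original package.

```python
import numpy as np

def rosenbrock(x):
    """Value and gradient of the n-dimensional Rosenbrock function:
    f(x) = sum_i 100*(x[i+1] - x[i]^2)^2 + (1 - x[i])^2
    """
    x = np.asarray(x, dtype=float)
    f = np.sum(100.0 * (x[1:] - x[:-1]**2)**2 + (1.0 - x[:-1])**2)
    df = np.zeros_like(x)
    # contribution from the terms in which x[i] appears as the "inner" variable
    df[:-1] = -400.0 * x[:-1] * (x[1:] - x[:-1]**2) - 2.0 * (1.0 - x[:-1])
    # contribution from the terms in which x[i] appears as the "outer" variable
    df[1:] += 200.0 * (x[1:] - x[:-1]**2)
    return f, df
```

At the zero vector this gives f = 1 in two dimensions and f = 99 in one hundred dimensions, matching the transcripts below; the minimum is at the vector of all ones, where f = 0.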

To start from [0 0]' and allow a maximum of 25 linesearches to minimize the function, do:

>> [x fx c] = minimize([0 0]', 'rosenbrock', 25)
Linesearch 18; Value 4.930381e-32

x =
   1.00000000000000
   1.00000000000000

fx =
   1.00000000000000
   0.77160942667725
   0.58224024884105
   0.40492742502160
   0.32466327341368
   0.28960411147824
   0.07623420070067
   0.06786211944378
   0.03378423679313
   0.00108990808914
   0.00108795243321
   0.00008974308332
   0.00000012183819
   0.00000000675602
   0.00000000000000
   0.00000000000000
   0.00000000000000
   0.00000000000000
   0.00000000000000

c =
    20

This shows that the minimum (at x = [1 1]') was found after c = 20 iterations, and fx contains the function value after each iteration. If instead we want to limit the search to 25 function evaluations:

>> [x fx c] = minimize([0 0]', 'rosenbrock', -25)
Function evaluation 22; Value 6.786212e-02

x =
   0.79090360290176
   0.62236410995852

fx =
   1.00000000000000
   0.77160942667725
   0.58224024884105
   0.40492742502160
   0.32466327341368
   0.28960411147824
   0.07623420070067
   0.06786211944378

c =
    25

where the function value has been reduced to 0.068. Note that in this example each iteration (linesearch) used an average of around 3 function evaluations.
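The negative third argument thus means "budget of function evaluations" rather than "budget of linesearches". That interface can be sketched in Python with a deliberately crude optimizer (plain gradient descent with backtracking, far weaker than minimize.m's conjugate gradients) that simply stops once the evaluation budget is spent; the function and helper names are this sketch's own.

```python
import numpy as np

def rosenbrock(x):
    # value and gradient of the n-dimensional Rosenbrock function
    x = np.asarray(x, dtype=float)
    f = np.sum(100.0 * (x[1:] - x[:-1]**2)**2 + (1.0 - x[:-1])**2)
    df = np.zeros_like(x)
    df[:-1] = -400.0 * x[:-1] * (x[1:] - x[:-1]**2) - 2.0 * (1.0 - x[:-1])
    df[1:] += 200.0 * (x[1:] - x[:-1]**2)
    return f, df

def minimize_budget(x, fun, max_evals):
    """Gradient descent with backtracking, stopping after max_evals function
    evaluations -- mimicking the interface of minimize(x0, f, -max_evals)."""
    f, df = fun(x)
    evals = 1
    fx = [f]                   # function value after each accepted step
    while evals < max_evals:
        t = 1.0 / max(1.0, np.linalg.norm(df))   # crude initial step size
        while evals < max_evals:
            f_new, df_new = fun(x - t * df)
            evals += 1
            if f_new < f:      # accept any decrease (no Wolfe conditions here)
                x, f, df = x - t * df, f_new, df_new
                fx.append(f)
                break
            t /= 2.0           # backtrack: halve the step and retry
    return x, fx, evals
```

For example, `minimize_budget(np.zeros(2), rosenbrock, 25)` returns the analogue of the `[x fx c]` triple above, although it makes far less progress in the same budget than the conjugate-gradient code does.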

For comparison, minimizing the 100 dimensional Rosenbrock function using matlab's built-in fminunc:

>> x0 = zeros(100, 1);
>> a = fminunc('rosenbrock', x0, optimset('GradObj', 'on', 'MaxFunEval', 20));
>> disp([rosenbrock(x0) rosenbrock(a)])
  99.00000000000000  88.38645133447895

reduces the function value from 99 to around 88, using 2121 function and gradient evaluations (don't ask me why it calls the function 2121 times when MaxFunEval is set to 20). Using minimize:

>> x0 = zeros(100, 1);
>> a = minimize(x0, 'rosenbrock', -2121);
Function evaluation 2119; Value 1.851149e-25
>> disp([rosenbrock(x0) rosenbrock(a)])
  99.00000000000000   0.00000000000000

showing considerably more progress in the same number of function evaluations. For larger problems the difference is even more pronounced: on the 1000 dimensional Rosenbrock problem, minimize converges in 10177 function evaluations to a value of 8.296869e-27, whereas fminunc only reduces the function value from 999 (at zero) to 980 in 21021 function evaluations. (If you want to replicate this experiment, be aware that for some reason matlab's fminunc is computationally extremely slow on the 1000 dimensional problem.)

Once you have written a function to minimize, it may be wise to check that the function values and partial derivatives are consistent. If they are not consistent, minimize will probably claim that no more progress can be made after only a few iterations. You can use the checkgrad function (checkgrad.m) to compare the partial derivatives with finite difference approximations. E.g., we could try:

>> randn('seed', 21);
>> checkgrad('rosenbrock', randn(3,1), 1e-5)

  1.0e+02 *

   9.91926964244770   9.91926964286449
  -3.76300059587095  -3.76300059616597
   0.36091809447947   0.36091809448635

ans =
     2.405447213023161e-11

to verify that the partial derivatives and the finite difference approximations have very close values at some random point.
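The idea behind such a gradient check can be sketched in Python: perturb each coordinate by a small amount e, form central finite differences, and report a normalized distance between the analytic and numerical gradients (the quantity below is one common choice of normalization and may differ from checkgrad.m's exact output). The Rosenbrock port is this sketch's own, not the package's rosenbrock.m.

```python
import numpy as np

def rosenbrock(x):
    # value and gradient of the n-dimensional Rosenbrock function
    x = np.asarray(x, dtype=float)
    f = np.sum(100.0 * (x[1:] - x[:-1]**2)**2 + (1.0 - x[:-1])**2)
    df = np.zeros_like(x)
    df[:-1] = -400.0 * x[:-1] * (x[1:] - x[:-1]**2) - 2.0 * (1.0 - x[:-1])
    df[1:] += 200.0 * (x[1:] - x[:-1]**2)
    return f, df

def check_grad(fun, x, e=1e-5):
    """Compare the analytic gradient with central finite differences.

    Returns norm(dh - dy) / norm(dh + dy), which should be tiny if the
    gradient is consistent with the function values, and large otherwise.
    """
    _, dy = fun(x)                       # analytic gradient
    dh = np.zeros_like(dy)               # finite difference estimate
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = e
        f_plus, _ = fun(x + step)
        f_minus, _ = fun(x - step)
        dh[i] = (f_plus - f_minus) / (2.0 * e)
    return np.linalg.norm(dh - dy) / np.linalg.norm(dh + dy)
```

A consistent gradient gives a value near machine-precision levels; an inconsistent one (say, a gradient that is off by one percent) gives a value many orders of magnitude larger.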