Notes on Optimization
Last Modified 2024-01-01
One-Dimensional Optimization
In this section, we consider the problem of minimizing a continuous function f : ℝ → ℝ. More specifically, we want to devise an algorithm to search for a minimizer of f starting from one or more specified initial points. Different types of initial data (three-point brackets, two-point brackets, etc.) will lead to different classes of algorithms.
We will not assume any particular form for the function f, and we will develop algorithms that only access f as a black box. In other words, given a real number x, we can call a subroutine that returns the numerical value of f(x). This output value is the only feature of the subroutine that we can observe; we cannot otherwise look inside to see what it is doing.
Under these conditions, any algorithm that terminates can only perform a finite number of evaluations of f. Hence, no algorithm of this type can provide any guarantees about finding a global minimizer or computing an exact result. Even in the one-dimensional case, the best we can hope to do is to approximate a local minimizer of f.
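This access model is easy to make concrete in code. The following Python sketch (the class name `BlackBox` and the test function are illustrative assumptions, not from the source) wraps a function so that only its output values are observable, and counts how many evaluations an algorithm spends:

```python
# Hypothetical sketch of black-box access to f: the wrapped function is
# never inspected, only called, and each call is counted.
class BlackBox:
    def __init__(self, f):
        self._f = f           # hidden implementation; never looked inside
        self.evaluations = 0  # number of times the oracle has been called

    def __call__(self, x):
        self.evaluations += 1
        return self._f(x)

oracle = BlackBox(lambda x: (x - 2.0) ** 2)  # example f, unknown to the algorithm
value = oracle(1.0)  # the returned value is the only observable feature
```

A terminating algorithm that uses such an oracle can inspect `oracle.evaluations` afterwards to confirm it stayed within a finite evaluation budget.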
Bracketing and Interpolation
Suppose we have a three-point bracket for f, i.e., an ordered triple of real numbers x₀ < x₁ < x₂ such that

f(x₀) ≥ f(x₁) and f(x₁) ≤ f(x₂).
Using this data, we can construct a quadratic interpolating polynomial p that passes through the points (x₀, f(x₀)), (x₁, f(x₁)), and (x₂, f(x₂)):

p(x) = (Δ₂ - Δ₁)/(h₁ + h₂)·(x - x₀)² - (h₁Δ₂ - h₂Δ₁ - 2h₁Δ₁)/(h₁ + h₂)·(x - x₀) + f₀
Here, we have defined f₀, h₁, h₂, Δ₁, and Δ₂ as follows:

f₀ = f(x₀),   h₁ = x₁ - x₀,   h₂ = x₂ - x₁,   Δ₁ = (f(x₁) - f(x₀))/h₁,   Δ₂ = (f(x₂) - f(x₁))/h₂
Now, if f is well-approximated by p in the interval [x₀, x₂], then we might expect a minimizer of p to closely approximate a minimizer of f. There is a simple formula for the exact (global) minimizer x* of p:

x* = x₀ + (h₁Δ₂ - h₂Δ₁ - 2h₁Δ₁)/(2Δ₂ - 2Δ₁)
Observe that x* is well-defined as long as Δ₁ and Δ₂ are not both zero: under the bracket conditions Δ₁ ≤ 0 and Δ₂ ≥ 0, the denominator 2Δ₂ - 2Δ₁ vanishes only when Δ₁ = Δ₂ = 0. In that degenerate case the technique breaks down because p is a constant polynomial. Moreover, if the bracket conditions Δ₁ ≤ 0 and Δ₂ ≥ 0 are satisfied, then x* is guaranteed to lie in the interval [x₀, x₂], and p″(x*) ≥ 0.
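The formula for x* translates directly into code. Here is a Python sketch of one interpolation step (the function name `quadratic_step` is an assumption for illustration, not from the source):

```python
def quadratic_step(x0, x1, x2, f0, f1, f2):
    """Minimizer of the quadratic interpolant through
    (x0, f0), (x1, f1), (x2, f2), where x0 < x1 < x2."""
    h1, h2 = x1 - x0, x2 - x1
    d1 = (f1 - f0) / h1  # Δ₁: slope of the first secant
    d2 = (f2 - f1) / h2  # Δ₂: slope of the second secant
    if d1 == d2:  # under the bracket conditions this forces Δ₁ = Δ₂ = 0,
        # so the interpolant is constant and has no unique minimizer
        raise ValueError("degenerate bracket: interpolant has no unique minimizer")
    return x0 + (h1 * d2 - h2 * d1 - 2 * h1 * d1) / (2 * (d2 - d1))

# f(x) = (x - 1)^2 with bracket (0, 0.5, 2): f-values 1, 0.25, 1.
x_star = quadratic_step(0.0, 0.5, 2.0, 1.0, 0.25, 1.0)
```

Since f(x) = (x - 1)² is itself quadratic, p coincides with f and a single step lands exactly on the minimizer x = 1.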
The following Mathematica code verifies the correctness of these claims.
On[Assert];
p[x_] := (Δ₂ - Δ₁)/(h₁ + h₂)*(x - x₀)^2 -
(h₁*Δ₂ - h₂*Δ₁ - 2h₁*Δ₁)/(h₁ + h₂)*(x - x₀) + f₀;
xQuadratic = x₀ + (h₁*Δ₂ - h₂*Δ₁ - 2h₁*Δ₁)/(2Δ₂ - 2Δ₁);
Assert[p[x₀] == f₀ // Simplify];
Assert[p[x₀ + h₁] == f₀ + h₁*Δ₁ // Simplify];
Assert[p[x₀ + h₁ + h₂] == f₀ + h₁*Δ₁ + h₂*Δ₂ // Simplify];
Assert[p'[xQuadratic] == 0 // Simplify];
Assuming[h₁ > 0 && h₂ > 0 && Δ₁ <= 0 && Δ₂ >= 0,
Assert[p''[xQuadratic] >= 0 // Simplify];
Assert[x₀ <= xQuadratic // Simplify];
Assert[xQuadratic <= x₀ + h₁ + h₂ // Simplify];
];
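For readers without Mathematica, the same identities can be spot-checked numerically. The following Python fragment mirrors the assertions above; the particular values of x₀, h₁, h₂, f₀, Δ₁, Δ₂ are arbitrary choices satisfying h₁, h₂ > 0 and Δ₁ ≤ 0 ≤ Δ₂:

```python
# Numeric spot-check of the interpolant and minimizer formulas.
x0, h1, h2 = 0.3, 0.7, 1.1     # bracket geometry (h1, h2 > 0)
f0, d1, d2 = 2.0, -0.8, 0.6    # f₀, Δ₁ ≤ 0, Δ₂ ≥ 0

def p(x):
    return ((d2 - d1) / (h1 + h2) * (x - x0) ** 2
            - (h1 * d2 - h2 * d1 - 2 * h1 * d1) / (h1 + h2) * (x - x0) + f0)

x_star = x0 + (h1 * d2 - h2 * d1 - 2 * h1 * d1) / (2 * d2 - 2 * d1)

# p interpolates the three bracket points:
assert abs(p(x0) - f0) < 1e-12
assert abs(p(x0 + h1) - (f0 + h1 * d1)) < 1e-12
assert abs(p(x0 + h1 + h2) - (f0 + h1 * d1 + h2 * d2)) < 1e-12
# x* stays inside the bracket and is a minimizer of p:
assert x0 <= x_star <= x0 + h1 + h2
assert p(x_star - 1e-3) >= p(x_star) and p(x_star + 1e-3) >= p(x_star)
```

Unlike the Mathematica version, this checks only one numeric instance rather than proving the claims symbolically, but it is a quick sanity test when modifying the formulas.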