Downhill.AbstractOptBuffer
— Type
AbstractOptBuffer
Abstract type for buffers used to implement optimization routines (including stopping criteria, etc.).
Downhill.BFGS
— Type
BFGS
Quasi-Newton descent method.
Downhill.BasicConvergenceStats
— Type
BasicConvergenceStats
Wrapper type providing basic stop conditions: the magnitude of the gradient is less than a specified value, the objective function call count exceeds a threshold, or the iteration count exceeds a threshold.
Downhill.CGDescent
— Type
CGDescent
Conjugate gradient method (Hager–Zhang version [W. Hager, H. Zhang, SIAM J. Optim. 16 (2006), pp. 170–192]).
Downhill.CholBFGS
— Type
CholBFGS
Quasi-Newton descent method.
Downhill.ConstrainStepSize
— Type
ConstrainStepSize
Wrapper type to limit the step sizes attempted during optimization, given a function (origin, direction) -> max_step.
Downhill.FixedRateDescent
— Type
FixedRateDescent
Descent method which minimizes the objective function in the direction of the antigradient at each step.
Downhill.HyperGradDescent
— Type
HyperGradDescent
Descent method which minimizes the objective function in the direction of the antigradient at each step.
Downhill.MomentumDescent
— Type
MomentumDescent
Descent method which minimizes the objective function in the direction of the antigradient at each step.
Downhill.NesterovMomentum
— Type
NesterovMomentum
Descent method which minimizes the objective function in the direction of the antigradient at each step.
Downhill.OptBuffer
— Type
OptBuffer
Abstract type for structs designed to store the data required for the core optimization logic.
Downhill.SteepestDescent
— Type
SteepestDescent
Descent method which minimizes the objective function in the direction of the antigradient at each step.
Downhill.Wrapper
— Type
Wrapper
Abstract type for wrappers around core buffers, meant for auxiliary purposes (stopping conditions, logging, etc.).
Downhill.argumentvec
— Method
argumentvec(M::AbstractOptBuffer)
Return the argument vector at the end of the optimization step.
Downhill.base_method
— Method
base_method(M::AbstractOptBuffer)
For a Wrapper object, return the wrapped buffer. For an OptBuffer object, return the object itself.
Downhill.callfn!
— Method
callfn!(fdf, M::AbstractOptBuffer, x, α, d)
Return the value of fdf(x + α * d), overwriting the internal buffers of M as needed. fdf(x, g) must return a tuple (f(x), ∇f(x)) and, if g is mutable, overwrite it with the gradient (see optimize!).
Downhill.conv_success
— Method
conv_success(M::AbstractOptBuffer)
Decide from the state of M whether the optimization has converged. This is useful to distinguish stopping by convergence from stopping by iteration count when convergence is achieved on the last allowed iteration.
Downhill.convstat
— Method
convstat(M::Wrapper)
For a converged state, return the statistics as a NamedTuple (converged = true/false, argument, gradient, iterations, calls). Negative values of calls or iterations mean that the number has not been tracked.
Downhill.fnval
— Method
fnval(M::AbstractOptBuffer)
Return the objective function value at the end of the optimization step.
Downhill.fnval_origin
— Method
fnval_origin(M::AbstractOptBuffer)
Return the objective function value at the start of the optimization step.
Downhill.gradientvec
— Method
gradientvec(M::AbstractOptBuffer)
Return the gradient vector at the end of the optimization step.
Downhill.init!
— Method
optfn! must be a 3-arg closure that computes fdf(x + α*d) and overwrites the gradient stored in M.
Downhill.mcholesky!
— Method
mcholesky!(A::AbstractMatrix)
Perform an in-place modified Cholesky decomposition of the matrix A (Gill, Murray, Wright, Practical Optimization (1981), p. 111).
Downhill.optimize!
— Method
optimize!(
fdf, M::OptBuffer, x₀;
gtol=1e-6,
convcond=nothing,
maxiter=100,
maxcalls=nothing,
reset=true,
constrain_step=nothing,
tracking=nothing,
verbosity=0
)
Find an optimizer for fdf, starting from the initial approximation x₀.
Arguments
- M::OptBuffer: the core method to use for optimization
- fdf(x, g)::Function: the function to optimize. It must return a tuple (f(x), ∇f(x)) and, if g is mutable, overwrite it with the gradient.
- x₀: the initial approximation
Keywords
Convergence criteria
There are two ways to specify the convergence criterion: the default gtol threshold, or a custom stop condition convcond.
- gtol::Real: (default stop criterion) stop the optimization when the 2-norm of the gradient is less than gtol
- convcond=(x, xpre, y, ypre, g)->Bool: a custom stop criterion based on the argument values, function values, and gradient. If nothing (default), the gtol criterion is used; when specified, the gtol criterion is ignored.
Example (default criterion): convcond = (x, xpre, y, ypre, g) -> norm(g, 2) ≤ gtol.
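As a sketch (assuming LinearAlgebra is loaded for norm; the thresholds are illustrative), a custom convcond might combine a relative function-value test with the gradient-norm test:

```julia
using LinearAlgebra

# Hypothetical custom stop criterion: stop when the objective changes
# by a small relative amount between iterations, or the gradient is small.
rtol, gtol = 1e-10, 1e-6
convcond = (x, xpre, y, ypre, g) ->
    abs(y - ypre) <= rtol * (abs(y) + rtol) || norm(g, 2) <= gtol
```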
Limiting optimization
Limit (or unlimit) the optimization process by specifying maxiter and/or maxcalls.
- maxiter::Integer: force-stop the optimization after this number of iterations (use nothing or a negative value to leave the iteration count unconstrained)
- maxcalls::Integer: force-stop the optimization after this number of function calls (use nothing or a negative value to leave the call count unconstrained)
Optimization constraints
Inequality constraints on the optimization are handled by constrain_step.
- constrain_step(x0, d): a function to constrain a step from x0 in the direction d. It must return a real number α such that x0 + αd is the maximum allowed step.
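For illustration (a hypothetical sketch, not part of the package), a constrain_step enforcing elementwise upper bounds ub could look like:

```julia
# Hypothetical step constraint for elementwise upper bounds `ub`:
# return the largest α such that x0 + α*d stays within the bounds.
ub = [1.0, 2.0]
constrain_step = function (x0, d)
    α = Inf
    for i in eachindex(d)
        if d[i] > 0  # only increasing components can hit an upper bound
            α = min(α, (ub[i] - x0[i]) / d[i])
        end
    end
    return α
end
```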
Initializing
- reset=true: a value to pass as a keyword argument to the optimizer's init! method
Optimization path
- tracking::Union{Nothing,IO,AbstractString}: an IO stream or a file name to log the optimization process, or nothing to disable logging (default: nothing)
- verbosity::Integer: verbosity of logging. 0 (default) disables tracking, 1 logs all points of objective function evaluation with the corresponding values and gradients, and 2 shows additional statistics regarding the line search. Ignored if tracking == nothing.
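A minimal sketch of the fdf(x, g) contract with a simple quadratic objective (the commented optimize! call below is illustrative and assumes the package is loaded; the method buffer M would be constructed per the chosen method's documentation):

```julia
# A minimal sketch of the fdf(x, g) contract: return (f(x), ∇f(x)) and,
# for a mutable g, write the gradient into g in place.
function fdf(x, g)
    f = sum(abs2, x) / 2      # f(x) = ½‖x‖²
    if g !== nothing
        g .= x                # ∇f(x) = x
    end
    return f, g
end

# Assuming Downhill is loaded and M is a method buffer, a call might
# then look like:
# Downhill.optimize!(fdf, M, [1.0, -2.0]; gtol=1e-8, maxiter=200)
```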
Downhill.optimize!
— Method
optimize!(fdf, M::Wrapper, x0)
Find an optimizer for fdf, starting from the initial approximation x0. fdf(x, g) must return a tuple (f(x), ∇f(x)) and, if g is mutable, overwrite it with the gradient.
Downhill.optimize
— Method
optimize(
fdf, x₀;
method,
kw...
)
Find an optimizer for fdf, starting from the initial approximation x₀. The method keyword chooses a specific optimization method. See optimize! for the description of the other keywords.
Downhill.reset!
— Method
reset!(M::AbstractOptBuffer, args...; kwargs...)
Reset the solver parameters to the defaults (or to specific values; see the documentation for the specific types). Each method has to implement a parameter-free reset!(M) method.
Downhill.solver
— Method
Downhill.solver(
M::OptBuffer;
gtol = 1e-6,
maxiter = 100,
maxcalls = nothing,
constrain_step=nothing,
)
Return the wrapper object for a chosen method to solve an optimization problem with the given parameters. For the description of the keywords, see optimize!.
Downhill.solver
— Method
Downhill.solver(
M::DataType, x;
gtol=1e-6, maxiter = 100, maxcalls = nothing, constrain_step)
Return the wrapper object for a chosen method to solve an optimization problem with the given parameters, compatible with the dimensions of x.
Downhill.sqmatr
— Method
sqmatr(vec::AbstractVector, [element_type = eltype(vec)])
Create an uninitialized mutable N×N matrix with the given element type, given a vector of length N.
Downhill.step!
— Method
step!(fdf, M::AbstractOptBuffer; kw...)
Perform one iteration of the optimization routine from the current state of M, using fdf as the objective function and modifying the internal buffers as needed.
Downhill.step_origin
— Method
step_origin(M::AbstractOptBuffer)
Return the argument vector at the start of the optimization step.
Downhill.stopcond
— Method
stopcond(M::AbstractOptBuffer)
Decide from the state of M whether the optimization should be stopped.
Downhill.strong_backtracking!
— Method
Downhill.strong_backtracking!(fdf, x₀, d, [y₀, g₀]; [α, αmax, β, σ])
Find α such that f(x₀ + αd) ≤ f(x₀) + βα d⋅∇f(x₀) and |d⋅∇f(x₀ + αd)| ≤ σ|d⋅∇f(x₀)| (the strong Wolfe conditions) using a backtracking line search with cubic interpolation and the Hager–Zhang approximation when steps are very small. Optionally, y₀ = f(x₀) and g₀ = ∇f(x₀) may be provided to avoid recalculation.
Arguments
- fdf: a function fdf(x, α, d) returning the tuple f(x + α * d), ∇f(x + α * d), where f is the minimized function
- x₀::AbstractVector: the initial point
- d::AbstractVector: the search direction. Must be a descent direction, i.e. d⋅∇f(x₀) < 0
- y₀: (optional) the value of f(x₀) if it is known beforehand
- g₀: (optional) the value of ∇f(x₀) if it is known beforehand
Keywords
- α=1: the initial value of α
- αmax=Inf: the maximum allowed value of α
- β=1e-4: a coefficient in the Wolfe conditions
- σ=0.5: a coefficient in the Wolfe conditions
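The strong Wolfe conditions above can be sketched as a standalone check (a hypothetical helper, not part of the package; β and σ as in the keywords):

```julia
using LinearAlgebra

# Check the strong Wolfe conditions for a candidate step α:
# sufficient decrease plus the curvature condition.
function strong_wolfe(f, ∇f, x₀, d, α; β=1e-4, σ=0.5)
    x = x₀ .+ α .* d
    decrease = f(x) <= f(x₀) + β * α * dot(d, ∇f(x₀))
    curvature = abs(dot(d, ∇f(x))) <= σ * abs(dot(d, ∇f(x₀)))
    return decrease && curvature
end
```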
Downhill.γξ
— Method
Return the maximum absolute values of the diagonal and off-diagonal elements of A.