NonlinearSolve.GaussNewton — Type

```julia
GaussNewton(; concrete_jac = nothing, linsolve = nothing, precs = DEFAULT_PRECS,
    adkwargs...)
```
An advanced GaussNewton implementation with support for efficient handling of sparse matrices via colored automatic differentiation and preconditioned linear solvers. Designed for large-scale and numerically-difficult nonlinear least squares problems.
In most practical situations, users should prefer LevenbergMarquardt instead, as it is a more general extension of the Gauss-Newton method.
Keyword Arguments
- `autodiff`: determines the backend used for the Jacobian. Note that this argument is ignored if an analytical Jacobian is passed, as that will be used instead. Defaults to `AutoForwardDiff()`. Valid choices are types from ADTypes.jl.
- `concrete_jac`: whether to build a concrete Jacobian. If a Krylov-subspace method is used, then the Jacobian will not be constructed; instead, direct Jacobian-vector products `J*v` are computed using forward-mode automatic differentiation or finite differencing tricks (without ever constructing the Jacobian). However, if the Jacobian is still needed, for example for a preconditioner, `concrete_jac = true` can be passed in order to force the construction of the Jacobian.
- `linsolve`: the LinearSolve.jl algorithm used for the linear solves within the Newton method. Defaults to `nothing`, which means it uses the LinearSolve.jl default algorithm choice. For more information on available algorithm choices, see the LinearSolve.jl documentation.
- `precs`: the choice of preconditioners for the linear solver. Defaults to using no preconditioners. For more information on specifying preconditioners for LinearSolve algorithms, consult the LinearSolve.jl documentation.
The Jacobian-free version of GaussNewton doesn't work yet and forces Jacobian construction. This will be fixed in the near future.
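As a usage sketch, GaussNewton can be applied to a nonlinear least-squares problem (the residual function and parameter values here are purely illustrative):

```julia
using NonlinearSolve

# Illustrative overdetermined residual: 3 equations, 1 unknown, so the system
# generally has no exact root and is solved in the least-squares sense.
f(u, p) = [u[1]^2 - p[1], u[1] - p[2], u[1]^3 - p[1] * p[2]]

prob = NonlinearLeastSquaresProblem(f, [1.0], [2.0, 1.4])
sol = solve(prob, GaussNewton())
```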
NonlinearSolve.LevenbergMarquardt — Type

```julia
LevenbergMarquardt(; concrete_jac = nothing, linsolve = nothing,
    precs = DEFAULT_PRECS, damping_initial::Real = 1.0,
    damping_increase_factor::Real = 2.0, damping_decrease_factor::Real = 3.0,
    finite_diff_step_geodesic::Real = 0.1, α_geodesic::Real = 0.75,
    b_uphill::Real = 1.0, min_damping_D::AbstractFloat = 1e-8, adkwargs...)
```
An advanced Levenberg-Marquardt implementation with the improvements suggested in the paper "Improvements to the Levenberg-Marquardt algorithm for nonlinear least-squares minimization". Designed for large-scale and numerically-difficult nonlinear systems.
Keyword Arguments
- `autodiff`: determines the backend used for the Jacobian. Note that this argument is ignored if an analytical Jacobian is passed, as that will be used instead. Defaults to `AutoForwardDiff()`. Valid choices are types from ADTypes.jl.
- `concrete_jac`: whether to build a concrete Jacobian. If a Krylov-subspace method is used, then the Jacobian will not be constructed; instead, direct Jacobian-vector products `J*v` are computed using forward-mode automatic differentiation or finite differencing tricks (without ever constructing the Jacobian). However, if the Jacobian is still needed, for example for a preconditioner, `concrete_jac = true` can be passed in order to force the construction of the Jacobian.
- `linsolve`: the LinearSolve.jl algorithm used for the linear solves within the Newton method. Defaults to `nothing`, which means it uses the LinearSolve.jl default algorithm choice. For more information on available algorithm choices, see the LinearSolve.jl documentation.
- `precs`: the choice of preconditioners for the linear solver. Defaults to using no preconditioners. For more information on specifying preconditioners for LinearSolve algorithms, consult the LinearSolve.jl documentation.
- `damping_initial`: the starting value for the damping factor. The damping factor is inversely proportional to the step size and is adjusted during each iteration. Defaults to `1.0`. For more details, see section 2.1 of this paper.
- `damping_increase_factor`: the factor by which the damping is increased if a step is rejected. Defaults to `2.0`. For more details, see section 2.1 of this paper.
- `damping_decrease_factor`: the factor by which the damping is decreased if a step is accepted. Defaults to `3.0`. For more details, see section 2.1 of this paper.
- `finite_diff_step_geodesic`: the step size used for the finite differencing that calculates the geodesic acceleration. Defaults to `0.1`, which means that the step size is approximately 10% of the first-order step. For more details, see section 3 of this paper.
- `α_geodesic`: a factor that determines if a step is accepted or rejected. To incorporate geodesic acceleration into the Levenberg-Marquardt algorithm, acceptable steps must meet the condition $\frac{2||a||}{||v||} \le \alpha_{\text{geodesic}}$, where $a$ is the geodesic acceleration, $v$ is the Levenberg-Marquardt algorithm's step (velocity along a geodesic path), and `α_geodesic` is some number of order `1`. For most problems `α_geodesic = 0.75` is a good value, but for problems where convergence is difficult `α_geodesic = 0.1` is an effective choice. Defaults to `0.75`. For more details, see section 3, equation (15) of this paper.
- `b_uphill`: a factor that determines if a step is accepted or rejected. The standard choice in the Levenberg-Marquardt method is to accept all steps that decrease the cost and reject all steps that increase the cost. Although this is a natural and safe choice, it is often not the most efficient. Therefore downhill moves are always accepted, but uphill moves are only conditionally accepted. To decide whether an uphill move is accepted at each iteration $i$, we compute $\beta_i = \cos(v_{\text{new}}, v_{\text{old}})$, the cosine of the angle between the proposed velocity $v_{\text{new}}$ and the velocity of the last accepted step $v_{\text{old}}$. The idea is to accept uphill moves if the angle is small. Specifically, uphill moves are accepted if $(1-\beta_i)^{b_{\text{uphill}}} C_{i+1} \le C_i$, where $C_i$ is the cost at iteration $i$. Reasonable choices for `b_uphill` are `1.0` or `2.0`, with `b_uphill = 2.0` allowing higher uphill moves than `b_uphill = 1.0`. When `b_uphill = 0.0`, no uphill moves will be accepted. Defaults to `1.0`. For more details, see section 4 of this paper.
- `min_damping_D`: the minimum value of the damping terms in the diagonal damping matrix `DᵀD`, where `DᵀD` is given by the largest diagonal entries of `JᵀJ` yet encountered, with `J` the Jacobian. This paper suggests enforcing a minimum value for the elements in `DᵀD` to prevent the damping from becoming too small. Defaults to `1e-8`.
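A minimal usage sketch (the problem and the keyword values overriding the documented defaults are purely illustrative):

```julia
using NonlinearSolve

f(u, p) = u .* u .- p  # roots at u = ±sqrt(p)

prob = NonlinearProblem(f, [1.0, 1.0], 2.0)
# Start with lighter damping and allow more aggressive uphill moves.
sol = solve(prob, LevenbergMarquardt(; damping_initial = 0.5, b_uphill = 2.0))
```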
NonlinearSolve.LineSearch — Type

```julia
LineSearch(method = Static(), autodiff = AutoFiniteDiff(), alpha = true)
```
Wrapper over algorithms from LineSearches.jl. Allows automatic construction of the objective functions for the line search algorithms, utilizing automatic differentiation for fast vector-Jacobian products.
Arguments
- `method`: the line search algorithm to use. Defaults to `Static()`, which means that the step size is fixed to the value of `alpha`.
- `autodiff`: the automatic differentiation backend to use for the line search. Defaults to `AutoFiniteDiff()`, which means that finite differencing is used to compute the VJP. `AutoZygote()` will be faster in most cases, but it requires `Zygote.jl` to be manually installed and loaded.
- `alpha`: the initial step size to use. Defaults to `true` (which is equivalent to `1`).
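For instance, a LineSearches.jl algorithm can be wrapped in a `LineSearch` and passed to a solver that accepts a `linesearch` keyword (the choice of `BackTracking` here is illustrative):

```julia
using NonlinearSolve, LineSearches

# Wrap LineSearches.BackTracking; the default Static() performs no search.
ls = LineSearch(method = BackTracking())
alg = NewtonRaphson(; linesearch = ls)

f(u, p) = u .* u .- p
prob = NonlinearProblem(f, [1.0], 2.0)
sol = solve(prob, alg)
```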
NonlinearSolve.NewtonRaphson — Type

```julia
NewtonRaphson(; concrete_jac = nothing, linsolve = nothing,
    precs = DEFAULT_PRECS, adkwargs...)
```
An advanced NewtonRaphson implementation with support for efficient handling of sparse matrices via colored automatic differentiation and preconditioned linear solvers. Designed for large-scale and numerically-difficult nonlinear systems.
Keyword Arguments
- `autodiff`: determines the backend used for the Jacobian. Note that this argument is ignored if an analytical Jacobian is passed, as that will be used instead. Defaults to `AutoForwardDiff()`. Valid choices are types from ADTypes.jl.
- `concrete_jac`: whether to build a concrete Jacobian. If a Krylov-subspace method is used, then the Jacobian will not be constructed; instead, direct Jacobian-vector products `J*v` are computed using forward-mode automatic differentiation or finite differencing tricks (without ever constructing the Jacobian). However, if the Jacobian is still needed, for example for a preconditioner, `concrete_jac = true` can be passed in order to force the construction of the Jacobian.
- `linsolve`: the LinearSolve.jl algorithm used for the linear solves within the Newton method. Defaults to `nothing`, which means it uses the LinearSolve.jl default algorithm choice. For more information on available algorithm choices, see the LinearSolve.jl documentation.
- `precs`: the choice of preconditioners for the linear solver. Defaults to using no preconditioners. For more information on specifying preconditioners for LinearSolve algorithms, consult the LinearSolve.jl documentation.
- `linesearch`: the line search algorithm to use. Defaults to `LineSearch()`, which means that no line search is performed. Algorithms from `LineSearches.jl` can be used here directly, and they will be converted to the correct `LineSearch`.
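A brief usage sketch (the problem is illustrative; `KrylovJL_GMRES` comes from LinearSolve.jl and is shown only as one possible `linsolve` choice):

```julia
using NonlinearSolve, LinearSolve

f(u, p) = u .* u .- p
prob = NonlinearProblem(f, [1.0, 1.0], 2.0)

# Default linear solver:
sol = solve(prob, NewtonRaphson())

# Krylov-subspace linear solver; the Jacobian is then used only via J*v
# products unless concrete_jac = true is passed.
sol = solve(prob, NewtonRaphson(; linsolve = KrylovJL_GMRES()))
```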
NonlinearSolve.TrustRegion — Type

```julia
TrustRegion(; concrete_jac = nothing, linsolve = nothing, precs = DEFAULT_PRECS,
    radius_update_scheme::RadiusUpdateSchemes.T = RadiusUpdateSchemes.Simple,
    max_trust_radius::Real = 0 // 1, initial_trust_radius::Real = 0 // 1,
    step_threshold::Real = 1 // 10, shrink_threshold::Real = 1 // 4,
    expand_threshold::Real = 3 // 4, shrink_factor::Real = 1 // 4,
    expand_factor::Real = 2 // 1, max_shrink_times::Int = 32, adkwargs...)
```
An advanced TrustRegion implementation with support for efficient handling of sparse matrices via colored automatic differentiation and preconditioned linear solvers. Designed for large-scale and numerically-difficult nonlinear systems.
Keyword Arguments
- `autodiff`: determines the backend used for the Jacobian. Note that this argument is ignored if an analytical Jacobian is passed, as that will be used instead. Defaults to `AutoForwardDiff()`. Valid choices are types from ADTypes.jl.
- `concrete_jac`: whether to build a concrete Jacobian. If a Krylov-subspace method is used, then the Jacobian will not be constructed; instead, direct Jacobian-vector products `J*v` are computed using forward-mode automatic differentiation or finite differencing tricks (without ever constructing the Jacobian). However, if the Jacobian is still needed, for example for a preconditioner, `concrete_jac = true` can be passed in order to force the construction of the Jacobian.
- `linsolve`: the LinearSolve.jl algorithm used for the linear solves within the Newton method. Defaults to `nothing`, which means it uses the LinearSolve.jl default algorithm choice. For more information on available algorithm choices, see the LinearSolve.jl documentation.
- `precs`: the choice of preconditioners for the linear solver. Defaults to using no preconditioners. For more information on specifying preconditioners for LinearSolve algorithms, consult the LinearSolve.jl documentation.
- `radius_update_scheme`: the choice of radius update scheme to be used. Defaults to `RadiusUpdateSchemes.Simple`, which follows the conventional approach. Other available schemes are `RadiusUpdateSchemes.Hei`, `RadiusUpdateSchemes.Yuan`, `RadiusUpdateSchemes.Bastin`, and `RadiusUpdateSchemes.Fan`. These schemes let the trust region radius converge to zero, which is seen to improve convergence. For more details, see Yuan, Y.
- `max_trust_radius`: the maximal trust region radius. Defaults to `max(norm(fu), maximum(u) - minimum(u))`.
- `initial_trust_radius`: the initial trust region radius. Defaults to `max_trust_radius / 11`.
- `step_threshold`: the threshold for taking a step. In every iteration, the threshold is compared with a value `r`, which is the actual reduction in the objective function divided by the predicted reduction. If `step_threshold > r`, the model is not a good approximation, and the step is rejected. Defaults to `0.1`. For more details, see Rahpeymaii, F.
- `shrink_threshold`: the threshold for shrinking the trust region radius. In every iteration, the threshold is compared with the value `r`, the actual reduction in the objective function divided by the predicted reduction. If `shrink_threshold > r`, the trust region radius is shrunk by `shrink_factor`. Defaults to `0.25`. For more details, see Rahpeymaii, F.
- `expand_threshold`: the threshold for expanding the trust region radius. If a step is taken, i.e. `step_threshold < r` (with `r` defined in `shrink_threshold`), a check is also made to see if `expand_threshold < r`. If that is true, the trust region radius is expanded by `expand_factor`. Defaults to `0.75`.
- `shrink_factor`: the factor by which to shrink the trust region radius if `shrink_threshold > r` (with `r` defined in `shrink_threshold`). Defaults to `0.25`.
- `expand_factor`: the factor by which to expand the trust region radius if `expand_threshold < r` (with `r` defined in `shrink_threshold`). Defaults to `2.0`.
- `max_shrink_times`: the maximum number of times to shrink the trust region radius in a row; if `max_shrink_times` is exceeded, the algorithm returns. Defaults to `32`.
`linsolve` and `precs` are used exclusively for the in-place version of the algorithm. Support for the out-of-place (OOP) version is planned!
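A usage sketch (the problem and the keyword values shown are illustrative):

```julia
using NonlinearSolve

f(u, p) = u .* u .- p
prob = NonlinearProblem(f, [1.0], 2.0)

# Non-default radius update scheme and a tighter shrink budget.
alg = TrustRegion(; radius_update_scheme = RadiusUpdateSchemes.Hei,
    max_shrink_times = 16)
sol = solve(prob, alg)
```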
NonlinearSolve.default_adargs_to_adtype — Method

```julia
default_adargs_to_adtype(; chunk_size = Val{0}(), autodiff = Val{true}(),
    standardtag = Val{true}(), diff_type = Val{:forward})
```
Construct the AD type from the arguments. This is mostly needed for compatibility with older code.
NonlinearSolve.value_derivative — Method

```julia
value_derivative(f, x)
```

Compute `f(x)` and `d/dx f(x)` in the most efficient way.
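A plausible sketch of how both values can be obtained in a single forward pass using ForwardDiff.jl dual numbers (an illustration of the idea, not necessarily the package's exact implementation; `value_derivative_sketch` is a hypothetical name):

```julia
using ForwardDiff

# Evaluate f once on a dual number carrying a unit perturbation; the primal
# part of the result is f(x) and the partial is f'(x).
function value_derivative_sketch(f::F, x::Real) where {F}
    T = typeof(ForwardDiff.Tag(f, typeof(x)))
    out = f(ForwardDiff.Dual{T}(x, one(x)))
    return ForwardDiff.value(out), ForwardDiff.extract_derivative(T, out)
end

# value_derivative_sketch(sin, 0.0) returns (0.0, 1.0)
```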
NonlinearSolve.RadiusUpdateSchemes — Module

`RadiusUpdateSchemes`

`RadiusUpdateSchemes` is the standard enum interface for the different types of radius update schemes implemented in the trust region method. These schemes specify how the radius of the so-called trust region is updated after each iteration of the algorithm. The specific role and caveats associated with each scheme are provided below.
Using RadiusUpdateSchemes

`RadiusUpdateSchemes` uses the standard EnumX interface (https://github.com/fredrikekre/EnumX.jl), and hence inherits all the properties of an EnumX, including each constituent enum state having the type `RadiusUpdateSchemes.T`. Simply pass the desired scheme as follows: `TrustRegion(radius_update_scheme = your_desired_update_scheme)`. For example, `sol = solve(prob, TrustRegion(radius_update_scheme = RadiusUpdateSchemes.Hei))`.
NonlinearSolve.RadiusUpdateSchemes.Bastin — Constant

`RadiusUpdateSchemes.Bastin`

This scheme was proposed by Bastin et al. It is called a retrospective update scheme, as it uses the model function at the current iteration to compute the ratio of the actual reduction to the predicted reduction in the previous trial step, and uses this ratio to update the trust region radius. The hypothesis is to exploit the information made available during the optimization process in order to vary the accuracy of the objective function computation.
NonlinearSolve.RadiusUpdateSchemes.Fan — Constant

`RadiusUpdateSchemes.Fan`

This scheme was proposed by Fan, J. It is very similar to Hei's and Yuan's schemes, as it lets the trust region radius depend on the current size (norm) of the objective (merit) function itself. These newer update schemes are known to improve local convergence.
NonlinearSolve.RadiusUpdateSchemes.Hei — Constant

`RadiusUpdateSchemes.Hei`

This scheme was proposed by Hei, L. The trust region radius depends on the size (norm) of the current step. The hypothesis is to let the radius converge to zero as the iterations progress, which is more reliable and robust for ill-conditioned as well as degenerate problems.
NonlinearSolve.RadiusUpdateSchemes.NLsolve — Constant

`RadiusUpdateSchemes.NLsolve`

The same updating scheme as in NLsolve's (https://github.com/JuliaNLSolvers/NLsolve.jl) trust region dogleg implementation.
NonlinearSolve.RadiusUpdateSchemes.NocedalWright — Constant

`RadiusUpdateSchemes.NocedalWright`

Trust region updating scheme as in Nocedal and Wright [see Alg 11.5, page 291].
NonlinearSolve.RadiusUpdateSchemes.Simple — Constant

`RadiusUpdateSchemes.Simple`

The simple or conventional radius update scheme. This scheme is chosen by default and follows the conventional approach to updating the trust region radius: if the trial step is accepted, it increases the radius by a fixed factor (bounded by a maximum radius), and if the trial step is rejected, it shrinks the radius by a fixed factor.
NonlinearSolve.RadiusUpdateSchemes.Yuan — Constant

`RadiusUpdateSchemes.Yuan`

This scheme was proposed by Yuan, Y. Similar to Hei's scheme, the trust region is updated so that it converges to zero; however, here the radius depends on the size (norm) of the current gradient of the objective (merit) function. The hypothesis is that the step size is bounded by the gradient size, so it makes sense to let the radius depend on the gradient.