NonlinearSolve.GaussNewton (Type)
GaussNewton(; concrete_jac = nothing, linsolve = nothing, precs = DEFAULT_PRECS,
    adkwargs...)

An advanced GaussNewton implementation with support for efficient handling of sparse matrices via colored automatic differentiation and preconditioned linear solvers. Designed for large-scale and numerically-difficult nonlinear least squares problems.

Note

In most practical situations, users should prefer LevenbergMarquardt instead! It is a more general extension of the Gauss-Newton method.

Keyword Arguments

  • autodiff: determines the backend used for the Jacobian. Note that this argument is ignored if an analytical Jacobian is passed, as that will be used instead. Defaults to AutoForwardDiff(). Valid choices are types from ADTypes.jl.
  • concrete_jac: whether to build a concrete Jacobian. If a Krylov-subspace method is used, then the Jacobian will not be constructed and instead direct Jacobian-vector products J*v are computed using forward-mode automatic differentiation or finite differencing tricks (without ever constructing the Jacobian). However, if the Jacobian is still needed, for example for a preconditioner, concrete_jac = true can be passed in order to force the construction of the Jacobian.
  • linsolve: the LinearSolve.jl algorithm used for the linear solves within the Newton method. Defaults to nothing, which means it uses the LinearSolve.jl default algorithm choice. For more information on available algorithm choices, see the LinearSolve.jl documentation.
  • precs: the choice of preconditioners for the linear solver. Defaults to using no preconditioners. For more information on specifying preconditioners for LinearSolve algorithms, consult the LinearSolve.jl documentation.
Warning

The Jacobian-free version of GaussNewton does not work yet; it forces Jacobian construction. This will be fixed in the near future.
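
As a quick usage illustration, here is a minimal sketch of solving an overdetermined system with GaussNewton; the residual function and data are illustrative assumptions, not part of the library:

```julia
using NonlinearSolve

# Out-of-place residual for an overdetermined system: 3 residuals, 2 unknowns.
loss(u, p) = [u[1] - p[1], u[2] - p[2], u[1] * u[2] - p[3]]

u0 = [1.0, 1.0]
prob = NonlinearLeastSquaresProblem(loss, u0, [2.0, 3.0, 6.0])
sol = solve(prob, GaussNewton())
```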

NonlinearSolve.LevenbergMarquardt (Type)
LevenbergMarquardt(; concrete_jac = nothing, linsolve = nothing,
    precs = DEFAULT_PRECS, damping_initial::Real = 1.0,
    damping_increase_factor::Real = 2.0, damping_decrease_factor::Real = 3.0,
    finite_diff_step_geodesic::Real = 0.1, α_geodesic::Real = 0.75,
    b_uphill::Real = 1.0, min_damping_D::AbstractFloat = 1e-8, adkwargs...)

An advanced Levenberg-Marquardt implementation with the improvements suggested in the paper "Improvements to the Levenberg-Marquardt algorithm for nonlinear least-squares minimization". Designed for large-scale and numerically-difficult nonlinear systems.

Keyword Arguments

  • autodiff: determines the backend used for the Jacobian. Note that this argument is ignored if an analytical Jacobian is passed, as that will be used instead. Defaults to AutoForwardDiff(). Valid choices are types from ADTypes.jl.
  • concrete_jac: whether to build a concrete Jacobian. If a Krylov-subspace method is used, then the Jacobian will not be constructed and instead direct Jacobian-vector products J*v are computed using forward-mode automatic differentiation or finite differencing tricks (without ever constructing the Jacobian). However, if the Jacobian is still needed, for example for a preconditioner, concrete_jac = true can be passed in order to force the construction of the Jacobian.
  • linsolve: the LinearSolve.jl algorithm used for the linear solves within the Newton method. Defaults to nothing, which means it uses the LinearSolve.jl default algorithm choice. For more information on available algorithm choices, see the LinearSolve.jl documentation.
  • precs: the choice of preconditioners for the linear solver. Defaults to using no preconditioners. For more information on specifying preconditioners for LinearSolve algorithms, consult the LinearSolve.jl documentation.
  • damping_initial: the starting value for the damping factor. The damping factor is inversely proportional to the step size. The damping factor is adjusted during each iteration. Defaults to 1.0. For more details, see section 2.1 of this paper.
  • damping_increase_factor: the factor by which the damping is increased if a step is rejected. Defaults to 2.0. For more details, see section 2.1 of this paper.
  • damping_decrease_factor: the factor by which the damping is decreased if a step is accepted. Defaults to 3.0. For more details, see section 2.1 of this paper.
  • finite_diff_step_geodesic: the step size used for finite differencing used to calculate the geodesic acceleration. Defaults to 0.1 which means that the step size is approximately 10% of the first-order step. For more details, see section 3 of this paper.
  • α_geodesic: a factor that determines if a step is accepted or rejected. To incorporate geodesic acceleration as an addition to the Levenberg-Marquardt algorithm, it is necessary that acceptable steps meet the condition $\frac{2||a||}{||v||} \le \alpha_{\text{geodesic}}$, where $a$ is the geodesic acceleration, $v$ is the Levenberg-Marquardt algorithm's step (velocity along a geodesic path) and α_geodesic is some number of order 1. For most problems α_geodesic = 0.75 is a good value but for problems where convergence is difficult α_geodesic = 0.1 is an effective choice. Defaults to 0.75. For more details, see section 3, equation (15) of this paper.
  • b_uphill: a factor that determines if a step is accepted or rejected. The standard choice in the Levenberg-Marquardt method is to accept all steps that decrease the cost and reject all steps that increase the cost. Although this is a natural and safe choice, it is often not the most efficient. Therefore downhill moves are always accepted, but uphill moves are only conditionally accepted. To decide whether an uphill move will be accepted at each iteration $i$, we compute $\beta_i = \cos(v_{\text{new}}, v_{\text{old}})$, which denotes the cosine angle between the proposed velocity $v_{\text{new}}$ and the velocity of the last accepted step $v_{\text{old}}$. The idea is to accept uphill moves if the angle is small. To specify, uphill moves are accepted if $(1-\beta_i)^{b_{\text{uphill}}} C_{i+1} \le C_i$, where $C_i$ is the cost at iteration $i$. Reasonable choices for b_uphill are 1.0 or 2.0, with b_uphill=2.0 allowing higher uphill moves than b_uphill=1.0. When b_uphill=0.0, no uphill moves will be accepted. Defaults to 1.0. For more details, see section 4 of this paper.
  • min_damping_D: the minimum value of the damping terms in the diagonal damping matrix DᵀD, where DᵀD is given by the largest diagonal entries of JᵀJ yet encountered, where J is the Jacobian. It is suggested by this paper to use a minimum value of the elements in DᵀD to prevent the damping from being too small. Defaults to 1e-8.
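
As a sketch of how the damping and uphill-acceptance keywords are passed, consider an illustrative exponential fit; the residual function and data below are assumptions, not from the source:

```julia
using NonlinearSolve

# Residual of an illustrative exponential fit: u[1] * exp(u[2] * t) ≈ y.
function residual(u, p)
    t, y = p
    return u[1] .* exp.(u[2] .* t) .- y
end

t = collect(0.0:0.5:2.0)
y = 2.0 .* exp.(0.3 .* t)
prob = NonlinearLeastSquaresProblem(residual, [1.0, 0.0], (t, y))
sol = solve(prob, LevenbergMarquardt(; damping_initial = 1.0, b_uphill = 2.0))
```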
NonlinearSolve.LineSearch (Type)
LineSearch(method = Static(), autodiff = AutoFiniteDiff(), alpha = true)

Wrapper over algorithms from LineSearches.jl. Allows automatic construction of the objective functions for the line search algorithms, utilizing automatic differentiation for fast vector-Jacobian products (VJPs).

Arguments

  • method: the line search algorithm to use. Defaults to Static(), which means that the step size is fixed to the value of alpha.
  • autodiff: the automatic differentiation backend to use for the line search. Defaults to AutoFiniteDiff(), which means that finite differencing is used to compute the VJP. AutoZygote() will be faster in most cases, but it requires Zygote.jl to be manually installed and loaded.
  • alpha: the initial step size to use. Defaults to true (which is equivalent to 1).
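
For example, a minimal sketch wrapping a LineSearches.jl algorithm (BackTracking is just one choice from that package; the test problem is illustrative):

```julia
using NonlinearSolve, LineSearches

f(u, p) = u .* u .- p
prob = NonlinearProblem(f, [1.0, 1.0], 2.0)

# Wrap BackTracking from LineSearches.jl. Passing the LineSearches.jl
# algorithm directly to the solver should also work, as it is converted
# to a LineSearch internally.
alg = NewtonRaphson(; linesearch = LineSearch(; method = BackTracking()))
sol = solve(prob, alg)
```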
NonlinearSolve.NewtonRaphson (Type)
NewtonRaphson(; concrete_jac = nothing, linsolve = nothing,
    precs = DEFAULT_PRECS, adkwargs...)

An advanced NewtonRaphson implementation with support for efficient handling of sparse matrices via colored automatic differentiation and preconditioned linear solvers. Designed for large-scale and numerically-difficult nonlinear systems.

Keyword Arguments

  • autodiff: determines the backend used for the Jacobian. Note that this argument is ignored if an analytical Jacobian is passed, as that will be used instead. Defaults to AutoForwardDiff(). Valid choices are types from ADTypes.jl.
  • concrete_jac: whether to build a concrete Jacobian. If a Krylov-subspace method is used, then the Jacobian will not be constructed and instead direct Jacobian-vector products J*v are computed using forward-mode automatic differentiation or finite differencing tricks (without ever constructing the Jacobian). However, if the Jacobian is still needed, for example for a preconditioner, concrete_jac = true can be passed in order to force the construction of the Jacobian.
  • linsolve: the LinearSolve.jl algorithm used for the linear solves within the Newton method. Defaults to nothing, which means it uses the LinearSolve.jl default algorithm choice. For more information on available algorithm choices, see the LinearSolve.jl documentation.
  • precs: the choice of preconditioners for the linear solver. Defaults to using no preconditioners. For more information on specifying preconditioners for LinearSolve algorithms, consult the LinearSolve.jl documentation.
  • linesearch: the line search algorithm to use. Defaults to LineSearch(), which means that no line search is performed. Algorithms from LineSearches.jl can be used here directly, and they will be converted to the correct LineSearch.
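
A minimal usage sketch on a standard NonlinearProblem (the quadratic system is illustrative):

```julia
using NonlinearSolve

f(u, p) = u .* u .- p
prob = NonlinearProblem(f, [1.0, 1.0], 2.0)

# With no keywords, the AutoForwardDiff() Jacobian and the LinearSolve.jl
# default linear solver are used.
sol = solve(prob, NewtonRaphson())
```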
NonlinearSolve.TrustRegion (Type)
TrustRegion(; concrete_jac = nothing, linsolve = nothing, precs = DEFAULT_PRECS,
    radius_update_scheme::RadiusUpdateSchemes.T = RadiusUpdateSchemes.Simple,
    max_trust_radius::Real = 0 // 1, initial_trust_radius::Real = 0 // 1,
    step_threshold::Real = 1 // 10, shrink_threshold::Real = 1 // 4,
    expand_threshold::Real = 3 // 4, shrink_factor::Real = 1 // 4,
    expand_factor::Real = 2 // 1, max_shrink_times::Int = 32, adkwargs...)

An advanced TrustRegion implementation with support for efficient handling of sparse matrices via colored automatic differentiation and preconditioned linear solvers. Designed for large-scale and numerically-difficult nonlinear systems.

Keyword Arguments

  • autodiff: determines the backend used for the Jacobian. Note that this argument is ignored if an analytical Jacobian is passed, as that will be used instead. Defaults to AutoForwardDiff(). Valid choices are types from ADTypes.jl.
  • concrete_jac: whether to build a concrete Jacobian. If a Krylov-subspace method is used, then the Jacobian will not be constructed and instead direct Jacobian-vector products J*v are computed using forward-mode automatic differentiation or finite differencing tricks (without ever constructing the Jacobian). However, if the Jacobian is still needed, for example for a preconditioner, concrete_jac = true can be passed in order to force the construction of the Jacobian.
  • linsolve: the LinearSolve.jl algorithm used for the linear solves within the Newton method. Defaults to nothing, which means it uses the LinearSolve.jl default algorithm choice. For more information on available algorithm choices, see the LinearSolve.jl documentation.
  • precs: the choice of preconditioners for the linear solver. Defaults to using no preconditioners. For more information on specifying preconditioners for LinearSolve algorithms, consult the LinearSolve.jl documentation.
  • radius_update_scheme: the choice of radius update scheme to be used. Defaults to RadiusUpdateSchemes.Simple, which follows the conventional approach. Other available schemes are RadiusUpdateSchemes.Hei, RadiusUpdateSchemes.Yuan, RadiusUpdateSchemes.Bastin, and RadiusUpdateSchemes.Fan. These schemes let the trust region radius converge to zero, which is seen to improve convergence. For more details, see Yuan, Y.
  • max_trust_radius: the maximal trust region radius. Defaults to max(norm(fu), maximum(u) - minimum(u)).
  • initial_trust_radius: the initial trust region radius. Defaults to max_trust_radius / 11.
  • step_threshold: the threshold for taking a step. In every iteration, the threshold is compared with a value r, which is the actual reduction in the objective function divided by the predicted reduction. If step_threshold > r the model is not a good approximation, and the step is rejected. Defaults to 0.1. For more details, see Rahpeymaii, F.
  • shrink_threshold: the threshold for shrinking the trust region radius. In every iteration, the threshold is compared with a value r which is the actual reduction in the objective function divided by the predicted reduction. If shrink_threshold > r the trust region radius is shrunk by shrink_factor. Defaults to 0.25. For more details, see Rahpeymaii, F.
  • expand_threshold: the threshold for expanding the trust region radius. If a step is taken, i.e step_threshold < r (with r defined in shrink_threshold), a check is also made to see if expand_threshold < r. If that is true, the trust region radius is expanded by expand_factor. Defaults to 0.75.
  • shrink_factor: the factor to shrink the trust region radius with if shrink_threshold > r (with r defined in shrink_threshold). Defaults to 0.25.
  • expand_factor: the factor to expand the trust region radius with if expand_threshold < r (with r defined in shrink_threshold). Defaults to 2.0.
  • max_shrink_times: the maximum number of times to shrink the trust region radius in a row; if max_shrink_times is exceeded, the algorithm returns. Defaults to 32.
Warning

linsolve and precs are used exclusively for the in-place version of the algorithm. Support for the OOP version is planned!
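
A minimal sketch (the system and the keyword values are illustrative):

```julia
using NonlinearSolve

f(u, p) = u .* u .- p
prob = NonlinearProblem(f, [1.0, 1.0], 2.0)

# Select a non-default radius update scheme and cap the trust region radius.
alg = TrustRegion(; radius_update_scheme = RadiusUpdateSchemes.Fan,
    max_trust_radius = 10.0)
sol = solve(prob, alg)
```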

NonlinearSolve.default_adargs_to_adtype (Method)
default_adargs_to_adtype(; chunk_size = Val{0}(), autodiff = Val{true}(),
    standardtag = Val{true}(), diff_type = Val{:forward})

Construct the AD type from the arguments. This is mostly needed for compatibility with older code.
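
A hedged sketch of a direct call with the documented defaults; the concrete return type is an implementation detail, but with these defaults it should be a ForwardDiff-based ADTypes.jl backend:

```julia
using NonlinearSolve

# Map legacy-style AD keyword arguments onto an ADTypes.jl backend.
# The function is assumed to be unexported, hence the module qualification.
ad = NonlinearSolve.default_adargs_to_adtype(; chunk_size = Val{0}(),
    autodiff = Val{true}(), standardtag = Val{true}(), diff_type = Val{:forward})
```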

NonlinearSolve.RadiusUpdateSchemes (Module)

RadiusUpdateSchemes

RadiusUpdateSchemes is the standard enum interface for different types of radius update schemes implemented in the Trust Region method. These schemes specify how the radius of the so-called trust region is updated after each iteration of the algorithm. The specific role and caveats associated with each scheme are provided below.

Using RadiusUpdateSchemes

RadiusUpdateSchemes uses the standard EnumX interface (https://github.com/fredrikekre/EnumX.jl), and hence inherits all the properties of an EnumX, including the type of each constituent enum state being RadiusUpdateSchemes.T. Simply pass the desired scheme as follows: TrustRegion(radius_update_scheme = your desired update scheme). For example, sol = solve(prob, alg = TrustRegion(radius_update_scheme = RadiusUpdateSchemes.Hei)).
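
Spelled out as a runnable sketch (the problem is illustrative, and the positional solve(prob, alg) form is used):

```julia
using NonlinearSolve

f(u, p) = u .* u .- p
prob = NonlinearProblem(f, [1.0], 2.0)
sol = solve(prob, TrustRegion(radius_update_scheme = RadiusUpdateSchemes.Hei))
```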

NonlinearSolve.RadiusUpdateSchemes.Bastin (Constant)

RadiusUpdateSchemes.Bastin

This scheme is proposed by Bastin et al. The scheme is called a retrospective update scheme, as it uses the model function at the current iteration to compute the ratio of the actual reduction to the predicted reduction in the previous trial step, and uses this ratio to update the trust region radius. The hypothesis is to exploit the information made available during the optimization process in order to vary the accuracy of the objective function computation.

NonlinearSolve.RadiusUpdateSchemes.Fan (Constant)

RadiusUpdateSchemes.Fan

This scheme is proposed by Fan, J. It is very similar to Hei's and Yuan's schemes, as it lets the trust region radius depend on the current size (norm) of the objective (merit) function itself. These new update schemes are known to improve local convergence.

NonlinearSolve.RadiusUpdateSchemes.Hei (Constant)

RadiusUpdateSchemes.Hei

This scheme is proposed by Hei, L. The trust region radius depends on the size (norm) of the current step. The hypothesis is to let the radius converge to zero as the iterations progress, which is more reliable and robust for ill-conditioned as well as degenerate problems.

NonlinearSolve.RadiusUpdateSchemes.Simple (Constant)

RadiusUpdateSchemes.Simple

The simple or conventional radius update scheme. This scheme is chosen by default and follows the conventional approach: if the trial step is accepted, the radius is increased by a fixed factor (bounded by a maximum radius), and if the trial step is rejected, the radius is shrunk by a fixed factor.

NonlinearSolve.RadiusUpdateSchemes.Yuan (Constant)

RadiusUpdateSchemes.Yuan

This scheme is proposed by Yuan, Y. Similar to Hei's scheme, the trust region is updated so that it converges to zero; however, here the radius depends on the size (norm) of the current gradient of the objective (merit) function. The hypothesis is that the step size is bounded by the gradient size, so it makes sense to let the radius depend on the gradient.