Reference/API

The Roots package provides several different algorithms to solve f(x)=0.

The find_zero and find_zeros functions

There are two main functions: find_zero to identify a zero of $f$ given some initial starting value or bracketing interval and find_zeros to heuristically identify all zeros in a specified interval.

Roots.find_zero - Function
find_zero(f, x0, M, [N::AbstractBracketingMethod]; kwargs...)

Interface to one of several methods for finding zeros of a univariate function, e.g. solving $f(x)=0$.

Initial starting value

For most methods, x0 is a scalar value indicating the initial value in the iterative procedure. (Secant methods can take a tuple specifying their two initial values.) Values must be a subtype of Number and have methods for float, real, and oneunit defined.

For bracketing intervals, x0 is specified using a tuple, a vector, or any iterable with extrema defined. A bracketing interval, $[a,b]$, is one where $f(a)$ and $f(b)$ have different signs.

Return value

If the algorithm succeeds, the approximate root identified is returned. A ConvergenceFailed error is thrown if the algorithm fails. The alternate form solve(ZeroProblem(f,x0), M) returns NaN in case of failure.

Specifying a method

A method is specified to indicate which algorithm to employ.

For more detail, see the help page for each method (e.g., ?Order1). Non-exported methods must be qualified with module name, as in ?Roots.Schroder.

If no method is specified, the default method depends on x0:

  • If x0 is a scalar, the default is the more robust Order0 method.

  • If x0 is a tuple, vector, or iterable with extrema defined indicating a bracketing interval, then the Bisection method is used.

The default methods are chosen to be robust; they may not be as efficient as some others.

Specifying the function

The function(s) are passed as the first argument.

For the few methods that use one or more derivatives (Newton, Halley, Schroder, LithBoonkkampIJzerman(S,D), etc.) a tuple of functions is used. For the classical algorithms, a function returning (f(x), f(x)/f'(x), [f'(x)/f''(x)]) may be used.
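For example, here is a hedged sketch of both calling styles for Newton's method (the particular function and starting point are only illustrative, not from the docstring):

f(x) = x^2 - 2
find_zero((f, x -> 2x), 1.0, Roots.Newton())            # tuple of f and f'
find_zero(x -> (f(x), f(x)/(2x)), 1.0, Roots.Newton())   # single function returning (f, f/f'); both ≈ sqrt(2)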

Optional arguments (tolerances, limit evaluations, tracing)

  • xatol - absolute tolerance for x values.
  • xrtol - relative tolerance for x values.
  • atol - absolute tolerance for f(x) values.
  • rtol - relative tolerance for f(x) values.
  • maxiters - limit on maximum number of iterations.
  • strict - if false (the default), when the algorithm stops, possible zeros are checked with a relaxed tolerance.
  • verbose - if true a trace of the algorithm will be shown on successful completion. See the internal Tracks object to save this trace.

See the help string for Roots.assess_convergence for details on convergence. See the help page for Roots.default_tolerances(method) for details on the default tolerances.

In general, with floating point numbers, convergence is not an absolute statement. Even if mathematically α is an answer and xstar the floating point realization, it may be that f(xstar) - f(α) ≈ xstar ⋅ f'(α) ⋅ eps(α), so the role of tolerances must be appreciated, and at times specified.

For the Bisection methods, convergence is guaranteed over Float64 values, so the tolerances are set to be $0$ by default.

If a bracketing method is passed in after the method specification, then whenever a bracket is identified during the algorithm, the method will switch to the bracketing method to identify the zero. (Bracketing methods are mathematically guaranteed to converge, non-bracketing methods may or may not converge.) This is what Order0 does by default, with an initial secant method switching to the AlefeldPotraShi method should a bracket be encountered.
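For example, the following (an illustrative sketch, not from the docstring) runs secant steps but finishes with a bracketing method once a bracket is found:

find_zero(sin, 3, Order1(), Roots.AlefeldPotraShi())   # secant steps; switches to AlefeldPotraShi if a bracket appears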

Note: The order of the method is hinted at in the naming scheme. A scheme is order r if, with eᵢ = xᵢ - α, eᵢ₊₁ = C⋅eᵢʳ. If the error eᵢ is small enough, then essentially the error will gain r times as many leading zeros each step. However, if the error is not small, this will not be the case. Without good initial guesses, a high order method may still converge slowly, if at all. The OrderN methods have some heuristics employed to ensure a wider range for convergence at the cost of not faithfully implementing the method, though those are available through unexported methods.

Examples:

Default methods.

julia> using Roots

julia> find_zero(sin, 3)  # use Order0()
3.141592653589793

julia> find_zero(sin, (3,4)) # use Bisection()
3.141592653589793

Specifying a method,

julia> find_zero(sin, (3,4), Order1())            # can specify two starting points for secant method
3.141592653589793

julia> find_zero(sin, 3.0, Order2())              # Use Steffensen method
3.1415926535897936

julia> find_zero(sin, big(3.0), Order16())        # rapid convergence
3.141592653589793238462643383279502884197169399375105820974944592307816406286198

julia> find_zero(sin, (3, 4), A42())              # fewer function calls than Bisection(), in this case
3.141592653589793

julia> find_zero(sin, (3, 4), FalsePosition(8))   # 1 of 12 possible algorithms for false position
3.141592653589793

julia> find_zero((sin,cos), 3.0, Roots.Newton())  # use Newton's method
3.141592653589793

julia> find_zero((sin, cos, x->-sin(x)), 3.0, Roots.Halley())  # use Halley's method
3.141592653589793

Changing tolerances.

julia> fn = x -> (2x*cos(x) + x^2 - 3)^10/(x^2 + 1);

julia> x0, xstar = 3.0,  2.9947567209477;

julia> fn(find_zero(fn, x0, Order2())) <= 1e-14  # f(xₙ) ≈ 0, but Δxₙ can be largish
true

julia> find_zero(fn, x0, Order2(), atol=0.0, rtol=0.0) # error: x_n ≉ x_{n-1}; just f(x_n) ≈ 0
ERROR: Roots.ConvergenceFailed("Algorithm failed to converge")
[...]

julia> fn = x -> (sin(x)*cos(x) - x^3 + 1)^9;

julia> x0, xstar = 1.0,  1.112243913023029;

julia> find_zero(fn, x0, Order2()) ≈ xstar
true

julia> find_zero(fn, x0, Order2(), maxiters=3)    # need more steps to converge
ERROR: Roots.ConvergenceFailed("Algorithm failed to converge")
[...]

Tracing

Passing verbose=true will show details on the steps of the algorithm. The tracks argument allows the passing of a Tracks object to record the values of x and f(x) used in the algorithm.

Note

See solve! and ZeroProblem for an alternate interface.

Roots.find_zeros - Function
find_zeros(f, a, b=nothing; [no_pts=12, k=8, naive=false, xatol, xrtol, atol, rtol])

Search for zeros of f in the interval [a,b]. This interval can be specified with two values or using a single value, such as a tuple or vector, for which extrema returns two distinct values in increasing order.

Examples

julia> using Roots

julia> find_zeros(x -> exp(x) - x^4, -5, 20)        # a few well-spaced zeros
3-element Vector{Float64}:
 -0.8155534188089606
  1.4296118247255556
  8.613169456441398

julia> find_zeros(x -> sin(x^2) + cos(x)^2, 0, 2pi)  # many zeros
12-element Vector{Float64}:
 1.78518032659534
 2.391345462376604
 3.2852368649448853
 3.3625557095737544
 4.016412952618305
 4.325091924521049
 4.68952781386834
 5.00494459113514
 5.35145266881871
 5.552319796014526
 5.974560835055425
 6.039177477770888

julia> find_zeros(x -> cos(x) + cos(2x), (0, 4pi))    # mix of simple, non-simple zeros
6-element Vector{Float64}:
  1.0471975511965976
  3.141592653589943
  5.235987755982988
  7.330382858376184
  9.424777960769228
 11.519173063162574

julia> f(x) = (x-0.5) * (x-0.5001) * (x-1)          # nearby zeros
f (generic function with 1 method)

julia> find_zeros(f, 0, 2)
3-element Vector{Float64}:
 0.5
 0.5001
 1.0

julia> f(x) = (x-0.5) * (x-0.5001) * (x-4) * (x-4.001) * (x-4.2)
f (generic function with 1 method)

julia> find_zeros(f, 0, 10)
3-element Vector{Float64}:
 0.5
 0.5001
 4.2

julia> f(x) = (x-0.5)^2 * (x-0.5001)^3 * (x-4) * (x-4.001) * (x-4.2)^2  # hard to identify
f (generic function with 1 method)

julia> find_zeros(f, 0, 10, no_pts=21)                # too hard for default
5-element Vector{Float64}:
 0.49999999999999994
 0.5001
 4.0
 4.001
 4.200000000000001

Note

There are two cases where the number of zeros may be underreported:

  • if the initial interval, (a,b), is too wide
  • if there are zeros that are very nearby

The basic algorithm checks for zeros among the endpoints, and then divides the interval (a,b) into no_pts-1 subintervals and then proceeds to look for zeros through bisection or a derivative-free method. As checking for a bracketing interval is relatively cheap and bisection is guaranteed to converge, each interval has k pairs of intermediate points checked for a bracket.

If any zeros are found, the algorithm uses these to partition (a,b) into subintervals. Each subinterval is shrunk so that the endpoints are not zeros and the process is repeated on the subinterval. If the initial interval is too large, then the naive scanning for zeros may be fruitless and no zeros will be reported. If there are nearby zeros, the shrinking of the interval may jump over them; as seen in the examples, nearby roots can often be identified correctly, but for very nearby points, or very flat functions, it may help to increase no_pts.

The tolerances are used to shrink the intervals, but not to find zeros within a search. For searches, bisection is guaranteed to converge with no specified tolerance. For the derivative free search, a modification of the Order0 method is used, which at worst case compares |f(x)| <= 8*eps(x) to identify a zero. The algorithm might identify more than one value for a zero, due to floating point approximations. If a potential pair of zeros satisfy isapprox(a,b,atol=sqrt(xatol), rtol=sqrt(xrtol)) then they are consolidated.

The algorithm can make many function calls. When zeros are found in an interval, the naive search is carried out on each subinterval. To cut down on function calls, though with some increased chance of missing some zeros, the adaptive nature can be skipped with the argument naive=true or the number of points stepped down.
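A hedged sketch of that trade-off (the function here is only illustrative):

f(x) = cos(x) - x/10
find_zeros(f, -10, 10)                # adaptive search
find_zeros(f, -10, 10; naive=true)    # fewer function calls, but may miss some zeros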

The algorithm is derived from one in a PR by @djsegal.

Note

The IntervalRootFinding package provides a rigorous alternative to this heuristic one. That package uses interval arithmetic, so it can compute bounds on the size of the image of an interval under f. If this image includes 0, then it can look for the zero. Bisection, on the other hand, will only look for a zero if the two endpoints have different signs, a much more rigid condition for a potential zero.

For example, this function (due to @truculentmath) is particularly tricky, as it is positive at every floating point number, but has two zeros (the vertical asymptote at 15//11 is only negative within adjacent floating point values):

julia> using IntervalArithmetic, IntervalRootFinding, Roots

julia> g(x) = x^2 + 1 + log(abs(11*x - 15))/99
g (generic function with 1 method)

julia> find_zeros(g, -3, 3)
Float64[]

julia> IntervalRootFinding.roots(g, -3..3, IntervalRootFinding.Bisection)
1-element Vector{Root{Interval{Float64}}}:
 Root([1.36363, 1.36364], :unknown)

A less extreme usage might be the following, where unique indicates Bisection could be useful and indeed find_zeros will identify these values:

julia> g(x) = exp(x) - x^5
g (generic function with 1 method)

julia> rts = IntervalRootFinding.roots(g, -20..20)
2-element Vector{Root{Interval{Float64}}}:
 Root([12.7132, 12.7133], :unique)
 Root([1.29585, 1.29586], :unique)

julia> find_zeros(g, -20, 20)
2-element Vector{Float64}:
  1.2958555090953687
 12.713206788867632

CommonSolve interface

The problem-algorithm-solve interface is a pattern popularized in Julia by the DifferentialEquations.jl suite of packages. This can be used as an alternative to find_zero. Unlike find_zero, solve will return NaN on non-convergence.

CommonSolve.solve! - Function
solve(fx::ZeroProblem, [M], [N]; p=nothing, kwargs...)
init(fx::ZeroProblem, [M], [N];
     p=nothing,
     verbose=false, tracks=NullTracks(), kwargs...)
solve!(P::ZeroProblemIterator)

Solve for the zero of a scalar-valued univariate function specified through ZeroProblem or ZeroProblemIterator using the CommonSolve interface.

The defaults for M and N depend on the ZeroProblem: if x0 is a number, then M=Secant() and N=AlefeldPotraShi() are used (Order0); if x0 has 2 (or more) values, then it is assumed to be a bracketing interval and M=AlefeldPotraShi() is used.

The methods involved with this interface are:

  • ZeroProblem: used to specify a problem with a function (or functions) and an initial guess
  • solve: to solve for a zero in a ZeroProblem

The latter calls the following, which can be useful independently:

  • init: to initialize an iterator with a method for solution, any adjustments to the default tolerances, and a specification to log the steps or not.
  • solve! to iterate to convergence.

Returns NaN, not an error like find_zero, when the problem can not be solved. Tested for zero allocations.

Examples:

julia> using Roots

julia> fx = ZeroProblem(sin, 3)
ZeroProblem{typeof(sin), Int64}(sin, 3)

julia> solve(fx)
3.141592653589793

Or, if the iterator is required

julia> problem = init(fx);

julia> solve!(problem)
3.141592653589793

Keyword arguments can be used to adjust the default tolerances.

julia> solve(fx, Order5(); atol=1/100)
3.1425464815525403

The above is equivalent to:

julia> problem = init(fx, Order5(), atol=1/100);

julia> solve!(problem)
3.1425464815525403

The keyword argument p may be used if the function(s) to be solved depend on a parameter in their second positional argument (e.g., f(x, p)). For example

julia> f(x,p) = exp(-x) - p # to solve p = exp(-x)
f (generic function with 1 method)

julia> fx = ZeroProblem(f, 1)
ZeroProblem{typeof(f), Int64}(f, 1)

julia> solve(fx; p=1/2)  # log(2)
0.6931471805599453

This approach is recommended, as it avoids the recompilation that changing the function itself would trigger.

The argument verbose=true for init instructs that the steps be logged.

The iterator interface allows for the creation of hybrid solutions, such as is used when two methods are passed to solve. For example, this is essentially how the hybrid default is constructed:

julia> function order0(f, x)
           fx = ZeroProblem(f, x)
           p = init(fx, Roots.Secant())
           xᵢ,st = ϕ = iterate(p)
           while ϕ !== nothing
               xᵢ, st = ϕ
               state, ctr = st
               fᵢ₋₁, fᵢ = state.fxn0, state.fxn1
               if sign(fᵢ₋₁)*sign(fᵢ) < 0 # check for bracket
                   x0 = (state.xn0, state.xn1)
                   fx′ = ZeroProblem(f, x0)
                   p = init(fx′, Bisection())
                   xᵢ = solve!(p)
                   break
               end
               ϕ = iterate(p, st)
           end
           xᵢ
       end
order0 (generic function with 1 method)

julia> order0(sin, 3)
3.141592653589793

Roots.ZeroProblem - Type
ZeroProblem{F,X}

A container for a function and initial guess to be used with solve.

Classical methods based on derivatives

We begin by describing the classical methods even though they are not necessarily recommended (they require more work of the user), as they give insight into why a variety of methods is available.

The classical methods of Newton and Halley utilize information about the function and its derivative(s) in an iterative manner to converge to a zero of $f(x)$ given an initial starting value.

Newton's method is easily described:

From an initial point, the next point in the iterative algorithm is found by identifying the intersection of the $x$ axis with the tangent line of $f$ at the initial point. This is repeated until convergence or the realization that convergence won't happen for the initial point. Mathematically,

$x_{n+1} = x_{n} - f(x_n)/f'(x_n).$
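The update is simple enough to code directly; the following minimal sketch (the helper name and the example function are illustrative, not part of the package) shows the iteration:

function newton_by_hand(f, fp, x; steps=5)
    for _ in 1:steps
        x = x - f(x) / fp(x)    # xₙ₊₁ = xₙ - f(xₙ)/f'(xₙ)
    end
    x
end

newton_by_hand(x -> x^2 - 2, x -> 2x, 1.0)   # ≈ sqrt(2) after a handful of steps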

Some facts are helpful to understand the different methods available in Roots:

  • For Newton's method there is a formula for the error: Set $\epsilon_n = \alpha - x_n$, where $\alpha$ is the zero; then $\epsilon_{n+1} = -f''(\xi_n)/(2f'(\xi_n)) \cdot \epsilon_n^2$, where $\xi_n$ is some value between $\alpha$ and $x_n$.

  • The error term, when of the form $|\epsilon_{n+1}| \leq C\cdot|\epsilon_n|^2$, can be used to identify an interval around $\alpha$ for which convergence is guaranteed. Such convergence is termed quadratic (order 2). For floating point solutions, quadratic convergence and a well chosen initial point can lead to convergence in 4 or 5 iterations. In general, convergence is termed order $q$ when $|\epsilon_{n+1}| \approx C\cdot|\epsilon_n|^q$

  • The term $-f''(\xi_n)/(2f'(\xi_n))$ indicates possible issues when $f''$ is too big near $\alpha$ or $f'$ is too small near $\alpha$. In particular if $f'(\alpha) = 0$, there need not be quadratic convergence, and convergence can take many iterations. A zero for which $f(x) = (x-\alpha)^{1+\beta}\cdot g(x)$, with $g(\alpha) \neq 0$ is called simple when $\beta=0$ and non-simple when $\beta > 0$. Newton's method is quadratic near simple zeros and need not be quadratic near non-simple zeros.

As well, if $f''$ is too big near $\alpha$, or $f'$ too small near $\alpha$, or $x_n$ too far from $\alpha$ (that is, $|\epsilon_n|>1$) the error might actually increase and convergence is not guaranteed.

  • The explicit form of the error function can be used to guarantee convergence for functions with a certain shape (monotonic, convex functions where the sign of $f''$ and $f'$ don't change). Quadratic convergence may only occur once the algorithm is near the zero.

  • The number of function evaluations per step for Newton's method is 2.


Roots.Newton - Type
Roots.Newton()

Implements Newton's method: xᵢ₊₁ = xᵢ - f(xᵢ)/f'(xᵢ). This is a quadratically convergent method requiring one derivative. Two function calls per step.

Examples

julia> using Roots

julia> find_zero((sin,cos), 3.0, Roots.Newton()) ≈ π
true

If function evaluations are expensive one can pass in a function which returns (f, f/f') as follows

julia> find_zero(x -> (sin(x), sin(x)/cos(x)), 3.0, Roots.Newton()) ≈ π
true

This can be advantageous if the derivative is easily computed from the value of f, but otherwise would be expensive to compute.


The error, eᵢ = xᵢ - α, can be expressed as eᵢ₊₁ = f[xᵢ,xᵢ,α]/(2f[xᵢ,xᵢ])eᵢ² (Sidi, Unified treatment of regula falsi, Newton-Raphson, secant, and Steffensen methods for nonlinear equations).

Roots.Halley - Type
Roots.Halley()

Implements Halley's method: xᵢ₊₁ = xᵢ - (f/f')(xᵢ) * (1 - (f/f')(xᵢ) * (f''/f')(xᵢ) * 1/2)^(-1). This method is cubically convergent; it requires $3$ function calls per step.

The function, its derivative and second derivative can be passed either as a tuple of $3$ functions or as a function returning values for $(f, f/f', f'/f'')$, which could be useful when function evaluations are expensive.

Examples

julia> using Roots

julia> find_zero((sin, cos, x->-sin(x)), 3.0, Roots.Halley()) ≈ π
true

julia> function f(x)
       s,c = sincos(x)
       (s, s/c, -c/s)
       end;

julia> find_zero(f, 3.0, Roots.Halley()) ≈ π
true

This can be advantageous if the derivatives are easily computed from the computation for f, but otherwise would be expensive to compute separately.


The error, eᵢ = xᵢ - α, satisfies eᵢ₊₁ ≈ -(2f'⋅f''' - 3⋅(f'')²)/(12⋅(f')²) ⋅ eᵢ³ (all evaluated at α).

Roots.QuadraticInverse - Type
Roots.QuadraticInverse()

Implements the quadratic inverse method, also known as Chebyshev's method, xᵢ₊₁ = xᵢ - (f/f')(xᵢ) * (1 + (f/f')(xᵢ) * (f''/f')(xᵢ) * 1/2). This method is cubically convergent; it requires $3$ function calls per step.

Example

julia> using Roots

julia> find_zero((sin, cos, x->-sin(x)), 3.0, Roots.QuadraticInverse()) ≈ π
true

If function evaluations are expensive one can pass in a function which returns (f, f/f',f'/f'') as follows

julia> find_zero(x -> (sin(x), sin(x)/cos(x), -cos(x)/sin(x)), 3.0, Roots.QuadraticInverse()) ≈ π
true

This can be advantageous if the derivatives are easily computed from the computation for f, but otherwise would be expensive to compute separately.

The error, eᵢ = xᵢ - α, satisfies eᵢ₊₁ ≈ (1/2⋅(f''/f')² - 1/6⋅(f'''/f')) ⋅ eᵢ³ (all evaluated at α).

Roots.ChebyshevLike - Type

CHEBYSHEV-LIKE METHODS AND QUADRATIC EQUATIONS (J. A. EZQUERRO, J. M. GUTIÉRREZ, M. A. HERNÁNDEZ and M. A. SALANOVA)

Roots.SuperHalley - Type

An acceleration of Newton's method: Super-Halley method (J.M. Gutierrez, M.A. Hernandez).

Newton's and Halley's methods are members of this family of methods:

Roots.LithBoonkkampIJzerman - Type
LithBoonkkampIJzerman{S,D} <: AbstractNewtonLikeMethod
LithBoonkkampIJzerman(S,D)

A family of different methods that includes the secant method and Newton's method.

Specifies a linear multistep solver with S steps and D derivatives following Lith, Boonkkamp, and IJzerman.

Examples

julia> using Roots

julia> find_zero(sin, 3, Roots.LithBoonkkampIJzerman(2,0)) ≈ π # the secant method
true

julia> find_zero((sin,cos), 3, Roots.LithBoonkkampIJzerman(1,1)) ≈ π # Newton's method
true

julia> find_zero((sin,cos), 3, Roots.LithBoonkkampIJzerman(3,1)) ≈ π # Faster convergence rate
true

julia> find_zero((sin,cos, x->-sin(x)), 3, Roots.LithBoonkkampIJzerman(1,2)) ≈ π # Halley-like method
true

The method can be more robust to the initial condition. This example is from the paper (p13). Newton's method (the S=1, D=1 case) fails if |x₀| ≥ 1.089 but methods with more memory succeed.

julia> fx =  ZeroProblem((tanh,x->sech(x)^2), 1.239); # zero at 0.0

julia> solve(fx, Roots.LithBoonkkampIJzerman(1,1)) |> isnan  # Newton, NaN
true

julia> solve(fx, Roots.LithBoonkkampIJzerman(2,1)) |> abs |> <(eps())
true

julia> solve(fx, Roots.LithBoonkkampIJzerman(3,1)) |> abs |> <(eps())
true

Multiple derivatives can be constructed automatically using automatic differentiation. For example,

julia> using ForwardDiff

julia> function δ(f, n::Int=1)
           n <= 0 && return f
           n == 1 && return x -> ForwardDiff.derivative(f,float(x))
           δ(δ(f,1),n-1)
       end;

julia> fs(f,n) = ntuple(i -> δ(f,i-1), Val(n+1));

julia> f(x) = cbrt(x) * exp(-x^2); # cf. Table 6 in paper, α = 0

julia> fx = ZeroProblem(fs(f,1), 0.1147);

julia> opts = (xatol=2eps(), xrtol=0.0, atol=0.0, rtol=0.0); # converge if |xₙ - xₙ₋₁| <= 2ϵ

julia> solve(fx, Roots.LithBoonkkampIJzerman(1, 1); opts...) |> isnan # NaN -- no convergence
true

julia> solve(fx, Roots.LithBoonkkampIJzerman(2, 1); opts...) |> abs |> <(eps()) # converges
true

julia> fx = ZeroProblem(fs(f,2), 0.06);                       # need better starting point

julia> solve(fx, Roots.LithBoonkkampIJzerman(2, 2); opts...) |> abs |> <(eps()) # converges
true

For the case D=1, a bracketing method based on this approach is implemented in LithBoonkkampIJzermanBracket.

Reference

In Lith, Boonkkamp, and IJzerman an analysis is given of the convergence rates when using linear multistep methods to solve 0=f(x) using f⁻¹(0) = x when f is a sufficiently smooth linear function. The reformulation, attributed to Grau-Sanchez, finds a differential equation for f⁻¹: dx/dy = [f⁻¹]′(y) = 1/f′(x) = F, so that x(0) = x₀ + ∫_{y₀}^{0} F(x(y)) dy, where y₀ = f(x₀).

A linear multi-step method is used to solve this equation numerically. Let S be the number of memory steps (S = 1, 2, ...) and D be the number of derivatives employed. Then, with F(x) = dx/dy, the update is x_{n+S} = ∑_{k=0}^{S-1} aₖ x_{n+k} + ∑_{d=1}^{D} ∑_{k=1}^{S-1} aᵈ_{n+k} F⁽ᵈ⁾(x_{n+k}). The aₖs and aᵈₖs are computed each step.

This table is from Tables 1 and 3 of the paper and gives the convergence rate for simple roots identified therein:

s: number of steps remembered
d: number of derivatives used
s/d  0    1    2    3    4
1    .    2    3    4    5
2    1.62 2.73 3.79 4.82 5.85
3    1.84 2.91 3.95 4.97 5.98
4    1.92 2.97 3.99 4.99 5.996
5    1.97 .    .    .    .

That is, more memory leads to a higher convergence rate; more derivatives leads to a higher convergence rate. However, the interval about α, the zero, where the convergence rate is guaranteed may get smaller.

Note

For the larger values of S, the expressions to compute the next value get quite involved. The higher convergence rate is likely only to be of help for finding solutions to high precision.

Derivative free methods

The secant method replaces the derivative term in Newton's method with the slope of the secant line.

$x_{n+1} = x_n - (\frac{f(x_n)-f(x_{n-1})}{x_n - x_{n-1}})^{-1}\cdot f(x_n).$

Though the secant method has convergence rate of order $\approx 1.618$ and is not quadratic, it only requires one new function call per step so can be very effective. Often function evaluations are the slowest part of the computation and, as well, no derivative is needed. Because it can be very efficient, the secant method is used in the default method of find_zero when called with a single initial starting point.
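A minimal sketch of the update (the helper name is illustrative; the package's Secant/Order1 method adds the tolerance and convergence checks described above):

function secant_by_hand(f, x0, x1; steps=8)
    for _ in 1:steps
        f0, f1 = f(x0), f(x1)
        (iszero(f1) || f1 == f0) && break              # converged or cannot improve
        x0, x1 = x1, x1 - f1 * (x1 - x0) / (f1 - f0)   # secant update
    end
    x1
end

secant_by_hand(sin, 3.0, 3.1)   # ≈ π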

Steffensen's method is a quadratically converging, derivative-free method which uses a secant line based on $x_n$ and $x_n + f(x_n)$. Though of higher order, it requires additional function calls per step and depends on a good initial starting value. Other derivative free methods are available, trading off increased function calls for higher-order convergence. They may be of interest when arbitrary precision is needed. A measure of efficiency is $q^{1/r}$ where $q$ is the order of convergence and $r$ the number of function calls per step. With this measure, the secant method would be $\approx (1.618)^{1/1}$ and Steffensen's would be less ($2^{1/2}$).


Roots.Secant - Type
Secant()
Order1()
Orderφ()

The Order1() method is an alias for Secant. It specifies the secant method. This method keeps two values in its state, xₙ and xₙ₋₁. The updated point is the intersection point of the $x$ axis with the secant line formed from the two points. The secant method uses $1$ function evaluation per step and has order φ = (1+sqrt(5))/2 ≈ 1.618.

The error, eᵢ = xᵢ - α, satisfies eᵢ₊₂ = f[xᵢ₊₁,xᵢ,α] / f[xᵢ₊₁,xᵢ] * (xᵢ₊₁-α) * (xᵢ - α).

Roots.Steffensen - Type
Steffensen()
Order2()

The quadratically converging Steffensen method is used for the derivative-free Order2() algorithm. Unlike the quadratically converging Newton's method, no derivative is necessary, though like Newton's method, two function calls per step are. Steffensen's algorithm is more sensitive than Newton's method to poor initial guesses when f(x) is large, due to how f'(x) is approximated. The Order2 method replaces a Steffensen step with a secant step when f(x) is large.

The error, eᵢ = xᵢ - α, satisfies eᵢ₊₁ = f[xᵢ, xᵢ+fᵢ, α] / f[xᵢ, xᵢ+fᵢ] ⋅ (1 - f[xᵢ, α]) ⋅ eᵢ².

Roots.Order5 - Type
Order5()
KumarSinghAkanksha()

Implements an order 5 algorithm from A New Fifth Order Derivative Free Newton-Type Method for Solving Nonlinear Equations by Manoj Kumar, Akhilesh Kumar Singh, and Akanksha, Appl. Math. Inf. Sci. 9, No. 3, 1507-1513 (2015), DOI: 10.12785/amis/090346. Four function calls per step are needed. The Order5 method replaces a Steffensen step with a secant step when f(x) is large.

The error, eᵢ = xᵢ - α, satisfies eᵢ₊₁ = K₁ ⋅ K₅ ⋅ M ⋅ eᵢ⁵ + O(eᵢ⁶)

Roots.Order8 - Type
Order8()
Thukral8()

Implements an eighth-order algorithm from New Eighth-Order Derivative-Free Methods for Solving Nonlinear Equations by Rajinder Thukral, International Journal of Mathematics and Mathematical Sciences Volume 2012 (2012), Article ID 493456, 12 pages DOI: 10.1155/2012/493456. Four function calls per step are required. The Order8 method replaces a Steffensen step with a secant step when f(x) is large.

The error, eᵢ = xᵢ - α, is expressed as eᵢ₊₁ = K ⋅ eᵢ⁸ in (2.25) of the paper for an explicit K.

Roots.Order16 - Type
Order16()
Thukral16()

Implements the order 16 algorithm from New Sixteenth-Order Derivative-Free Methods for Solving Nonlinear Equations by R. Thukral, American Journal of Computational and Applied Mathematics p-ISSN: 2165-8935; e-ISSN: 2165-8943; 2012; 2(3): 112-118 DOI: 10.5923/j.ajcam.20120203.08.

Five function calls per step are required. Though rapidly converging, this method generally isn't faster (fewer function calls/steps) over other methods when using Float64 values, but may be useful for solving over BigFloat. The Order16 method replaces a Steffensen step with a secant step when f(x) is large.

The error, eᵢ = xᵢ - α, is expressed as eᵢ₊₁ = K⋅eᵢ¹⁶ for an explicit K in equation (50) of the paper.

Bracketing methods

The bisection method identifies a zero of a continuous function between $a$ and $b$ when $f(a)$ and $f(b)$ have different signs. (The interval $[a,b]$ is called a bracketing interval when $f(a)\cdot f(b) <0$.) The basic algorithm is particularly simple: an interval $[a_i,b_i]$ is split at $c = (a_i+b_i)/2$. Either $f(c)=0$, or one of $[a_i,c]$ or $[c,b_i]$ is a bracketing interval, which is called $[a_{i+1},b_{i+1}]$. From this description, we see that $[a_i,b_i]$ has length $2^{-i}$ times the length of $[a_0,b_0]$, so the intervals will eventually terminate by finding a zero, $c$, or converge to a zero. This convergence is slow (the efficiency is only $1$), but guaranteed. For 16-, 32-, and 64-bit floating point values, a reinterpretation of how the midpoint ($c$) is found leads to convergence in no more than $64$ iterations, unlike the midpoint found above, where some cases can take many more steps to converge.

In floating point, by guaranteed convergence we have either an exact zero or a bracketing interval consisting of two adjacent floating point values. When applied to non-continuous functions, this algorithm will identify an exact zero or a zero crossing of the function. (E.g., applied to $f(x)=1/x$ it will find $0$.)

The default selection of midpoint described above includes no information about the function $f$ beyond its sign. Algorithms exploiting the shape of the function can be significantly more efficient. For example, the bracketing method Roots.AlefeldPotraShi due to Alefeld, Potra, and Shi has efficiency $\approx 1.6686$. This method is also used in the default method for find_zero when a single initial starting point is given if a bracketing interval is identified.


Roots.Bisection - Type
Bisection()

If possible, will use the bisection method over Float64 values. The bisection method starts with a bracketing interval [a,b] and splits it into two intervals [a,c] and [c,b]. If c is not a zero, then one of these two will be a bracketing interval and the process continues. The computation of c is done by _middle, which reinterprets floating point values as unsigned integers and splits there. It was contributed by Jason Merrill. This method avoids floating point issues and when the tolerances are set to zero (the default) guarantees a "best" solution (one where a zero is found or the bracketing interval is of the type [a, nextfloat(a)]).

When tolerances are given, this algorithm terminates when the interval length is less than or equal to the tolerance max(δₐ, 2abs(u)δᵣ) with u in {a,b} chosen by the smaller of |f(a)| and |f(b)|, or the function value is less than max(tol, min(abs(a), abs(b)) * rtol). The latter is used only if the default tolerances (atol or rtol) are adjusted.
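The reinterpretation idea behind _middle can be illustrated with a minimal sketch for positive Float64 endpoints (this is not the package's internal implementation, which also handles signs, zero, and non-finite values):

function bit_middle(a::Float64, b::Float64)
    # for positive floats the bit patterns are ordered, so splitting the patterns splits the floats in [a, b]
    ia, ib = reinterpret(UInt64, a), reinterpret(UInt64, b)
    reinterpret(Float64, (ia + ib) >>> 1)
end

bit_middle(1.0, 2.0)     # 1.5
bit_middle(1.0, 1.0e8)   # ≈ 1.0e4, near the geometric mean rather than the arithmetic midpoint 5.0e7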

Roots.A42 - Type
Roots.A42()

Bracketing method which finds the root of a continuous function within a provided interval [a, b], without requiring derivatives. It is based on algorithm 4.2 described in: G. E. Alefeld, F. A. Potra, and Y. Shi, "Algorithm 748: enclosing zeros of continuous functions," ACM Trans. Math. Softw. 21, 327–344 (1995), DOI: 10.1145/210089.210111. The asymptotic efficiency index, $q^{1/k}$, is $(2 + 7^{1/2})^{1/3} = 1.6686...$.

Originally by John Travers.

Roots.AlefeldPotraShi - Type
Roots.AlefeldPotraShi()

Follows algorithm in "ON ENCLOSING SIMPLE ROOTS OF NONLINEAR EQUATIONS", by Alefeld, Potra, Shi; DOI: 10.1090/S0025-5718-1993-1192965-2.

The order of convergence is 2 + √5; asymptotically there are 3 function evaluations per step. Asymptotic efficiency index is $(2+√5)^(1/3) ≈ 1.618...$. Less efficient, but can run faster than the A42 method.

Originally by John Travers.

Roots.Brent - Type
Roots.Brent()

An implementation of Brent's (or Brent-Dekker) method. This method uses a choice of inverse quadratic interpolation or a secant step, falling back on bisection if necessary.

Roots.Ridders - Type
Roots.Ridders()

Implements Ridders' method. This bracketing method finds the midpoint, x₁; then interpolates an exponential; then uses false position with the interpolated value to find c. If c and x₁ form a bracket, it is used; otherwise the subinterval [a,c] or [c,b] is used.

Example:

julia> using Roots

julia> find_zero(x -> exp(x) - x^4, (5, 15), Roots.Ridders()) ≈ 8.61316945644
true

julia> find_zero(x -> x*exp(x) - 10, (-100, 100), Roots.Ridders()) ≈ 1.74552800274
true

julia> find_zero(x -> tan(x)^tan(x) - 1e3, (0, 1.5), Roots.Ridders()) ≈ 1.3547104419
true

Ridders showed the error satisfies eₙ₊₁ ≈ 1/2 eₙeₙ₋₁eₙ₋₂ ⋅ (g^2-2fh)/f for f=F', g=F''/2, h=F'''/6, suggesting convergence at rate ≈ 1.839.... It uses two function evaluations per step, so its order of convergence is ≈ 1.225....

Roots.ITP - Type
Roots.ITP(;[κ₁=0.2, κ₂=2, n₀=1])

Use the ITP bracketing method. This method claims it "is the first root-finding algorithm that achieves the superlinear convergence of the secant method while retaining the optimal worst-case performance of the bisection method."

The values κ₁, κ₂, and n₀ are tuning parameters.

The suggested value of κ₁ is 0.2/(b-a), but the default here is 0.2. The value of κ₂ is 2, and the default value of n₀ is 1.
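A hedged usage sketch with the default tuning parameters (the bracketed function is only illustrative):

find_zero(sin, (3, 4), Roots.ITP())   # ≈ π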

Note:

Suggested on discourse by @TheLateKronos, who supplied the original version of the code below.

Roots.FalsePosition - Type
FalsePosition([galadino_factor])

Use the false position method to find a zero for the function f within the bracketing interval [a,b].

The false position method is a modified bisection method, where the midpoint between [aₖ, bₖ] is chosen to be the intersection point of the secant line with the $x$ axis, and not the average between the two values.

To speed up convergence for concave functions, this algorithm implements the $12$ reduction factors of Galdino (A family of regula falsi root-finding methods). These are specified by number, as in FalsePosition(2) or by one of three names FalsePosition(:pegasus), FalsePosition(:illinois), or FalsePosition(:anderson_bjork) (the default). The default choice has generally better performance than the others, though there are exceptions.

For some problems, the number of function calls can be greater than for the Bisection method, but generally this algorithm will make fewer function calls.

Examples

find_zero(x -> x^5 - x - 1, (-2, 2), FalsePosition())

Roots.LithBoonkkampIJzermanBracket - Type
LithBoonkkampIJzermanBracket()

A bracketing method which is a modification of Brent's method due to Lith, Boonkkamp, and IJzerman. The best possible convergence rate is 2.91.

A function, its derivative, and a bracketing interval need to be specified.

The state includes the 3 points – a bracket [a,b] (b=xₙ has f(b) closest to 0) and c=xₙ₋₁ – and the corresponding values for the function and its derivative at these three points.

The next proposed step is either an S=2 or S=3 selection for the LithBoonkkampIJzerman methods, with derivative information included only if it would be of help. The proposed step is modified if it is dithering. The proposed step is compared against a bisection step; the one in the bracket and with the smaller function value is chosen as the next step.
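A hedged usage sketch (the function and bracket here are only illustrative):

find_zero((sin, cos), (3, 4), Roots.LithBoonkkampIJzermanBracket())   # ≈ π; function, derivative, and bracket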

Non-simple zeros

The order of convergence for most methods is for simple zeros, values $\alpha$ where $f(x) = (x-\alpha) \cdot g(x)$, with $g(\alpha)$ being non-zero. For methods which are of order $k$ for non-simple zeros, usually an additional function call is needed per step. For example, this is the case for Roots.Newton as compared to Roots.Schroder.

Derivative-free methods for non-simple zeros have the following implemented:

Roots.King - Type
Roots.Order1B()
Roots.King()

A superlinear (order 1.6...) modification of the secant method for multiple roots. Presented in A SECANT METHOD FOR MULTIPLE ROOTS, by RICHARD F. KING, BIT 17 (1977), 321-328

The basic idea is similar to Schroder's method: apply the secant method to f/f'. However, this uses f' ~ fp = (fx - f(x-fx))/fx (a Steffensen step). In this implementation, Order1B, when fx is too big, a single secant step of f is used.

The asymptotic error, eᵢ = xᵢ - α, is given by eᵢ₊₂ = 1/2⋅G''/G'⋅ eᵢ⋅eᵢ₊₁ + (1/6⋅G'''/G' - (1/2⋅G''/G'))^2⋅eᵢ⋅eᵢ₊₁⋅(eᵢ+eᵢ₊₁).

Roots.Esser - Type
Roots.Order2B()
Roots.Esser()

Esser's method. This is a quadratically convergent method that, like Schroder's method, does not depend on the multiplicity of the zero. Schroder's method has update step x - r2/(r2-r1) * r1, where ri = fⁱ⁻¹/fⁱ. Esser approximates f' ~ f[x-h, x+h], f'' ~ f[x-h,x,x+h], where h = fx, as with Steffensen's method, requiring 3 function calls per step. The implementation Order2B uses a secant step when |fx| is considered too large.

Esser, H. Computing (1975) 14: 367. DOI: 10.1007/BF02253547 Eine stets quadratisch konvergente Modifikation des Steffensen-Verfahrens

Examples

f(x) = cos(x) - x
g(x) = f(x)^2
x0 = pi/4
find_zero(f, x0, Order2(), verbose=true)        #  3 steps / 7 function calls
find_zero(f, x0, Roots.Order2B(), verbose=true) #  4 / 9
find_zero(g, x0, Order2(), verbose=true)        #  22 / 45
find_zero(g, x0, Roots.Order2B(), verbose=true) #  4 / 10

For non-simple zeros, Schroder showed an additional derivative can be used to yield quadratic convergence based on Newton's method:

Roots.Schroder - Type
Roots.Schroder()

Schröder's method, like Halley's method, utilizes f, f', and f''. Unlike Halley it is quadratically converging, but this is independent of the multiplicity of the zero (cf. Schröder, E. "Über unendlich viele Algorithmen zur Auflösung der Gleichungen." Math. Ann. 2, 317-365, 1870; mathworld).

Schröder's method applies Newton's method to f/f', a function with all simple zeros.
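Written out, a single step can be sketched as follows (an illustrative helper, not the package's internal implementation):

# Newton's method applied to f/f' gives the Schröder update x - f⋅f' / ((f')² - f⋅f'')
schroder_step(f, fp, fpp, x) = x - f(x) * fp(x) / (fp(x)^2 - f(x) * fpp(x))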

Example

m = 2
f(x) = (cos(x)-x)^m
fp(x) = (-x + cos(x))*(-2*sin(x) - 2)
fpp(x) = 2*((x - cos(x))*cos(x) + (sin(x) + 1)^2)
find_zero((f, fp, fpp), pi/4, Roots.Halley())     # 14 steps
find_zero((f, fp, fpp), 1.0, Roots.Schroder())    # 3 steps

(Whereas, when m=1, Halley is 2 steps to Schröder's 3.)

If function evaluations are expensive one can pass in a function which returns (f, f/f',f'/f'') as follows

find_zero(x -> (sin(x), sin(x)/cos(x), -cos(x)/sin(x)), 3.0, Roots.Schroder())

This can be advantageous if the derivatives are easily computed from the value of f, but otherwise would be expensive to compute.

The error, eᵢ = xᵢ - α, is the same as Newton with f replaced by f/f'.

A family of methods for non-simple zeros which require $k$ derivatives to be order $k$, with $k=2$ yielding Schroder's method, is implemented in:

Roots.AbstractThukralBMethod - Type
AbstractThukralBMethod

Abstract type for ThukralXB methods for X being 2,3,4, or 5.

These are a family of methods which are

  • efficient (order X) for non-simple roots (e.g. Thukral2B is the Schroder method)
  • take X+1 function calls per step
  • require X derivatives. These can be passed as a tuple of functions, (f, f', f'', …), or as a function returning the ratios: x -> (f(x), f(x)/f'(x), f'(x)/f''(x), …).

Examples

using ForwardDiff
Base.adjoint(f::Function)  = x  -> ForwardDiff.derivative(f, float(x))
f(x) = (exp(x) + x - 2)^6
x0 = 1/4
find_zero((f, f', f''), x0, Roots.Halley())               # 14 iterations; ≈ 48 function evaluations
find_zero((f, f', f''), big(x0), Roots.Thukral2B())       #  3 iterations; ≈ 9 function evaluations
find_zero((f, f', f'', f'''), big(x0), Roots.Thukral3B()) #  2 iterations; ≈ 8 function evaluations

Reference

Introduction to a family of Thukral $k$-order method for finding multiple zeros of nonlinear equations, R. Thukral, JOURNAL OF ADVANCES IN MATHEMATICS 13(3):7230-7237, DOI: 10.24297/jam.v13i3.6146.

Hybrid methods

A useful strategy is to begin with a non-bracketing method and switch to a bracketing method should a bracket be encountered. This allows for the identification of zeros which are not surrounded by a bracket, and have guaranteed convergence should a bracket be encountered. It is used by default by find_zero(f,a).

Roots.Order0 - Type
Order0()

The Order0 method is engineered to be a more robust, though possibly slower, alternative to the other derivative-free root-finding methods. The implementation roughly follows the algorithm described in Personal Calculator Has Key to Solve Any Equation $f(x) = 0$, the SOLVE button from the HP-34C. The basic idea is to use a secant step. If along the way a bracket is found, switch to a bracketing algorithm, using AlefeldPotraShi. If the secant step fails to decrease the function value, a quadratic step is used up to $4$ times.

This is not really $0$-order: the secant method has order $1.6...$ and the bracketing method has order $1.6180...$, so for reasonable starting points and functions, this algorithm should be superlinear, and relatively robust to non-reasonable starting points.

Rates of convergence

The order of a method is $q$, where $e_{i+1} \approx C\cdot e_i^q$. Newton's method is famously quadratic for simple roots; the secant method is of order $q \approx \varphi=1.618\dots$. However, $p=2$ function calls are needed for Newton's method, and only $p=1$ for the secant method. The asymptotic efficiency is $q^{1/p}$, which penalizes function calls. There are other order $k$ methods taking $k$ function calls per step, e.g., Halley's; others take fewer, as seen below. Many use inverse quadratic steps, others inverse cubic; these have order $q$ solving $q^{s+1}-2q^s+1 = 0$ ($s=3$ for quadratic). For robust methods, generally $1$ additional function call is needed to achieve the convergence rate, Schroder being a good example.
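For example, these quantities can be computed directly (an illustrative sketch; the calls below are not drawn from the table that follows):

efficiency(q, p) = q^(1/p)                  # asymptotic efficiency: order q with p function calls per step
efficiency(2, 2)                            # Newton ≈ 1.414…
efficiency((1 + sqrt(5))/2, 1)              # secant ≈ 1.618…

# order of an inverse quadratic step (s = 3): the root q > 1 of q^(s+1) - 2q^s + 1 = 0
find_zero(q -> q^4 - 2q^3 + 1, (1.5, 2))    # ≈ 1.84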

| Type | Method | Order | F evals | Asymptotic efficiency |
| --- | --- | --- | --- | --- |
| Hybrid | Order0 | | | $\approx 1.618\dots$ |
| Derivative Free | Secant | $\varphi=1.618\dots$ | $1$ | $1.618\dots$ |
| Derivative Free | Steffensen | $2$ | $2$ | $1.414\dots$ |
| Derivative Free | Order5 | $5$ | $4$ | $1.495\dots$ |
| Derivative Free | Order8 | $8$ | $4$ | $1.681\dots$ |
| Derivative Free | Order16 | $16$ | $5$ | $1.718\dots$ |
| Classical | Newton | $2$ | $2$ | $1.414\dots$ |
| Classical | Halley | $3$ | $3$ | $1.442\dots$ |
| Classical | QuadraticInverse | $3$ | $3$ | $1.442\dots$ |
| Classical | ChebyshevLike | $3$ | $3$ | $1.442\dots$ |
| Classical | SuperHalley | $3$ | $3$ | $1.442\dots$ |
| MultiStep | LithBoonkkampIJzerman{S,D} | $p^s=\sum p^k(d+\sigma_k)$ | $D+1$ | varies, $1.92\dots$ max |
| Bracketing | BisectionExact | $1$ | $1$ | $1$ |
| Bracketing | A42 | $(2 + 7^{1/2})$ | $3,4$ | $(2 + 7^{1/2})^{1/3} = 1.6686\dots$ |
| Bracketing | AlefeldPotraShi | | $3,4$ | $1.618\dots$ |
| Bracketing | Brent | $\leq 1.89\dots$ | $1$ | $\leq 1.89\dots$ |
| Bracketing | ITP | $\leq \varphi$ | $1$ | $\leq \varphi$ |
| Bracketing | Ridders | $1.83\dots$ | $2$ | $1.225\dots$ |
| Bracketing | FalsePosition | $1.442\dots$ | $1$ | $1.442\dots$ |
| Bracketing | LithBoonkkampIJzermanBracket | $2.91$ | $3$ | $1.427\dots$ |
| Robust | King | $\varphi=1.618\dots$ | $2$ | $1.272\dots$ |
| Robust | Esser | $2$ | $3$ | $1.259\dots$ |
| Robust | Schroder | $2$ | $3$ | $1.259\dots$ |
| Robust | Thukral3 | $3$ | $4$ | $1.316\dots$ |
| Robust | Thukral4 | $4$ | $5$ | $1.319\dots$ |
| Robust | Thukral5 | $5$ | $6$ | $1.307\dots$ |

Convergence

Identifying when an algorithm converges or diverges requires specifications of tolerances and convergence criteria.

In the case of exact bisection, convergence is mathematically guaranteed. For floating point numbers, either an exact zero is found, or the bracketing interval can be subdivided into $[a_n,b_n]$ with $a_n$ and $b_n$ being adjacent floating point values. That is, $b_n-a_n$ is as small as possible in floating point numbers. This can be considered a stopping criterion in $\Delta x$. For early termination (less precision but fewer function calls) a tolerance can be given so that if $\Delta_n=b_n-a_n$ is small enough the algorithm stops successfully. In floating point, assessing if $b_n \approx a_n$ requires two tolerances: a relative tolerance, as the minimal differences in floating point values depend on the size of $b_n$ and $a_n$, and an absolute tolerance for values near $0$. The values xrtol and xatol are passed to the Base.isapprox function to determine closeness.

Relying on the closeness of two $x$ values will not be adequate for all problems, as there are examples where the difference $\Delta_n=|x_n-x_{n-1}|$ can be quite small, $0$ even, yet $f(x_n)$ is not near a $0$. As such, for non-bracketing methods, a check on the size of $f(x_n)$ is also used. As we find floating point approximations to $\alpha$, the zero, we must consider values small when $f(\alpha(1+\epsilon))$ is small. By Taylor's approximation, we can expect this to be around $\alpha\cdot \epsilon \cdot f'(\alpha)$. That is, small depends on the size of $\alpha$ and the derivative at $\alpha$. The former is handled by both relative and absolute tolerances (rtol and atol). The size of $f'(\alpha)$ is problem dependent, and can be accommodated by larger relative or absolute tolerances.

When an algorithm returns a NaN value, it terminates. This can happen near convergence or may indicate some issues. Early termination is checked for convergence in the size of $f(x_n)$ with a relaxed tolerance when strict=false is specified (the default).

Relative tolerances and assessing `f(x) ≈ 0`

The use of relative tolerances to check if $f(x) \approx 0$ can lead to spurious answers where $x$ is very large (and hence the relative tolerance is large). The return of very large solutions should be checked against expectations of the answer.

Deciding if an algorithm won't terminate is done through counting the number of iterations performed; the limit is adjusted through maxiters. As most algorithms are superlinear, convergence happens rapidly near the answer, but all the algorithms can take a while to get near an answer, even when progress is made. As such, the maximum must be large enough to accommodate nearly linear cases, yet small enough to avoid too many steps when an algorithm is non-convergent.

Convergence criteria are method dependent and are determined by the Roots.assess_convergence methods.

Roots.assess_convergence - Function
Roots.assess_convergence(method, state, options)

Assess if algorithm has converged.

Return a convergence flag and a Boolean indicating if algorithm has terminated (converged or not converged)

If the algorithm hasn't converged, this returns (:not_converged, false).

If algorithm has stopped or converged, return flag and true. Flags are:

  • :x_converged if xn1 ≈ xn, typically with non-zero tolerances specified.

  • :f_converged if |f(xn1)| < max(atol, |xn1|*rtol)

  • :nan or :inf if fxn1 is NaN or an infinity.

  • :not_converged if algorithm should continue

Does not check number of steps taken nor number of function evaluations.

In decide_convergence, stopped values (and :x_converged when strict=false) are checked for convergence with a relaxed tolerance.

Default tolerances are specified through the Roots.default_tolerances methods.

Roots.default_tolerances - Function
default_tolerances(M::AbstractUnivariateZeroMethod, [T], [S])

The default tolerances for most methods are xatol=eps(T), xrtol=eps(T), atol=4eps(S), and rtol=4eps(S), with the proper units (absolute tolerances have the units of x and f(x); relative tolerances are unitless). For Complex{T} values, T is used.

The number of iterations is limited by maxiters=40.

default_tolerances(M::AbstractBisectionMethod, [T], [S])

For Bisection when the x values are of type Float64, Float32, or Float16, the default tolerances are zero and there is no limit on the number of iterations. In this case, the algorithm is guaranteed to converge to an exact zero, or a point where the function changes sign at one of the answer's adjacent floating point values.

For other types, default non-zero tolerances for xatol and xrtol are given.

Simplified versions

The abstractions and many checks for convergence employed by find_zero have a performance cost. When that is a critical concern, there are several "simple" methods provided which can offer improved performance.

Roots.secant_method - Function
secant_method(f, xs; [atol=0.0, rtol=8eps(), maxevals=1000])

Perform secant method to solve f(x) = 0.

The secant method is an iterative method with update step given by b - fb/m where m is the slope of the secant line between (a,fa) and (b,fb).

The initial values can be specified as a pair, as in (x₀, x₁) or [x₀, x₁], or as a single value, x₁, in which case a value of x₀ is chosen.

The algorithm returns m when abs(fm) <= max(atol, abs(m) * rtol). If this doesn't occur before maxevals steps or the algorithm encounters an issue, a value of NaN is returned. If too many steps are taken, the current value is checked to see if there is a sign change for neighboring floating point values.

The Order1 method for find_zero also implements the secant method. This one should be slightly faster, as there are fewer setup costs.

Examples:

Roots.secant_method(sin, (3,4))
Roots.secant_method(x -> x^5 -x - 1, 1.1)

Specialization

This function will specialize on the function f, so that the initial call can take more time than a call to the Order1() method, though subsequent calls will be much faster. Using FunctionWrappers.jl can ensure that the initial call is as fast as subsequent ones.

Roots.bisection - Function
bisection(f, a, b; [xatol, xrtol])

Performs bisection method to find a zero of a continuous function.

It is assumed that (a,b) is a bracket, that is, the function has different signs at a and b. The interval (a,b) is converted to floating point and shrunk when a or b is infinite. The function f may be infinite for the typical case. If f is not continuous, the algorithm may find jumping points over the x axis, not just zeros.

If non-trivial tolerances are specified, the process will terminate when the bracket (a,b) satisfies isapprox(a, b, atol=xatol, rtol=xrtol). For zero tolerances, the default, for Float64, Float32, or Float16 values, the process will terminate at a value x with f(x)=0 or f(x)*f(prevfloat(x)) < 0 or f(x) * f(nextfloat(x)) < 0. For other number types, the Roots.A42 method is used.
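For example (an illustrative sketch using only the signature above):

Roots.bisection(sin, 3, 4)                # ≈ π to full precision
Roots.bisection(sin, 3, 4; xatol=1e-4)    # early termination with a coarser answer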

Roots.muller - Function
muller(f, xᵢ; xatol=nothing, xrtol=nothing, maxevals=100)
muller(f, xᵢ₋₂, xᵢ₋₁, xᵢ; xatol=nothing, xrtol=nothing, maxevals=100)

Muller’s method generalizes the secant method, but uses quadratic interpolation among three points instead of linear interpolation between two. Solving for the zeros of the quadratic allows the method to find complex pairs of roots. Given three previous guesses for the root xᵢ₋₂, xᵢ₋₁, xᵢ, and the values of the polynomial f at those points, the next approximation xᵢ₊₁ is produced.

Excerpt and the algorithm taken from

W.H. Press, S.A. Teukolsky, W.T. Vetterling and B.P. Flannery Numerical Recipes in C, Cambridge University Press (2002), p. 371

Convergence here is decided by xᵢ₊₁ ≈ xᵢ using the tolerances specified, which both default to eps(one(typeof(abs(xᵢ))))^4/5 in the appropriate units. Each iteration performs three evaluations of f. The first method picks two remaining points at random in relative proximity of xᵢ.

Note that the method may return a complex result even for real initial values, as this depends on the function.

Examples:

muller(x->x^3-1, 0.5, 0.5im, -0.5) # → -0.500 + 0.866…im
muller(x->x^2+2, 0.0, 0.5, 1.0) # → ≈ 0.00 - 1.41…im
muller(x->(x-5)*x*(x+5), rand(3)...) # → ≈ 0.00
muller(x->x^3-1, 1.5, 1.0, 2.0) # → 2.0, Not converged

Roots.newton - Function
Roots.newton(f, fp, x0; kwargs...)

Implementation of Newton's method: xᵢ₊₁ = xᵢ - f(xᵢ)/f'(xᵢ).

Arguments:

  • f::Function – function to find zero of

  • fp::Function – the derivative of f.

  • x0::Number – initial guess. For Newton's method this may be complex.

With the ForwardDiff package derivatives may be computed automatically. For example, defining D(f) = x -> ForwardDiff.derivative(f, float(x)) allows D(f) to be used for the first derivative.

Keyword arguments are passed to find_zero using the Roots.Newton() method.

See also Roots.newton((f,fp), x0) and Roots.newton(fΔf, x0) for simpler implementations.
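For example (an illustrative sketch; the function is arbitrary and the D helper follows the suggestion above):

using ForwardDiff

D(f) = x -> ForwardDiff.derivative(f, float(x))
f(x) = x^3 - 2x - 5
Roots.newton(f, D(f), 2)   # ≈ 2.0945…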

Roots.dfree - Function
dfree(f, xs)

A more robust secant method implementation

Solve for f(x) = 0 using an algorithm from Personal Calculator Has Key to Solve Any Equation f(x) = 0, the SOLVE button from the HP-34C.

This is also implemented as the Order0 method for find_zero.

The initial values can be specified as a pair of two values, as in (a,b) or [a,b], or as a single value, in which case a value of b is computed, possibly from fb. The basic idea is to follow the secant method to convergence unless:

  • a bracket is found, in which case AlefeldPotraShi is used;

  • the secant method is not converging, in which case a few steps of a quadratic method are used to see if that improves matters.

Convergence occurs when f(m) == 0, there is a sign change between m and an adjacent floating point value, or f(m) <= 2^3*eps(m).

A value of NaN is returned if the algorithm takes too many steps before identifying a zero.

Examples

Roots.dfree(x -> x^5 - x - 1, 1.0)

MATLAB interface

The initial naming scheme used fzero and fzeros instead of find_zero and find_zeros, following the name of the MATLAB function fzero. This interface is not recommended, but, for now, still maintained.

Roots.fzero - Function
fzero(f, x0; order=0, kwargs...)
fzero(f, x0, M; kwargs...)
fzero(f, x0, M, N; kwargs...)
fzero(f, x0; kwargs...)
fzero(f, a::Number, b::Number; kwargs...)
fzero(f, a::Number, b::Number; order=?, kwargs...)
fzero(f, fp, a::Number; kwargs...)

Find zero of a function using one of several iterative algorithms.

  • f: a scalar function or callable object

  • x0: an initial guess, a scalar value or tuple of two values

  • order: An integer, symbol, or string indicating the algorithm to use for find_zero. The Order0 default may be specified directly by order=0, order=:0, or order="0"; Order1() by order=1, order=:1, order="1", or order=:secant; Order1B() by order="1B", etc.

  • M: a specific method, as would be passed to find_zero, bypassing the use of the order keyword

  • N: a specific bracketing method. When given, if a bracket is identified, method N will be used to finish instead of method M.

  • a, b: When two values are passed along, if no order value is specified, Bisection will be used over the bracketing interval (a,b). If an order value is specified, the value of x0 will be set to (a,b) and the specified method will be used.

  • fp: when fp is specified (assumed to compute the derivative of f), Newton's method will be used

  • kwargs...: See find_zero for the specification of tolerances and other keyword arguments

Examples:

fzero(sin, 3)                  # use Order0() method, the default
fzero(sin, 3, order=:secant)   # use secant method (also just `order=1`)
fzero(sin, 3, Roots.Order1B()) # use secant method variant for multiple roots.
fzero(sin, 3, 4)               # use bisection method over (3,4)
fzero(sin, 3, 4, xatol=1e-6)   # use bisection method until |x_n - x_{n-1}| <= 1e-6
fzero(sin, 3, 3.1, order=1)    # use secant method with x_0=3.0, x_1 = 3.1
fzero(sin, (3, 3.1), order=2)  # use Steffensen's method with x_0=3.0, x_1 = 3.1
fzero(sin, cos, 3)             # use Newton's method

Note

Unlike find_zero, fzero does not specialize on the type of the function argument. This has the advantage of making the first use of the function f faster, but subsequent uses slower.

Tracking iterations

It is possible to add the keyword argument verbose=true when calling the find_zero function to get detailed information about the solution and data from each iteration. If you want to save this data instead of just printing it, you can use a Tracks object.


Roots.Tracks - Type
Tracks(T, S)
Tracks()

Construct a Tracks object used to record the progress of the algorithm. T is the type of function inputs, and S is the type of function outputs. They both default to Float64. Note that because this type is not exported, you have to write Roots.Tracks() to construct a Tracks object.

By default, no tracking is done while finding a root. To change this, construct a Tracks object, and pass it to the keyword argument tracks. This will modify the Tracks object, storing the input and function values at each iteration, along with additional information about the root-finding process.

Tracks objects are shown in an easy-to-read format. Internally either a tuple of (x,f(x)) pairs or (aₙ, bₙ) pairs are stored, the latter for bracketing methods. (These implementation details may change without notice.) The methods empty!, to reset the Tracks object; get, to get the tracks; and last, to get the value converged to, may be of interest.

If you only want to print the information, but you don't need it later, this can conveniently be done by passing verbose=true to the root-finding function. This will not affect the return value, which will still be the root of the function.

Examples

julia> using Roots

julia> f(x) = x^2-2
f (generic function with 1 method)

julia> tracker = Roots.Tracks()
Algorithm has not been run

julia> find_zero(f, (0, 2), Roots.Secant(), tracks=tracker) ≈ √2
true

julia> tracker
Results of univariate zero finding:

* Converged to: 1.4142135623730947
* Algorithm: Secant()
* iterations: 7
* function evaluations ≈ 9
* stopped as |f(x_n)| ≤ max(δ, |x|⋅ϵ) using δ = atol, ϵ = rtol

Trace:
x₁ = 0,	 fx₁ = -2
x₂ = 2,	 fx₂ = 2
x₃ = 1,	 fx₃ = -1
x₄ = 1.3333333333333333,	 fx₄ = -0.22222222222222232
x₅ = 1.4285714285714286,	 fx₅ = 0.04081632653061229
x₆ = 1.4137931034482758,	 fx₆ = -0.0011890606420930094
x₇ = 1.4142114384748701,	 fx₇ = -6.0072868388605372e-06
x₈ = 1.4142135626888697,	 fx₈ = 8.9314555751229818e-10
x₉ = 1.4142135623730947,	 fx₉ = -8.8817841970012523e-16

julia> empty!(tracker)  # resets

julia> find_zero(sin, (3, 4), Roots.A42(), tracks=tracker) ≈ π
true

julia> get(tracker)
4-element Vector{NamedTuple{names, Tuple{Float64, Float64}} where names}:
 (a = 3.0, b = 4.0)
 (a = 3.0, b = 3.5)
 (a = 3.14156188905231, b = 3.1416247553172956)
 (a = 3.141592653589793, b = 3.1415926535897936)

julia> last(tracker)
3.141592653589793

Note

As designed, the Tracks object may not record actions taken while the state object is initialized. An example is the default bisection algorithm where an initial midpoint is found to ensure the bracket does not straddle $0$.