# User guide

The list of functions on this page is the officially supported differentiation interface in `AbstractDifferentiation`.

## Loading `AbstractDifferentiation`

To load `AbstractDifferentiation`, it is recommended to use `import AbstractDifferentiation as AD`. With the `AD` alias you can access names inside of `AbstractDifferentiation` using `AD.<name>` instead of typing the long name `AbstractDifferentiation`.

## `AbstractDifferentiation` backends

To use `AbstractDifferentiation`, first construct a backend instance `ab::AD.AbstractBackend` using your favorite differentiation package in Julia that supports `AbstractDifferentiation`.

Here's an example:

```
julia> import AbstractDifferentiation as AD, Zygote

julia> backend = AD.ZygoteBackend();

julia> f(x) = log(sum(exp, x));

julia> AD.gradient(backend, f, collect(1:3))
([0.09003057317038046, 0.2447284710547977, 0.665240955774822],)
```

The following backends are temporarily made available by `AbstractDifferentiation` as soon as their corresponding package is loaded (thanks to weak dependencies on Julia ≥ 1.9 and Requires.jl on older Julia versions):

`AbstractDifferentiation.ReverseDiffBackend` — Type

`ReverseDiffBackend`

AD backend that uses reverse mode with ReverseDiff.jl.

To be able to use this backend, you have to load ReverseDiff.

`AbstractDifferentiation.ReverseRuleConfigBackend` — Type

`ReverseRuleConfigBackend`

AD backend that uses reverse mode with any ChainRulesCore.jl-compatible reverse-mode AD package.

Constructed with a `RuleConfig` object:

`backend = AD.ReverseRuleConfigBackend(rc)`

On Julia >= 1.9, you have to load ChainRulesCore (possibly implicitly by loading a ChainRules-compatible AD package) to be able to use this backend.

`AbstractDifferentiation.FiniteDifferencesBackend` — Type

`FiniteDifferencesBackend{M}`

AD backend that uses forward mode with FiniteDifferences.jl.

The type parameter `M` is the type of the method used to perform finite differences.

To be able to use this backend, you have to load FiniteDifferences.

`AbstractDifferentiation.ZygoteBackend` — Function

`ZygoteBackend()`

Create an AD backend that uses reverse mode with Zygote.jl.

It is a special case of `ReverseRuleConfigBackend`.

To be able to use this backend, you have to load Zygote.

`AbstractDifferentiation.ForwardDiffBackend` — Type

`ForwardDiffBackend{CS}`

AD backend that uses forward mode with ForwardDiff.jl.

The type parameter `CS` denotes the chunk size of the differentiation algorithm. If it is `Nothing`, then ForwardDiff uses a heuristic to set the chunk size based on the input.

See also: ForwardDiff.jl: Configuring Chunk Size

To be able to use this backend, you have to load ForwardDiff.

`AbstractDifferentiation.TrackerBackend` — Type

`TrackerBackend`

AD backend that uses reverse mode with Tracker.jl.

To be able to use this backend, you have to load Tracker.

In the long term, these backend objects (and many more) will be defined within their respective packages to enforce the `AbstractDifferentiation` interface. This is already the case for:

- `Diffractor.DiffractorForwardBackend()` for Diffractor.jl in forward mode

For higher order derivatives, you can build higher order backends using `AD.HigherOrderBackend`.

`AbstractDifferentiation.HigherOrderBackend` — Type

`AD.HigherOrderBackend{B}`

Let `ab_f` be a forward-mode automatic differentiation backend and let `ab_r` be a reverse-mode automatic differentiation backend. To construct a higher order backend for doing forward-over-reverse-mode automatic differentiation, use `AD.HigherOrderBackend((ab_f, ab_r))`. To construct a higher order backend for doing reverse-over-forward-mode automatic differentiation, use `AD.HigherOrderBackend((ab_r, ab_f))`.

**Fields**

- `backends::B`
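
As a minimal sketch (assuming ForwardDiff and Zygote are installed), the two backends shown on this page can be combined into a forward-over-reverse higher order backend:

```
import AbstractDifferentiation as AD, ForwardDiff, Zygote

ab_f = AD.ForwardDiffBackend()  # forward-mode backend
ab_r = AD.ZygoteBackend()       # reverse-mode backend

# forward-over-reverse, per the tuple order documented above
hob = AD.HigherOrderBackend((ab_f, ab_r))
```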

## Derivatives

The following list of functions can be used to request the derivative, gradient, Jacobian, second derivative, or Hessian without the function value.

`AbstractDifferentiation.derivative` — Function

`AD.derivative(ab::AD.AbstractBackend, f, xs::Number...)`

Compute the derivatives of `f` with respect to the numbers `xs` using the backend `ab`.

The function returns a `Tuple` of derivatives, one for each element in `xs`.
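
As a brief illustration (reusing the `backend` constructed above), a function of two numbers yields one partial derivative per input:

```
julia> AD.derivative(backend, (x, y) -> x * y, 2.0, 3.0)
(3.0, 2.0)
```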

`AbstractDifferentiation.gradient` — Function

`AD.gradient(ab::AD.AbstractBackend, f, xs...)`

Compute the gradients of `f` with respect to the inputs `xs` using the backend `ab`.

The function returns a `Tuple` of gradients, one for each element in `xs`.

`AbstractDifferentiation.jacobian` — Function

`AD.jacobian(ab::AD.AbstractBackend, f, xs...)`

Compute the Jacobians of `f` with respect to the inputs `xs` using the backend `ab`.

The function returns a `Tuple` of Jacobians, one for each element in `xs`.
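
As a sketch (again with the `backend` from above), here is the Jacobian of a small two-output function `g`; each row differentiates one output:

```
julia> g(x) = [sum(x), prod(x)];

julia> AD.jacobian(backend, g, [1.0, 2.0, 3.0])
([1.0 1.0 1.0; 6.0 3.0 2.0],)
```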

`AbstractDifferentiation.second_derivative` — Function

`AD.second_derivative(ab::AD.AbstractBackend, f, x)`

Compute the second derivative of `f` with respect to the input `x` using the backend `ab`.

The function returns a single value because `second_derivative` currently only supports a single input.

`AbstractDifferentiation.hessian` — Function

`AD.hessian(ab::AD.AbstractBackend, f, x)`

Compute the Hessian of `f` with respect to the input `x` using the backend `ab`.

The function returns a single matrix because `hessian` currently only supports a single input.
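
A minimal sketch (assuming ForwardDiff is installed), since a forward-mode backend can differentiate its own derivatives:

```
import AbstractDifferentiation as AD, ForwardDiff

fd = AD.ForwardDiffBackend()
h(x) = sum(abs2, x)               # h(x) = x₁² + x₂²

H = AD.hessian(fd, h, [1.0, 2.0]) # Hessian of h at [1.0, 2.0], here 2I
```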

## Value and derivatives

The following list of functions can be used to request the function value along with its derivative, gradient, Jacobian, second derivative, or Hessian. You can also request the function value, its derivative (or its gradient) and its second derivative (or Hessian) for single-input functions.

`AbstractDifferentiation.value_and_derivative` — Function

`AD.value_and_derivative(ab::AD.AbstractBackend, f, xs::Number...)`

Return the tuple `(v, ds)` of the function value `v = f(xs...)` and the derivatives `ds = AD.derivative(ab, f, xs...)`.

See also `AbstractDifferentiation.derivative`.

`AbstractDifferentiation.value_and_gradient` — Function

`AD.value_and_gradient(ab::AD.AbstractBackend, f, xs...)`

Return the tuple `(v, gs)` of the function value `v = f(xs...)` and the gradients `gs = AD.gradient(ab, f, xs...)`.

See also `AbstractDifferentiation.gradient`.
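
When both quantities are needed, a single call avoids evaluating `f` twice. A sketch reusing `backend` and `f` from the first example:

```
v, gs = AD.value_and_gradient(backend, f, collect(1:3))
# v  == f([1, 2, 3])
# gs == AD.gradient(backend, f, collect(1:3))
```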

`AbstractDifferentiation.value_and_jacobian` — Function

`AD.value_and_jacobian(ab::AD.AbstractBackend, f, xs...)`

Return the tuple `(v, Js)` of the function value `v = f(xs...)` and the Jacobians `Js = AD.jacobian(ab, f, xs...)`.

See also `AbstractDifferentiation.jacobian`.

`AbstractDifferentiation.value_and_second_derivative` — Function

`AD.value_and_second_derivative(ab::AD.AbstractBackend, f, x)`

Return the tuple `(v, d2)` of the function value `v = f(x)` and the second derivative `d2 = AD.second_derivative(ab, f, x)`.

`AbstractDifferentiation.value_and_hessian` — Function

`AD.value_and_hessian(ab::AD.AbstractBackend, f, x)`

Return the tuple `(v, H)` of the function value `v = f(x)` and the Hessian `H = AD.hessian(ab, f, x)`.

See also `AbstractDifferentiation.hessian`.

`AbstractDifferentiation.value_derivative_and_second_derivative` — Function

`AD.value_derivative_and_second_derivative(ab::AD.AbstractBackend, f, x)`

Return the tuple `(v, d, d2)` of the function value `v = f(x)`, the first derivative `d = AD.derivative(ab, f, x)`, and the second derivative `d2 = AD.second_derivative(ab, f, x)`.

`AbstractDifferentiation.value_gradient_and_hessian` — Function

`AD.value_gradient_and_hessian(ab::AD.AbstractBackend, f, x)`

Return the tuple `(v, g, H)` of the function value `v = f(x)`, the gradient `g = AD.gradient(ab, f, x)`, and the Hessian `H = AD.hessian(ab, f, x)`.

See also `AbstractDifferentiation.gradient` and `AbstractDifferentiation.hessian`.
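
A sketch combining this function with the higher order backend `hob` constructed earlier (forward-over-reverse is a common choice for Hessians):

```
# value, gradient, and Hessian of the log-sum-exp function in one call
v, grad, H = AD.value_gradient_and_hessian(hob, x -> log(sum(exp, x)), [1.0, 2.0, 3.0])
```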

## Jacobian-vector products

This operation goes by a few names, like "pushforward". Refer to the ChainRules documentation for more on terminology. For a single input, single output function `f` with a Jacobian `J`, the pushforward operator `pf_f` is equivalent to applying the function `v -> J * v` on a (tangent) vector `v`.

The following functions can be used to request a function that returns the pushforward operator/function. In order to request the pushforward function `pf_f` of a function `f` at the inputs `xs`, you can use either of:

`AbstractDifferentiation.pushforward_function` — Function

`AD.pushforward_function(ab::AD.AbstractBackend, f, xs...)`

Return the pushforward function `pff` of the function `f` at the inputs `xs` using backend `ab`.

The pushforward function `pff` accepts as input a `Tuple` of tangents, one for each element in `xs`. If `xs` consists of a single element, `pff` can also accept a single tangent instead of a 1-tuple.
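
For example, applying the pushforward to a basis tangent extracts one column of the Jacobian. A sketch reusing the ForwardDiff backend `fd` and the two-output function `g` from earlier:

```
pf_g = AD.pushforward_function(fd, g, [1.0, 2.0, 3.0])

# Jacobian-vector product J * e₁, the first column of g's Jacobian
Jv = pf_g(([1.0, 0.0, 0.0],))
```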

`AbstractDifferentiation.value_and_pushforward_function` — Function

`AD.value_and_pushforward_function(ab::AD.AbstractBackend, f, xs...)`

Return a single function `vpff` which, given tangents `ts`, computes the tuple `(v, p) = vpff(ts)` composed of

- the function value `v = f(xs...)`
- the pushforward value `p = pff(ts)` given by the pushforward function `pff = AD.pushforward_function(ab, f, xs...)` applied to `ts`

See also `AbstractDifferentiation.pushforward_function`.

This name should be understood as "(value and pushforward) function", and thus is not aligned with the reverse mode counterpart `AbstractDifferentiation.value_and_pullback_function`.
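
A brief sketch continuing the previous example:

```
vpff = AD.value_and_pushforward_function(fd, g, [1.0, 2.0, 3.0])
v, p = vpff(([1.0, 0.0, 0.0],))  # v == g([1.0, 2.0, 3.0]); p as from pf_g above
```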

## Vector-Jacobian products

This operation goes by a few names, like "pullback". Refer to the ChainRules documentation for more on terminology. For a single input, single output function `f` with a Jacobian `J`, the pullback operator `pb_f` is equivalent to applying the function `v -> v' * J` on a (co-tangent) vector `v`.

The following functions can be used to request the pullback operator/function with or without the function value. In order to request the pullback function `pb_f` of a function `f` at the inputs `xs`, you can use either of:

`AbstractDifferentiation.pullback_function` — Function

`AD.pullback_function(ab::AD.AbstractBackend, f, xs...)`

Return the pullback function `pbf` of the function `f` at the inputs `xs` using backend `ab`.

The pullback function `pbf` accepts as input a `Tuple` of cotangents, one for each output of `f`. If `f` has a single output, `pbf` can also accept a single input instead of a 1-tuple.
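
Dually to the pushforward, applying the pullback to a basis cotangent extracts one row of the Jacobian. A sketch reusing `backend` and `g` from earlier:

```
pb_g = AD.pullback_function(backend, g, [1.0, 2.0, 3.0])

# w' * J with w = e₁: the gradient of the first output, sum(x)
wJ = pb_g(([1.0, 0.0],))
```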

`AbstractDifferentiation.value_and_pullback_function` — Function

`AD.value_and_pullback_function(ab::AD.AbstractBackend, f, xs...)`

Return a tuple `(v, pbf)` of the function value `v = f(xs...)` and the pullback function `pbf = AD.pullback_function(ab, f, xs...)`.

See also `AbstractDifferentiation.pullback_function`.

This name should be understood as "value and (pullback function)", and thus is not aligned with the forward mode counterpart `AbstractDifferentiation.value_and_pushforward_function`.
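
A brief sketch; note that, unlike the forward-mode counterpart, the function value is returned immediately alongside the pullback function:

```
v, pbf = AD.value_and_pullback_function(backend, g, [1.0, 2.0, 3.0])
wJ = pbf(([0.0, 1.0],))  # cotangent selecting the second output, prod(x)
```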

## Lazy operators

You can also get a struct for the lazy derivative/gradient/Jacobian/Hessian of a function. You can then use the `*` operator to apply the lazy operator on a value or tuple of the correct shape. To get a lazy derivative/gradient/Jacobian/Hessian use any one of:

`AbstractDifferentiation.lazy_derivative` — Function

`AD.lazy_derivative(ab::AbstractBackend, f, xs::Number...)`

Return an operator `ld` for multiplying by the derivative of `f` at `xs`.

You can apply the operator by multiplication, e.g. `ld * y`, where `y` is a number if `f` has a single input, a tuple of the same length as `xs` if `f` has multiple inputs, or an array of numbers/tuples.

`AbstractDifferentiation.lazy_gradient` — Function

`AD.lazy_gradient(ab::AbstractBackend, f, xs...)`

Return an operator `lg` for multiplying by the gradient of `f` at `xs`.

You can apply the operator by multiplication, e.g. `lg * y`, where `y` is a number if `f` has a single input or a tuple of the same length as `xs` if `f` has multiple inputs.

`AbstractDifferentiation.lazy_jacobian` — Function

`AD.lazy_jacobian(ab::AbstractBackend, f, xs...)`

Return an operator `lj` for multiplying by the Jacobian of `f` at `xs`.

You can apply the operator by multiplication, e.g. `lj * y` or `y' * lj`, where `y` is a number, vector or tuple of numbers and/or vectors. If `f` has multiple inputs, `y` in `lj * y` should be a tuple. If `f` has multiple outputs, `y` in `y' * lj` should be a tuple. Otherwise, it should be a scalar or a vector of the appropriate length.
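
As a sketch (reusing `backend` and `g`), the lazy Jacobian computes products without materializing the full matrix:

```
lj = AD.lazy_jacobian(backend, g, [1.0, 2.0, 3.0])

Jv = lj * [1.0, 0.0, 0.0]  # Jacobian-vector product (first column of J)
wJ = [1.0, 0.0]' * lj      # vector-Jacobian product (first row of J)
```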

`AbstractDifferentiation.lazy_hessian` — Function

`AD.lazy_hessian(ab::AbstractBackend, f, x)`

Return an operator `lh` for multiplying by the Hessian of the scalar-valued function `f` at `x`.

You can apply the operator by multiplication, e.g. `lh * y` or `y' * lh`, where `y` is a number or a vector of the appropriate length.
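
A sketch reusing `fd` and `h` from the Hessian example; Hessian-vector products are useful when the full Hessian is too large to materialize:

```
lh = AD.lazy_hessian(fd, h, [1.0, 2.0])
Hy = lh * [1.0, 0.0]  # Hessian-vector product, here 2 * e₁
```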

## Index

- `AbstractDifferentiation.FiniteDifferencesBackend`
- `AbstractDifferentiation.ForwardDiffBackend`
- `AbstractDifferentiation.HigherOrderBackend`
- `AbstractDifferentiation.ReverseDiffBackend`
- `AbstractDifferentiation.ReverseRuleConfigBackend`
- `AbstractDifferentiation.TrackerBackend`
- `AbstractDifferentiation.ZygoteBackend`
- `AbstractDifferentiation.derivative`
- `AbstractDifferentiation.gradient`
- `AbstractDifferentiation.hessian`
- `AbstractDifferentiation.jacobian`
- `AbstractDifferentiation.lazy_derivative`
- `AbstractDifferentiation.lazy_gradient`
- `AbstractDifferentiation.lazy_hessian`
- `AbstractDifferentiation.lazy_jacobian`
- `AbstractDifferentiation.pullback_function`
- `AbstractDifferentiation.pushforward_function`
- `AbstractDifferentiation.second_derivative`
- `AbstractDifferentiation.value_and_derivative`
- `AbstractDifferentiation.value_and_gradient`
- `AbstractDifferentiation.value_and_hessian`
- `AbstractDifferentiation.value_and_jacobian`
- `AbstractDifferentiation.value_and_pullback_function`
- `AbstractDifferentiation.value_and_pushforward_function`
- `AbstractDifferentiation.value_and_second_derivative`
- `AbstractDifferentiation.value_derivative_and_second_derivative`
- `AbstractDifferentiation.value_gradient_and_hessian`