# User guide

The list of functions on this page is the officially supported differentiation interface in `AbstractDifferentiation`.

## Loading `AbstractDifferentiation`

To load `AbstractDifferentiation`, it is recommended to use `import AbstractDifferentiation as AD`. With the `AD` alias you can access names inside of `AbstractDifferentiation` using `AD.<name>` instead of typing the long name `AbstractDifferentiation`.

## `AbstractDifferentiation` backends

To use `AbstractDifferentiation`, first construct a backend instance `ab::AD.AbstractBackend` using your favorite differentiation package in Julia that supports `AbstractDifferentiation`.

Here's an example:

```
julia> import AbstractDifferentiation as AD, Zygote

julia> backend = AD.ZygoteBackend();

julia> f(x) = log(sum(exp, x));

julia> AD.gradient(backend, f, collect(1:3))
([0.09003057317038046, 0.2447284710547977, 0.665240955774822],)
```

The following backends are temporarily made available by `AbstractDifferentiation` as soon as their corresponding package is loaded (thanks to weak dependencies on Julia ≥ 1.9 and Requires.jl on older Julia versions):

`AbstractDifferentiation.ReverseDiffBackend` — Type

`ReverseDiffBackend`

AD backend that uses reverse mode with ReverseDiff.jl.

To be able to use this backend, you have to load ReverseDiff.

`AbstractDifferentiation.ReverseRuleConfigBackend` — Type

`ReverseRuleConfigBackend`

AD backend that uses reverse mode with any ChainRulesCore.jl-compatible reverse-mode AD package.

Constructed with a `RuleConfig` object:

`backend = AD.ReverseRuleConfigBackend(rc)`

On Julia >= 1.9, you have to load ChainRulesCore (possibly implicitly by loading a ChainRules-compatible AD package) to be able to use this backend.

`AbstractDifferentiation.FiniteDifferencesBackend` — Type

`FiniteDifferencesBackend{M}`

AD backend that uses forward mode with FiniteDifferences.jl.

The type parameter `M` is the type of the method used to perform finite differences.

To be able to use this backend, you have to load FiniteDifferences.

`AbstractDifferentiation.ZygoteBackend` — Function

`ZygoteBackend()`

Create an AD backend that uses reverse mode with Zygote.jl.

It is a special case of `ReverseRuleConfigBackend`.

To be able to use this backend, you have to load Zygote.

`AbstractDifferentiation.ForwardDiffBackend` — Type

`ForwardDiffBackend{CS}`

AD backend that uses forward mode with ForwardDiff.jl.

The type parameter `CS` denotes the chunk size of the differentiation algorithm. If it is `Nothing`, then ForwardDiff uses a heuristic to set the chunk size based on the input.

See also: ForwardDiff.jl: Configuring Chunk Size

To be able to use this backend, you have to load ForwardDiff.

`AbstractDifferentiation.TrackerBackend` — Type

`TrackerBackend`

AD backend that uses reverse mode with Tracker.jl.

To be able to use this backend, you have to load Tracker.

In the long term, these backend objects (and many more) will be defined within their respective packages to enforce the `AbstractDifferentiation` interface. This is already the case for:

- `Diffractor.DiffractorForwardBackend()` for Diffractor.jl in forward mode

For higher order derivatives, you can build higher order backends using `AD.HigherOrderBackend`.

`AbstractDifferentiation.HigherOrderBackend` — Type

`AD.HigherOrderBackend{B}`

Let `ab_f` be a forward-mode automatic differentiation backend and let `ab_r` be a reverse-mode automatic differentiation backend. To construct a higher order backend for doing forward-over-reverse-mode automatic differentiation, use `AD.HigherOrderBackend((ab_f, ab_r))`. To construct a higher order backend for doing reverse-over-forward-mode automatic differentiation, use `AD.HigherOrderBackend((ab_r, ab_f))`.

**Fields**

- `backends::B`
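
As a minimal sketch (assuming ForwardDiff and Zygote are installed), the two backends shown on this page can be combined into a forward-over-reverse higher order backend:

```
import AbstractDifferentiation as AD, ForwardDiff, Zygote

ab_f = AD.ForwardDiffBackend()  # forward-mode backend
ab_r = AD.ZygoteBackend()       # reverse-mode backend

# forward-over-reverse, per the tuple order documented above
hob = AD.HigherOrderBackend((ab_f, ab_r))
```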

## Derivatives

The following list of functions can be used to request the derivative, gradient, Jacobian, second derivative, or Hessian without the function value.

`AbstractDifferentiation.derivative` — Function

`AD.derivative(ab::AD.AbstractBackend, f, xs::Number...)`

Compute the derivatives of `f` with respect to the numbers `xs` using the backend `ab`.

The function returns a `Tuple` of derivatives, one for each element in `xs`.
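
As a brief illustration (reusing the `backend` constructed above), a function of two numbers yields one partial derivative per input:

```
julia> AD.derivative(backend, (x, y) -> x * y, 2.0, 3.0)
(3.0, 2.0)
```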

`AbstractDifferentiation.gradient` — Function

`AD.gradient(ab::AD.AbstractBackend, f, xs...)`

Compute the gradients of `f` with respect to the inputs `xs` using the backend `ab`.

The function returns a `Tuple` of gradients, one for each element in `xs`.

`AbstractDifferentiation.jacobian` — Function

`AD.jacobian(ab::AD.AbstractBackend, f, xs...)`

Compute the Jacobians of `f` with respect to the inputs `xs` using the backend `ab`.

The function returns a `Tuple` of Jacobians, one for each element in `xs`.
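
As a sketch (again with the `backend` from above), here is the Jacobian of a small two-output function `g`; each row differentiates one output:

```
julia> g(x) = [sum(x), prod(x)];

julia> AD.jacobian(backend, g, [1.0, 2.0, 3.0])
([1.0 1.0 1.0; 6.0 3.0 2.0],)
```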

`AbstractDifferentiation.second_derivative` — Function

`AD.second_derivative(ab::AD.AbstractBackend, f, x)`

Compute the second derivative of `f` with respect to the input `x` using the backend `ab`.

The function returns a single value because `second_derivative` currently only supports a single input.

`AbstractDifferentiation.hessian` — Function

`AD.hessian(ab::AD.AbstractBackend, f, x)`

Compute the Hessian of `f` with respect to the input `x` using the backend `ab`.

The function returns a single matrix because `hessian` currently only supports a single input.
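
A minimal sketch (assuming ForwardDiff is installed), since a forward-mode backend can differentiate its own derivatives:

```
import AbstractDifferentiation as AD, ForwardDiff

fd = AD.ForwardDiffBackend()
h(x) = sum(abs2, x)               # h(x) = x₁² + x₂²

H = AD.hessian(fd, h, [1.0, 2.0]) # Hessian of h at [1.0, 2.0], here 2I
```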

## Value and derivatives

The following list of functions can be used to request the function value along with its derivative, gradient, Jacobian, second derivative, or Hessian. You can also request the function value, its derivative (or its gradient) and its second derivative (or Hessian) for single-input functions.

`AbstractDifferentiation.value_and_derivative` — Function

`AD.value_and_derivative(ab::AD.AbstractBackend, f, xs::Number...)`

Return the tuple `(v, ds)` of the function value `v = f(xs...)` and the derivatives `ds = AD.derivative(ab, f, xs...)`.

See also `AbstractDifferentiation.derivative`.

`AbstractDifferentiation.value_and_gradient` — Function

`AD.value_and_gradient(ab::AD.AbstractBackend, f, xs...)`

Return the tuple `(v, gs)` of the function value `v = f(xs...)` and the gradients `gs = AD.gradient(ab, f, xs...)`.

See also `AbstractDifferentiation.gradient`.
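
When both quantities are needed, a single call avoids evaluating `f` twice. A sketch reusing `backend` and `f` from the first example:

```
v, gs = AD.value_and_gradient(backend, f, collect(1:3))
# v  == f([1, 2, 3])
# gs == AD.gradient(backend, f, collect(1:3))
```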

`AbstractDifferentiation.value_and_jacobian` — Function

`AD.value_and_jacobian(ab::AD.AbstractBackend, f, xs...)`

Return the tuple `(v, Js)` of the function value `v = f(xs...)` and the Jacobians `Js = AD.jacobian(ab, f, xs...)`.

See also `AbstractDifferentiation.jacobian`.

`AbstractDifferentiation.value_and_second_derivative` — Function

`AD.value_and_second_derivative(ab::AD.AbstractBackend, f, x)`

Return the tuple `(v, d2)` of the function value `v = f(x)` and the second derivative `d2 = AD.second_derivative(ab, f, x)`.

`AbstractDifferentiation.value_and_hessian` — Function

`AD.value_and_hessian(ab::AD.AbstractBackend, f, x)`

Return the tuple `(v, H)` of the function value `v = f(x)` and the Hessian `H = AD.hessian(ab, f, x)`.

See also `AbstractDifferentiation.hessian`.

`AbstractDifferentiation.value_derivative_and_second_derivative` — Function

`AD.value_derivative_and_second_derivative(ab::AD.AbstractBackend, f, x)`

Return the tuple `(v, d, d2)` of the function value `v = f(x)`, the first derivative `d = AD.derivative(ab, f, x)`, and the second derivative `d2 = AD.second_derivative(ab, f, x)`.

`AbstractDifferentiation.value_gradient_and_hessian` — Function

`AD.value_gradient_and_hessian(ab::AD.AbstractBackend, f, x)`

Return the tuple `(v, g, H)` of the function value `v = f(x)`, the gradient `g = AD.gradient(ab, f, x)`, and the Hessian `H = AD.hessian(ab, f, x)`.

See also `AbstractDifferentiation.gradient` and `AbstractDifferentiation.hessian`.
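
A sketch combining this function with the higher order backend `hob` constructed earlier (forward-over-reverse is a common choice for Hessians):

```
# value, gradient, and Hessian of the log-sum-exp function in one call
v, grad, H = AD.value_gradient_and_hessian(hob, x -> log(sum(exp, x)), [1.0, 2.0, 3.0])
```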

## Jacobian-vector products

This operation goes by a few names, like "pushforward". Refer to the ChainRules documentation for more on terminology. For a single input, single output function `f` with a Jacobian `J`, the pushforward operator `pf_f` is equivalent to applying the function `v -> J * v` on a (tangent) vector `v`.

The following functions can be used to request a function that returns the pushforward operator/function. In order to request the pushforward function `pf_f` of a function `f` at the inputs `xs`, you can use either of:

`AbstractDifferentiation.pushforward_function` — Function

`AD.pushforward_function(ab::AD.AbstractBackend, f, xs...)`

Return the pushforward function `pff` of the function `f` at the inputs `xs` using backend `ab`.

The pushforward function `pff` accepts as input a `Tuple` of tangents, one for each element in `xs`. If `xs` consists of a single element, `pff` can also accept a single tangent instead of a 1-tuple.
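
For example, applying the pushforward to a basis tangent extracts one column of the Jacobian. A sketch reusing the ForwardDiff backend `fd` and the two-output function `g` from earlier:

```
pf_g = AD.pushforward_function(fd, g, [1.0, 2.0, 3.0])

# Jacobian-vector product J * e₁, the first column of g's Jacobian
Jv = pf_g(([1.0, 0.0, 0.0],))
```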

`AbstractDifferentiation.value_and_pushforward_function` — Function

`AD.value_and_pushforward_function(ab::AD.AbstractBackend, f, xs...)`

Return a single function `vpff` which, given tangents `ts`, computes the tuple `(v, p) = vpff(ts)` composed of

- the function value `v = f(xs...)`
- the pushforward value `p = pff(ts)` given by the pushforward function `pff = AD.pushforward_function(ab, f, xs...)` applied to `ts`

See also `AbstractDifferentiation.pushforward_function`.

This name should be understood as "(value and pushforward) function", and thus is not aligned with the reverse mode counterpart `AbstractDifferentiation.value_and_pullback_function`.
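
A brief sketch continuing the previous example:

```
vpff = AD.value_and_pushforward_function(fd, g, [1.0, 2.0, 3.0])
v, p = vpff(([1.0, 0.0, 0.0],))  # v == g([1.0, 2.0, 3.0]); p as from pf_g above
```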

## Vector-Jacobian products

This operation goes by a few names, like "pullback". Refer to the ChainRules documentation for more on terminology. For a single input, single output function `f` with a Jacobian `J`, the pullback operator `pb_f` is equivalent to applying the function `v -> v' * J` on a (co-tangent) vector `v`.

The following functions can be used to request the pullback operator/function with or without the function value. In order to request the pullback function `pb_f` of a function `f` at the inputs `xs`, you can use either of:

`AbstractDifferentiation.pullback_function` — Function

`AD.pullback_function(ab::AD.AbstractBackend, f, xs...)`

Return the pullback function `pbf` of the function `f` at the inputs `xs` using backend `ab`.

The pullback function `pbf` accepts as input a `Tuple` of cotangents, one for each output of `f`. If `f` has a single output, `pbf` can also accept a single input instead of a 1-tuple.
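
Dually to the pushforward, applying the pullback to a basis cotangent extracts one row of the Jacobian. A sketch reusing `backend` and `g` from earlier:

```
pb_g = AD.pullback_function(backend, g, [1.0, 2.0, 3.0])

# w' * J with w = e₁: the gradient of the first output, sum(x)
wJ = pb_g(([1.0, 0.0],))
```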

`AbstractDifferentiation.value_and_pullback_function` — Function

`AD.value_and_pullback_function(ab::AD.AbstractBackend, f, xs...)`

Return a tuple `(v, pbf)` of the function value `v = f(xs...)` and the pullback function `pbf = AD.pullback_function(ab, f, xs...)`.

See also `AbstractDifferentiation.pullback_function`.

This name should be understood as "value and (pullback function)", and thus is not aligned with the forward mode counterpart `AbstractDifferentiation.value_and_pushforward_function`.
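
A brief sketch; note that, unlike the forward-mode counterpart, the function value is returned immediately alongside the pullback function:

```
v, pbf = AD.value_and_pullback_function(backend, g, [1.0, 2.0, 3.0])
wJ = pbf(([0.0, 1.0],))  # cotangent selecting the second output, prod(x)
```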

## Lazy operators

You can also get a struct for the lazy derivative/gradient/Jacobian/Hessian of a function. You can then use the `*` operator to apply the lazy operator on a value or tuple of the correct shape. To get a lazy derivative/gradient/Jacobian/Hessian use any one of:

`AbstractDifferentiation.lazy_derivative` — Function

`AD.lazy_derivative(ab::AbstractBackend, f, xs::Number...)`

Return an operator `ld` for multiplying by the derivative of `f` at `xs`.

You can apply the operator by multiplication, e.g. `ld * y`, where `y` is a number if `f` has a single input, a tuple of the same length as `xs` if `f` has multiple inputs, or an array of numbers/tuples.

`AbstractDifferentiation.lazy_gradient` — Function

`AD.lazy_gradient(ab::AbstractBackend, f, xs...)`

Return an operator `lg` for multiplying by the gradient of `f` at `xs`.

You can apply the operator by multiplication, e.g. `lg * y`, where `y` is a number if `f` has a single input or a tuple of the same length as `xs` if `f` has multiple inputs.

`AbstractDifferentiation.lazy_jacobian` — Function

`AD.lazy_jacobian(ab::AbstractBackend, f, xs...)`

Return an operator `lj` for multiplying by the Jacobian of `f` at `xs`.

You can apply the operator by multiplication, e.g. `lj * y` or `y' * lj`, where `y` is a number, vector or tuple of numbers and/or vectors. If `f` has multiple inputs, `y` in `lj * y` should be a tuple. If `f` has multiple outputs, `y` in `y' * lj` should be a tuple. Otherwise, it should be a scalar or a vector of the appropriate length.
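
As a sketch (reusing `backend` and `g`), the lazy Jacobian computes products without materializing the full matrix:

```
lj = AD.lazy_jacobian(backend, g, [1.0, 2.0, 3.0])

Jv = lj * [1.0, 0.0, 0.0]  # Jacobian-vector product (first column of J)
wJ = [1.0, 0.0]' * lj      # vector-Jacobian product (first row of J)
```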

`AbstractDifferentiation.lazy_hessian` — Function

`AD.lazy_hessian(ab::AbstractBackend, f, x)`

Return an operator `lh` for multiplying by the Hessian of the scalar-valued function `f` at `x`.

You can apply the operator by multiplication, e.g. `lh * y` or `y' * lh`, where `y` is a number or a vector of the appropriate length.
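
A sketch reusing `fd` and `h` from the Hessian example; Hessian-vector products are useful when the full Hessian is too large to materialize:

```
lh = AD.lazy_hessian(fd, h, [1.0, 2.0])
Hy = lh * [1.0, 0.0]  # Hessian-vector product, here 2 * e₁
```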

## Index

- `AbstractDifferentiation.FiniteDifferencesBackend`
- `AbstractDifferentiation.ForwardDiffBackend`
- `AbstractDifferentiation.HigherOrderBackend`
- `AbstractDifferentiation.ReverseDiffBackend`
- `AbstractDifferentiation.ReverseRuleConfigBackend`
- `AbstractDifferentiation.TrackerBackend`
- `AbstractDifferentiation.ZygoteBackend`
- `AbstractDifferentiation.derivative`
- `AbstractDifferentiation.gradient`
- `AbstractDifferentiation.hessian`
- `AbstractDifferentiation.jacobian`
- `AbstractDifferentiation.lazy_derivative`
- `AbstractDifferentiation.lazy_gradient`
- `AbstractDifferentiation.lazy_hessian`
- `AbstractDifferentiation.lazy_jacobian`
- `AbstractDifferentiation.pullback_function`
- `AbstractDifferentiation.pushforward_function`
- `AbstractDifferentiation.second_derivative`
- `AbstractDifferentiation.value_and_derivative`
- `AbstractDifferentiation.value_and_gradient`
- `AbstractDifferentiation.value_and_hessian`
- `AbstractDifferentiation.value_and_jacobian`
- `AbstractDifferentiation.value_and_pullback_function`
- `AbstractDifferentiation.value_and_pushforward_function`
- `AbstractDifferentiation.value_and_second_derivative`
- `AbstractDifferentiation.value_derivative_and_second_derivative`
- `AbstractDifferentiation.value_gradient_and_hessian`