
An interface to various automatic differentiation backends in Julia.



Sparsity pattern detector satisfying the detection API of ADTypes.jl.

The nonzeros in a Jacobian or Hessian are detected by computing the relevant matrix with dense AD, and thresholding the entries with a given tolerance (which can be numerically inaccurate).
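The thresholding step described above can be illustrated with a minimal sketch. The helper `threshold_pattern` is hypothetical and not part of DifferentiationInterface, and the non-strict comparison against `atol` is an assumption based on `atol` being described as a minimum magnitude; the actual detector may differ in such details.

```julia
using SparseArrays

# Hypothetical helper (not part of DifferentiationInterface): turn a dense
# matrix into a Boolean sparsity pattern by thresholding its entries.
# Assumption: an entry counts as nonzero when its magnitude is >= atol.
function threshold_pattern(J::AbstractMatrix; atol::Real)
    return sparse(abs.(J) .>= atol)
end

# A dense matrix with one tiny entry that falls below the tolerance.
J = [1.0 1e-9  0.0;
     0.0 2.0  -3.0]

pattern = threshold_pattern(J; atol=1e-5)
```

The entry `1e-9` is dropped from the pattern here, which shows how the choice of tolerance can hide small but genuine nonzeros.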


This detector can be very slow, and should only be used if its output can be exploited multiple times to compute many sparse matrices.


In general, the sparsity pattern you obtain can depend on the provided input x. If you want to reuse the pattern, make sure that it is input-agnostic.


  • backend::AbstractADType is the dense AD backend used under the hood
  • atol::Float64 is the minimum magnitude of a matrix entry to be considered nonzero


DenseSparsityDetector(backend; atol, method=:iterative)

The keyword argument method::Symbol can be either:

  • :iterative: compute the matrix in a sequence of matrix-vector products (memory-efficient)
  • :direct: compute the matrix all at once (memory-hungry but sometimes faster).

Note that the constructor is type-unstable because method ends up being a type parameter of the DenseSparsityDetector object (this is not part of the API and might change).


using ADTypes, DifferentiationInterface, SparseArrays
import ForwardDiff

detector = DenseSparsityDetector(AutoForwardDiff(); atol=1e-5, method=:direct)

ADTypes.jacobian_sparsity(diff, rand(5), detector)

# output

4×5 SparseMatrixCSC{Bool, Int64} with 8 stored entries:
 1  1  ⋅  ⋅  ⋅
 ⋅  1  1  ⋅  ⋅
 ⋅  ⋅  1  1  ⋅
 ⋅  ⋅  ⋅  1  1

Sometimes the sparsity pattern is input-dependent:

ADTypes.jacobian_sparsity(x -> [prod(x)], rand(2), detector)

# output

1×2 SparseMatrixCSC{Bool, Int64} with 2 stored entries:
 1  1

ADTypes.jacobian_sparsity(x -> [prod(x)], [0, 1], detector)

# output

1×2 SparseMatrixCSC{Bool, Int64} with 1 stored entry:
 1  ⋅

Callable function wrapper that enforces differentiation with a specified (inner) backend.

This works by defining new rules overriding the behavior of the outer backend that would normally be used.


This functionality is experimental, and its API cannot yet be considered stable. At the moment, it only supports one-argument functions, and rules are only defined for ChainRules.jl-compatible outer backends.


  • f: the function in question
  • backend::AbstractADType: the inner backend to use for differentiation


DifferentiateWith(f, backend)


using DifferentiationInterface
import ForwardDiff, Zygote

function f(x)
    a = Vector{eltype(x)}(undef, 1)
    a[1] = sum(x)  # mutation that breaks Zygote
    return a[1]
end

dw = DifferentiateWith(f, AutoForwardDiff());

gradient(dw, AutoZygote(), [2.0])  # calls ForwardDiff instead

# output

1-element Vector{Float64}:
 1.0

Functor computing the gradient of f with a fixed backend.


This type is not part of the public API.


Gradient(f, backend, extras=nothing)

If extras is provided, the gradient closure will skip preparation.


using DifferentiationInterface
import Zygote

g = DifferentiationInterface.Gradient(x -> sum(abs2, x), AutoZygote())
g([2.0, 3.0])

# output

2-element Vector{Float64}:
 4.0
 6.0

Combination of two backends for second-order differentiation.


SecondOrder backends do not support first-order operators.


SecondOrder(outer_backend, inner_backend)


  • outer::ADTypes.AbstractADType: backend for the outer differentiation
  • inner::ADTypes.AbstractADType: backend for the inner differentiation
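
As a minimal sketch of how such a combination might be used, the following computes a Hessian with ForwardDiff as both the outer and the inner backend. The choice of backends and the test function are assumptions made for illustration, and the call to `hessian` is assumed to follow the same argument order as `hessian!` without the output buffer.

```julia
using DifferentiationInterface
import ForwardDiff

# Illustrative scalar-valued function; its Hessian is 2I.
f(x) = sum(abs2, x)

# Forward-over-forward: an assumption chosen so that only ForwardDiff is needed.
backend = SecondOrder(AutoForwardDiff(), AutoForwardDiff())

H = hessian(f, backend, [1.0, 2.0])
```

Other pairings (for example a forward-mode outer backend over a reverse-mode inner backend) are constructed the same way, although not every pair of backends is guaranteed to work together.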


Storage for B (co)tangents (NTuple wrapper).

Tangents{B} with B > 1 can be used as seed to trigger batched-mode pushforward, pullback and hvp.


  • d::NTuple{B}

Return the outer mode of the second-order backend.

basis(backend, a::AbstractArray, i::CartesianIndex)

Construct the i-th standard basis array in the vector space of a with element type eltype(a).


If an AD backend benefits from a more specialized basis array implementation, this function can be extended on the backend type.
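
The construction described above can be sketched in plain Julia. The function `standard_basis` is a hypothetical stand-in written for illustration; it ignores the backend argument and is not the package's own implementation.

```julia
# Hypothetical stand-in for a generic basis construction: an array shaped
# like `a`, with element type eltype(a), holding a one at index `i` and
# zeros everywhere else.
function standard_basis(a::AbstractArray{T}, i::CartesianIndex) where {T}
    b = zero(a)
    b[i] = one(T)
    return b
end

a = zeros(2, 3)
b = standard_basis(a, CartesianIndex(1, 2))
```

A backend-specific method could return a more compact array type instead of a dense copy, which is the kind of specialization the note above allows for.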


Check whether backend supports second-order differentiation by trying to compute a Hessian.


Might take a while due to compilation time.

derivative(f,     backend, x, [extras]) -> der
derivative(f!, y, backend, x, [extras]) -> der

Compute the derivative of the function f at point x.

To improve performance via operator preparation, refer to prepare_derivative.
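
A minimal usage sketch, assuming ForwardDiff as the backend and an illustrative function with scalar input and array output. The second call reuses a preparation result at a different point, following the signatures listed above.

```julia
using DifferentiationInterface
import ForwardDiff

# Illustrative function: scalar input, vector output.
f(x) = [sin(x), cos(x)]

backend = AutoForwardDiff()
x = 1.0

# One-off call without preparation.
der = derivative(f, backend, x)

# Prepare once, then pass the extras object as the optional last argument.
extras = prepare_derivative(f, backend, x)
der_prepared = derivative(f, backend, 2.0, extras)
```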

derivative!(f,     der, backend, x, [extras]) -> der
derivative!(f!, y, der, backend, x, [extras]) -> der

Compute the derivative of the function f at point x, overwriting der.

To improve performance via operator preparation, refer to prepare_derivative.

hessian!(f, hess, backend, x, [extras]) -> hess

Compute the Hessian matrix of the function f at point x, overwriting hess.

To improve performance via operator preparation, refer to prepare_hessian.

jacobian(f,     backend, x, [extras]) -> jac
jacobian(f!, y, backend, x, [extras]) -> jac

Compute the Jacobian matrix of the function f at point x.

To improve performance via operator preparation, refer to prepare_jacobian.

jacobian!(f,     jac, backend, x, [extras]) -> jac
jacobian!(f!, y, jac, backend, x, [extras]) -> jac

Compute the Jacobian matrix of the function f at point x, overwriting jac.

To improve performance via operator preparation, refer to prepare_jacobian.
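
The two-argument forms can be sketched as follows, assuming ForwardDiff as the backend. The in-place function `f!` and the buffer name `jac_buffer` are illustrative choices, not names from the package.

```julia
using DifferentiationInterface
import ForwardDiff

# Illustrative in-place function: writes its result into y.
function f!(y, x)
    y[1] = x[1] * x[2]
    y[2] = x[1] + x[2]
    return nothing
end

backend = AutoForwardDiff()
x = [2.0, 3.0]
y = zeros(2)

# Allocating version: returns a new Jacobian matrix.
jac = jacobian(f!, y, backend, x)

# Overwriting version: fills a preallocated buffer.
jac_buffer = zeros(2, 2)
jacobian!(f!, y, jac_buffer, backend, x)
```

For this function the analytic Jacobian at `x` is `[x[2] x[1]; 1 1]`, which gives a simple way to check both results.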

multibasis(backend, a::AbstractArray, inds::AbstractVector{<:CartesianIndex})

Construct the sum of the i-th standard basis arrays in the vector space of a with element type eltype(a), for all i ∈ inds.


If an AD backend benefits from a more specialized basis array implementation, this function can be extended on the backend type.
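
As with the single-index case, the sum of basis arrays can be sketched in plain Julia. The function `standard_multibasis` is a hypothetical stand-in for illustration that ignores the backend argument.

```julia
# Hypothetical stand-in: sum of the standard basis arrays for every index
# in `inds`, in the vector space of `a` with element type eltype(a).
function standard_multibasis(
    a::AbstractArray{T}, inds::AbstractVector{<:CartesianIndex}
) where {T}
    b = zero(a)
    for i in inds
        b[i] += one(T)  # accumulate, so the result is a true sum
    end
    return b
end

a = zeros(2, 3)
inds = [CartesianIndex(1, 1), CartesianIndex(2, 3)]
b = standard_multibasis(a, inds)
```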


Return a possibly modified backend that can work while nested inside another differentiation procedure.

At the moment, this is only useful for Enzyme, which needs autodiff_deferred to be compatible with higher-order differentiation.

pick_batchsize(backend::AbstractADType, dimension::Integer)

Pick a reasonable batch size for batched derivative evaluation with a given total dimension.

Returns 1 for backends which have not overloaded it.

prepare_derivative(f,     backend, x) -> extras
prepare_derivative(f!, y, backend, x) -> extras

Create an extras object that can be given to derivative and its variants.


If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. In the two-argument case, y is mutated by f! during preparation.

prepare_gradient(f, backend, x) -> extras

Create an extras object that can be given to gradient and its variants.


If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again.
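
The intended workflow can be sketched as follows, assuming ForwardDiff as the backend and assuming that `gradient` accepts the extras object as an optional last argument in the same way as the other operators. The function and inputs are illustrative.

```julia
using DifferentiationInterface
import ForwardDiff

f(x) = sum(abs2, x)
backend = AutoForwardDiff()
x = [1.0, 2.0, 3.0]

# Pay the preparation cost once...
extras = prepare_gradient(f, backend, x)

# ...then reuse it for several inputs of the same type and size.
grad1 = gradient(f, backend, x, extras)
grad2 = gradient(f, backend, [4.0, 5.0, 6.0], extras)
```

Reusing `extras` with a different function, or with inputs of a different type or size, is not covered by this sketch and would call for a new preparation.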

prepare_hessian(f, backend, x) -> extras

Create an extras object that can be given to hessian and its variants.


If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again.

prepare_hvp(f, backend, x, dx) -> extras

Create an extras object that can be given to hvp and its variants.


If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again.

prepare_hvp_same_point(f, backend, x, dx) -> extras_same

Create an extras_same object that can be given to hvp and its variants if they are applied at the same point x.


If the function or the point changes in any way, the result of preparation will be invalidated, and you will need to run it again.

prepare_jacobian(f,     backend, x) -> extras
prepare_jacobian(f!, y, backend, x) -> extras

Create an extras object that can be given to jacobian and its variants.


If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. In the two-argument case, y is mutated by f! during preparation.

prepare_pullback(f,     backend, x, dy) -> extras
prepare_pullback(f!, y, backend, x, dy) -> extras

Create an extras object that can be given to pullback and its variants.


If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. In the two-argument case, y is mutated by f! during preparation.

prepare_pullback_same_point(f,     backend, x, dy) -> extras_same
prepare_pullback_same_point(f!, y, backend, x, dy) -> extras_same

Create an extras_same object that can be given to pullback and its variants if they are applied at the same point x.


If the function or the point changes in any way, the result of preparation will be invalidated, and you will need to run it again. In the two-argument case, y is mutated by f! during preparation.

prepare_pushforward(f,     backend, x, dx) -> extras
prepare_pushforward(f!, y, backend, x, dx) -> extras

Create an extras object that can be given to pushforward and its variants.


If the function changes in any way, the result of preparation will be invalidated, and you will need to run it again. In the two-argument case, y is mutated by f! during preparation.

prepare_pushforward_same_point(f,     backend, x, dx) -> extras_same
prepare_pushforward_same_point(f!, y, backend, x, dx) -> extras_same

Create an extras_same object that can be given to pushforward and its variants if they are applied at the same point x.


If the function or the point changes in any way, the result of preparation will be invalidated, and you will need to run it again. In the two-argument case, y is mutated by f! during preparation.

pullback(f,     backend, x, dy, [extras]) -> dx
pullback(f!, y, backend, x, dy, [extras]) -> dx

Compute the pullback of the function f at point x with seed dy.

To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.


Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named vjp.
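
The relation between a pullback and the Jacobian can be illustrated with a short sketch. It assumes Zygote as a reverse-mode backend and passes a plain array as the seed `dy`; the function and the hand-written Jacobian `J` are illustrative.

```julia
using DifferentiationInterface
import Zygote

# Illustrative function from R^2 to R^2.
f(x) = [x[1] * x[2], x[1] + x[2]]

backend = AutoZygote()
x = [2.0, 3.0]
dy = [1.0, 2.0]   # seed in the output space

dx = pullback(f, backend, x, dy)

# Analytic Jacobian at x, written by hand for comparison.
J = [x[2] x[1]; 1.0 1.0]
```

With these values, `dx` should agree with the vector-Jacobian product `J' * dy`.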

pullback!(f,     dx, backend, x, dy, [extras]) -> dx
pullback!(f!, y, dx, backend, x, dy, [extras]) -> dx

Compute the pullback of the function f at point x with seed dy, overwriting dx.

To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.


Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named vjp!.

pushforward(f,     backend, x, dx, [extras]) -> dy
pushforward(f!, y, backend, x, dx, [extras]) -> dy

Compute the pushforward of the function f at point x with seed dx.

To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.


Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named jvp.
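
The relation between a pushforward and the Jacobian can be illustrated with a short sketch. It assumes ForwardDiff as a forward-mode backend and passes a plain array as the seed `dx`; the function and the hand-written Jacobian `J` are illustrative.

```julia
using DifferentiationInterface
import ForwardDiff

# Illustrative function from R^2 to R^2.
f(x) = [x[1] * x[2], x[1] + x[2]]

backend = AutoForwardDiff()
x = [2.0, 3.0]
dx = [1.0, 2.0]   # seed in the input space

dy = pushforward(f, backend, x, dx)

# Analytic Jacobian at x, written by hand for comparison.
J = [x[2] x[1]; 1.0 1.0]
```

With these values, `dy` should agree with the Jacobian-vector product `J * dx`.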

pushforward!(f,     dy, backend, x, dx, [extras]) -> dy
pushforward!(f!, y, dy, backend, x, dx, [extras]) -> dy

Compute the pushforward of the function f at point x with seed dx, overwriting dy.

To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.


Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named jvp!.

value_and_derivative(f,     backend, x, [extras]) -> (y, der)
value_and_derivative(f!, y, backend, x, [extras]) -> (y, der)

Compute the value and the derivative of the function f at point x.

To improve performance via operator preparation, refer to prepare_derivative.

value_and_derivative!(f,     der, backend, x, [extras]) -> (y, der)
value_and_derivative!(f!, y, der, backend, x, [extras]) -> (y, der)

Compute the value and the derivative of the function f at point x, overwriting der.

To improve performance via operator preparation, refer to prepare_derivative.

value_and_jacobian(f,     backend, x, [extras]) -> (y, jac)
value_and_jacobian(f!, y, backend, x, [extras]) -> (y, jac)

Compute the value and the Jacobian matrix of the function f at point x.

To improve performance via operator preparation, refer to prepare_jacobian.

value_and_jacobian!(f,     jac, backend, x, [extras]) -> (y, jac)
value_and_jacobian!(f!, y, jac, backend, x, [extras]) -> (y, jac)

Compute the value and the Jacobian matrix of the function f at point x, overwriting jac.

To improve performance via operator preparation, refer to prepare_jacobian.

value_and_pullback(f,     backend, x, dy, [extras]) -> (y, dx)
value_and_pullback(f!, y, backend, x, dy, [extras]) -> (y, dx)

Compute the value and the pullback of the function f at point x with seed dy.

To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.


Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named value_and_vjp.


Required primitive for reverse mode backends.

value_and_pullback!(f,     dx, backend, x, dy, [extras]) -> (y, dx)
value_and_pullback!(f!, y, dx, backend, x, dy, [extras]) -> (y, dx)

Compute the value and the pullback of the function f at point x with seed dy, overwriting dx.

To improve performance via operator preparation, refer to prepare_pullback and prepare_pullback_same_point.


Pullbacks are also commonly called vector-Jacobian products or VJPs. This function could have been named value_and_vjp!.

value_and_pushforward(f,     backend, x, dx, [extras]) -> (y, dy)
value_and_pushforward(f!, y, backend, x, dx, [extras]) -> (y, dy)

Compute the value and the pushforward of the function f at point x with seed dx.

To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.


Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named value_and_jvp.


Required primitive for forward mode backends.

value_and_pushforward!(f,     dy, backend, x, dx, [extras]) -> (y, dy)
value_and_pushforward!(f!, y, dy, backend, x, dx, [extras]) -> (y, dy)

Compute the value and the pushforward of the function f at point x with seed dx, overwriting dy.

To improve performance via operator preparation, refer to prepare_pushforward and prepare_pushforward_same_point.


Pushforwards are also commonly called Jacobian-vector products or JVPs. This function could have been named value_and_jvp!.