Supported gradient and Hessian methods
PEtab.jl offers several gradient and Hessian methods that can be used when building a PEtabODEProblem with createPEtabODEProblem(). This section gives a brief overview of each method and its adjustable parameters.
Gradient methods
- :ForwardDiff: Uses ForwardDiff to compute the gradient via forward-mode automatic differentiation. You can set the chunk size with the chunkSize argument to improve performance. We plan to add automatic tuning for this in the future.
- :ForwardEquations: Computes the gradient via forward sensitivities. You can choose the method for computing the sensitivities with the sensealg argument. We support both ForwardSensitivity() and ForwardDiffSensitivity(), which have adjustable options provided by SciMLSensitivity (see their documentation). The most efficient option, however, is sensealg=:ForwardDiff, which uses forward-mode automatic differentiation to compute the sensitivities.
- :Adjoint: Computes the gradient via adjoint sensitivity analysis. You can choose between the InterpolatingAdjoint and QuadratureAdjoint methods from SciMLSensitivity (see their documentation) with the sensealg argument, and you can pass any options accepted by these methods.
- :Zygote: Computes the gradient using the Zygote automatic differentiation library. You can choose any of the methods provided by SciMLSensitivity with the sensealg argument.
  - Note: Because the code uses many for-loops, :Zygote is the slowest option and is not recommended.
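To make the idea behind forward sensitivities more concrete, here is a minimal, dependency-free sketch. It is not how PEtab.jl or SciMLSensitivity implement it: the toy model dx/dt = -k*x, the hand-written RK4 integrator, and all names (augmented_rhs, rk4, y_obs) are assumptions chosen purely for illustration. The sensitivity s = dx/dk is integrated together with the state, and the gradient of a squared-error cost then follows from the chain rule.

```julia
# Toy model (hypothetical, not from PEtab.jl): dx/dt = -k*x with x(0) = x0.
# The forward sensitivity s = dx/dk obeys ds/dt = -k*s - x with s(0) = 0.
function augmented_rhs(u, k)
    x, s = u
    return [-k * x, -k * s - x]
end

# Classical fixed-step RK4 integrator for the augmented state u = [x, s].
function rk4(rhs, u0, k, t_end, n_steps)
    h = t_end / n_steps
    u = copy(u0)
    for _ in 1:n_steps
        k1 = rhs(u, k)
        k2 = rhs(u .+ (h / 2) .* k1, k)
        k3 = rhs(u .+ (h / 2) .* k2, k)
        k4 = rhs(u .+ h .* k3, k)
        u = u .+ (h / 6) .* (k1 .+ 2 .* k2 .+ 2 .* k3 .+ k4)
    end
    return u
end

x0, k, t_end = 2.0, 0.7, 1.5
y_obs = 0.9  # made-up measurement at t_end

# Solve state and sensitivity together.
x_end, s_end = rk4(augmented_rhs, [x0, 0.0], k, t_end, 1000)

# Cost C(k) = (x(t_end) - y_obs)^2, so by the chain rule dC/dk = 2*(x - y_obs)*s.
grad = 2 * (x_end - y_obs) * s_end

# Analytic reference for this toy model: x = x0*exp(-k*t), dx/dk = -t*x0*exp(-k*t).
x_exact = x0 * exp(-k * t_end)
grad_exact = 2 * (x_exact - y_obs) * (-t_end * x_exact)
```

The augmented system has one extra set of state equations per parameter, so the cost of this approach grows with the number of parameters.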
Hessian methods
- :ForwardDiff: Computes the Hessian via forward-mode automatic differentiation using ForwardDiff. You can set the chunk size with the chunkSize argument, which can improve performance. We plan to add automatic tuning for this parameter in the future.
- :BlockForwardDiff: Computes a block approximation of the Hessian via forward-mode automatic differentiation using ForwardDiff. For PEtab models, there are typically two sets of parameters to estimate: the parameters that are part of the ODE system, $\theta_p$, and those that are not, $\theta_q$. This method computes the Hessian for each block and assumes that the cross-terms are zero. The resulting block Hessian takes the form:
\[\mathbf{H}_{block} = \begin{bmatrix} \mathbf{H}_{p} & \mathbf{0} \\ \mathbf{0} & \mathbf{H}_{q} \end{bmatrix}\]
- :GaussNewton: Computes a Hessian approximation using the Gauss-Newton method. It often performs better than an (L)-BFGS approximation, but requires access to sensitivities, which may only be feasible to compute for smaller models with 75 or fewer parameters.
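As a rough illustration of how the three Hessian options relate, the sketch below compares a full Hessian, a block approximation, and a Gauss-Newton approximation on a toy least-squares cost. It assumes ForwardDiff.jl is installed and calls ForwardDiff directly rather than going through PEtab.jl. The model q*exp(-p*t), the data, and the cost are made up, with p standing in for an ODE parameter and q for a non-ODE parameter. PEtab.jl's likelihood-based formulation may differ in detail.

```julia
using ForwardDiff
using LinearAlgebra

# Made-up time points and measurements for a toy model y(t) = q * exp(-p * t).
t = [0.5, 1.0, 1.5, 2.0]
data = [1.3, 0.7, 0.5, 0.25]

# θ = [p, q]: p plays the role of an ODE parameter, q of a non-ODE parameter.
residuals(θ) = θ[2] .* exp.(-θ[1] .* t) .- data
cost(θ) = 0.5 * sum(abs2, residuals(θ))

θ = [0.8, 2.0]

# Full Hessian via forward-mode automatic differentiation.
H_full = ForwardDiff.hessian(cost, θ)

# Block approximation: differentiate each parameter set with the other held
# fixed, and set the cross-terms to zero.
H_p = ForwardDiff.hessian(p -> cost([p[1], θ[2]]), θ[1:1])
H_q = ForwardDiff.hessian(q -> cost([θ[1], q[1]]), θ[2:2])
H_block = [H_p zeros(1, 1); zeros(1, 1) H_q]

# Gauss-Newton approximation: J' * J, where J is the Jacobian of the residuals
# (their sensitivities with respect to θ). Second-order residual terms are dropped.
J = ForwardDiff.jacobian(residuals, θ)
H_gn = J' * J
```

In this toy case the block approximation reproduces the diagonal blocks of the full Hessian exactly and only loses the cross-term, while the Gauss-Newton matrix is positive semidefinite by construction and needs only first-order sensitivities.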