Conjugate updates

In an MCMC setting it is sometimes possible to employ certain types of efficient updates called "conjugate updates". They can be employed if the likelihood and the prior fit together like a glove, allowing to pin down the posterior density by identifying it with one of the standard distributions. For the parameters appearing in the diffusion's drift such prior–posterior pair can sometimes be taken to be Gaussian–Gaussian.

This package does not deal with MCMC sampling for diffusions; however, it defines macros that automatically build functions that are required for conjugate updates in DiffusionMCMC.jl. Below, we explain only how the macros introduced in DiffusionDefinition.jl work and give no further details about conjugate updates. To see more details about on this topic see [...].

Conjugate Gausian updates

To perform conjugate Gaussian updates in DiffusionMCMC.jl the following functions need to be defines for your diffusion law:

phi(::Val{T}, t, x, P) where T

If the drift can be written in a form:

\[b_θ(t, x) = θ^Tφ(t,x)+ϕ(t,x)\]

for parameters $θ$ that are being updated, then phi is a row of the matrix $φ(t,x)$ that get's multiplied by the coordinate of a θ vector with a name T.

num_non_hypo(Ptype::Type{T}) where T <: DiffusionProcess

Return a total number of coordinates of the diffusion process that have non-degenerate noise structure. I.e. it is equal to the total number of coordinates of the process minus the number of coordinates that have no direct Wiener contribution.

hypo_a_inv(t, x, P)

Similar to $a^{-1}:=(σσ^T)^{-1}$, with the only exception that all of the zero rows of $σ$ are first removed to yield $\hat{σ}$, and then, $(\hat{σ} \hat{σ}^T)^{-1}$ is computed.

nonhypo(x, P)

Return the diffusion's coordinates that have direct contribution from some non-degenerate Wiener terms, i.e. leave out coordinates whose Wiener term is zero.

ignore_for_cu(::Val{T}, P) where T

A flag to indicate that a parameter with name T has no contribution to the computation of conjugate updates.

In this package we implement a macro that makes this definition process more convenient

@conjugate_gaussian DIFFUSION_NAME begin

Define helper functions for the conjugate gaussian updates of diffusion DIFFUSION_NAME. In BODY information can be passed as a list with elements of the form:

:entry-name --> entry-value

The following entry names have special meaning: :inplace, :nonhypo, :hypo_a_inv. Any other name is interpreted as a parameter name for which function phi is defined. More precisely, the following can be defined:


:inplace --> true

to indicate that compmutations are to be done in-place. Otherwise, by default they are assumed to be done out-of-place.


:nonhypo --> list-of-indices-with-non-degenerate-noise


:nonhypo --> [1,3,5]

to indicate which coordinates have non-degenerate (i.e. non-zero) Wiener term Wiener term. It automatically defines num_non_hypo for the user.


If no flag :inplace --> true is passed, then it will be assumed that computations are done out-of-place by default and then no matter what subset you passed it will be internally saved as an SVector so as to work as a static way of accessing a subset of parameters. Otherwise, no change will be made to list-of-indices-with-non-degenerate-noise.


:hypo_a_inv --> f(t,x,P)

where f is a function that uses time variable t, state variable x and a struct with the target diffusion law P.

phi and ignore_for_cu

:name-of-parameter --> (φ₁(t,x,P), φ₂(t,x,P), ...)

where φ₁(t,x,P) corresponds to all terms from the drift that are in the first non-degenerate coordinate and that are multiplied by the parameter with the name name-of-parameter, etc. Any parameters of P for which φ is not defined will have a function ignore_for_cu defined for them automatically and returning true.


For instance, for the FitzHugh–Nagumo model in the conjugate-parameterization form from here, we can define functions for the conjugate updates as follows:

@conjugate_gaussian FitzHughNagumoConjug begin
    :intercept --> (-x[2],)
    :ϵ --> (x[1]-x[1]^3+(1-3*x[1]^2)*x[2],)
    :s --> (one(x[1]),)
    :γ --> (-x[1],)
    :β --> (-one(x[1]),)
    :hypo_a_inv --> 1.0/P.σ^2
    :nonhypo --> 2:2