AbstractMCMC.jl
Abstract types and interfaces for Markov chain Monte Carlo methods.
Overview
AbstractMCMC defines an interface for sampling and combining Markov chains. It comes with a default sampling algorithm that provides support of progress bars, parallel sampling (multithreaded and multicore), and user-provided callbacks out of the box. Typically developers only have to define the sampling step of their inference method in an iterator-like fashion to make use of this functionality. Additionally, the package defines an iterator and a transducer for sampling Markov chains based on the interface.
User-facing API
The user-facing sampling API consists of
StatsBase.sample(
[rng::Random.AbstractRNG,]
model::AbstractMCMC.AbstractModel,
sampler::AbstractMCMC.AbstractSampler,
nsamples[;
kwargs...]
)
and
StatsBase.sample(
[rng::Random.AbstractRNG,]
model::AbstractMCMC.AbstractModel,
sampler::AbstractMCMC.AbstractSampler,
parallel::AbstractMCMC.AbstractMCMCParallel,
nsamples::Integer,
nchains::Integer[;
kwargs...]
)
for regular and parallel sampling, respectively. In regular sampling, users may provide a function
isdone(rng, model, sampler, samples, iteration; kwargs...)
that returns true
when sampling should end, and false
otherwise, instead of
a fixed number of samples nsamples
. AbstractMCMC defines the abstract types
AbstractMCMC.AbstractModel
, AbstractMCMC.AbstractSampler
, and
AbstractMCMC.AbstractMCMCParallel
for models, samplers, and parallel sampling
algorithms, respectively. Two algorithms MCMCThreads
and MCMCDistributed
are provided for parallel sampling with multiple threads and multiple processes,
respectively.
The function
AbstractMCMC.steps([rng::AbstractRNG, ]model::AbstractModel, sampler::AbstractSampler[; kwargs...])
returns an iterator that returns samples continuously, without a predefined stopping condition. Similarly,
AbstractMCMC.Sample([rng::Random.AbstractRNG, ]model::AbstractModel, sampler::AbstractSampler[; kwargs...])
returns a transducer that returns samples continuously.
Common keyword arguments for regular and parallel sampling (not supported by the iterator and transducer) are:
progress
(default:AbstractMCMC.PROGRESS[]
which istrue
initially): toggles progress loggingchain_type
(default:Any
): determines the type of the returned chaincallback
(default:nothing
): ifcallback !== nothing
, thencallback(rng, model, sampler, sample, iteration)
is called after every sampling step, wheresample
is the most recent sample of the Markov chain anditeration
is the current iterationdiscard_initial
(default:0
): number of initial samples that are discardedthinning
(default:1
): factor by which to thin samples.
Progress logging can be enabled and disabled globally with AbstractMCMC.setprogress!(progress)
.
Additionally, AbstractMCMC defines the abstract type AbstractChains
for Markov chains and the
method AbstractMCMC.chainscat(::AbstractChains...)
for concatenating multiple chains.
(defaults to cat(::AbstractChains...; dims = 3)
).
Note that AbstractMCMC exports only MCMCThreads
and MCMCDistributed
(and in
particular not StatsBase.sample
).
Developer documentation: Default implementation
AbstractMCMC provides a default implementation of the user-facing interface described above. You can completely neglect these and define your own implementation of the interface. However, as described below, in most use cases the default implementation allows you to obtain support of parallel sampling, progress logging, callbacks, iterators, and transducers for free by just defining the sampling step of your inference algorithm, drastically reducing the amount of code you have to write. In general, the docstrings of the functions described below might be helpful if you intend to make use of the default implementations.
Basic structure
The simplified structure for regular sampling (the actual implementation contains some additional error checks and support for progress logging and callbacks) is
StatsBase.sample(
rng::Random.AbstractRNG,
model::AbstractMCMC.AbstractModel,
sampler::AbstractMCMC.AbstractSampler,
nsamples::Integer;
chain_type = ::Type{Any},
kwargs...
)
# Obtain the initial sample and state.
sample, state = AbstractMCMC.step(rng, model, sampler; kwargs...)
# Save the sample.
samples = AbstractMCMC.samples(sample, model, sampler, N; kwargs...)
samples = AbstractMCMC.save!!(samples, sample, 1, model, sampler, N; kwargs...)
# Step through the sampler.
for i in 2:N
# Obtain the next sample and state.
sample, state = AbstractMCMC.step(rng, model, sampler, state; kwargs...)
# Save the sample.
samples = AbstractMCMC.save!!(samples, sample, i, model, sampler, N; kwargs...)
end
return AbstractMCMC.bundle_samples(samples, model, sampler, state, chain_type; kwargs...)
end
All other default implementations make use of the same structure and in particular call the same methods.
Sampling step
The only method for which no default implementation is provided (and hence which
downstream packages have to implement) is AbstractMCMC.step
that defines the sampling step of the inference method. In the initial step it is
called as
AbstractMCMC.step(rng, model, sampler; kwargs...)
whereas in all subsequent steps it is called as
AbstractMCMC.step(rng, model, sampler, state; kwargs...)
where state
denotes the current state of the sampling algorithm. It should return
a 2-tuple consisting of the next sample and the updated state of the sampling algorithm.
Hence AbstractMCMC.step
can be viewed as an extended version of
Base.iterate
with additional positional and keyword arguments.
Collecting samples (does not apply to the iterator and transducer)
After the initial sample is obtained, the default implementations for regular and parallel sampling
(not for the iterator and the transducer since it is not needed there) create a container for all
samples (the initial one and all subsequent samples) using AbstractMCMC.samples
. By default,
AbstractMCMC.samples
just returns a concretely typed Vector
with the initial sample as single
entry. If the total number of samples is fixed, we use sizehint!
to suggest that the container
reserves capacity for all samples to improve performance.
In each step, the sample is saved in the container by AbstractMCMC.save!!
. The notation !!
follows the convention of the package BangBang.jl
which is used in the default implementation of AbstractMCMC.save!!
. It indicates that the
sample is pushed to the container but a "widening" fallback is used if the container type
does not allow to save the sample. Therefore AbstractMCMC.save!!
always has to return the container.
For most use cases the default implementation of AbstractMCMC.samples
and AbstractMCMC.save!!
should work out of the box and hence need not to be overloaded in downstream code. Please have
a look at the docstrings of AbstractMCMC.samples
and AbstractMCMC.save!!
if you intend
to overload these functions.
Creating chains (does not apply to the iterator and transducer)
At the end of the sampling procedure for regular and paralle sampling (not for the iterator and the transducer) we transform the collection of samples to the desired output type by calling
AbstractMCMC.bundle_samples(samples, model, sampler, state, chain_type; kwargs...)
where samples
is the collection of samples, state
is the final state of the sampler,
and chain_type
is the desired return type. The default implementation in AbstractMCMC
just returns the collection samples
.
The default implementation should be fine in most use cases, but downstream packages
could, e.g., save the final state of the sampler as well if they overload AbstractMCMC.bundle_samples
.