MCMCDiagnosticTools

Effective sample size and potential scale reduction

The effective sample size (ESS) and the potential scale reduction can be estimated with ess_rhat.

MCMCDiagnosticTools.ess_rhatFunction
ess_rhat(
    samples::AbstractArray{<:Union{Missing,Real},3}; method=ESSMethod(), maxlag=250
)

Estimate the effective sample size and the potential scale reduction of the samples of shape (draws, parameters, chains) with the method and a maximum lag of maxlag.
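
For illustration, here is a minimal sketch with synthetic draws; it assumes only the call signature above, with ess_rhat returning per-parameter ESS and $\widehat{R}$ estimates:

julia> using MCMCDiagnosticTools

julia> samples = randn(250, 2, 4);  # shape (draws, parameters, chains)

julia> ess, rhat = ess_rhat(samples);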

See also: ESSMethod, FFTESSMethod, BDAESSMethod

The following methods are supported:

MCMCDiagnosticTools.ESSMethodType
ESSMethod <: AbstractESSMethod

The ESSMethod uses a standard algorithm for estimating the effective sample size of MCMC chains.

It is based on the discussion by Vehtari et al. and uses the biased estimator of the autocovariance, as discussed by Geyer. In contrast to Geyer, the divisor n - 1 is used in the estimation of the autocovariance to obtain the unbiased estimator of the variance for lag 0.
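
For instance, the method can be passed explicitly (a sketch; samples as in the ess_rhat example above):

julia> ess, rhat = ess_rhat(samples; method=ESSMethod());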

References

Geyer, C. J. (1992). Practical Markov Chain Monte Carlo. Statistical Science, 473-483.

Vehtari, A., Gelman, A., Simpson, D., Carpenter, B., & Bürkner, P. C. (2021). Rank-normalization, folding, and localization: An improved $\widehat{R}$ for assessing convergence of MCMC. Bayesian Analysis, 16(2), 667-718.

MCMCDiagnosticTools.FFTESSMethodType
FFTESSMethod <: AbstractESSMethod

The FFTESSMethod uses a standard algorithm for estimating the effective sample size of MCMC chains.

The algorithm is the same as the one of ESSMethod but this method uses fast Fourier transforms (FFTs) for estimating the autocorrelation.

Info

To be able to use this method, you have to load a package that implements the AbstractFFTs.jl interface such as FFTW.jl or FastTransforms.jl.
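
For instance, a sketch of such a setup, assuming FFTW.jl is installed:

julia> using FFTW  # provides an implementation of the AbstractFFTs interface

julia> ess, rhat = ess_rhat(samples; method=FFTESSMethod());  # samples as above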

MCMCDiagnosticTools.BDAESSMethodType
BDAESSMethod <: AbstractESSMethod

The BDAESSMethod uses a standard algorithm for estimating the effective sample size of MCMC chains.

It is based on the discussion by Vehtari et al. and uses the variogram estimator of the autocorrelation function discussed by Gelman et al.
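
Usage is analogous to the other methods (a sketch; samples as above):

julia> ess, rhat = ess_rhat(samples; method=BDAESSMethod());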

References

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). Bayesian data analysis. CRC press.

Vehtari, A., Gelman, A., Simpson, D., Carpenter, B., & Bürkner, P. C. (2021). Rank-normalization, folding, and localization: An improved $\widehat{R}$ for assessing convergence of MCMC. Bayesian Analysis, 16(2), 667-718.

Monte Carlo standard error

MCMCDiagnosticTools.mcseFunction
mcse(x::AbstractVector{<:Real}; method::Symbol=:imse, kwargs...)

Compute the Monte Carlo standard error (MCSE) of samples x. The optional argument method describes how the errors are estimated. Possible options are:

  • :bm for batch means (Glynn & Whitt, 1991)
  • :imse for the initial monotone sequence estimator (Geyer, 1992)
  • :ipse for the initial positive sequence estimator (Geyer, 1992)
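
For illustration, a minimal sketch with synthetic draws; each method returns a scalar MCSE estimate for x:

julia> using MCMCDiagnosticTools

julia> x = randn(1_000);

julia> se = mcse(x);  # default: initial monotone sequence estimator

julia> se_bm = mcse(x; method=:bm);  # batch means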

R⋆ diagnostic

MCMCDiagnosticTools.rstarFunction
rstar(
    rng=Random.GLOBAL_RNG,
    classifier,
    samples::AbstractMatrix,
    chain_indices::AbstractVector{Int};
    subset::Real=0.8,
    verbosity::Int=0,
)

Compute the $R^*$ convergence statistic of the samples with shape (draws, parameters) and corresponding chains chain_indices with the classifier.

This implementation is an adaptation of algorithms 1 and 2 described by Lambert and Vehtari.

The classifier has to be a supervised classifier of the MLJ framework (see the MLJ documentation for a list of supported models). It is trained with a subset of the samples. The training of the classifier can be inspected by adjusting the verbosity level.

If the classifier is deterministic, i.e., if it predicts a class, the value of the $R^*$ statistic is returned (algorithm 1). If the classifier is probabilistic, i.e., if it outputs probabilities of classes, the scaled Poisson-binomial distribution of the $R^*$ statistic is returned (algorithm 2).

Note

The correctness of the statistic depends on the convergence of the classifier used internally.

Examples

julia> using MLJBase, MLJXGBoostInterface, Statistics

julia> samples = fill(4.0, 300, 2);

julia> chain_indices = repeat(1:3; outer=100);

One can compute the distribution of the $R^*$ statistic (algorithm 2) with the probabilistic classifier.

julia> distribution = rstar(XGBoostClassifier(), samples, chain_indices);

julia> isapprox(mean(distribution), 1; atol=0.1)
true

For deterministic classifiers, a single $R^*$ statistic (algorithm 1) is returned. Deterministic classifiers can also be derived from probabilistic classifiers by e.g. predicting the mode. In MLJ this corresponds to a pipeline of models.

julia> @pipeline XGBoostClassifier name = XGBoostDeterministic operation = predict_mode;

julia> value = rstar(XGBoostDeterministic(), samples, chain_indices);

julia> isapprox(value, 1; atol=0.1)
true

References

Lambert, B., & Vehtari, A. (2020). $R^*$: A robust MCMC convergence diagnostic with uncertainty using decision tree classifiers.

Other diagnostics

MCMCDiagnosticTools.discretediagFunction
discretediag(chains::AbstractArray{<:Real,3}; frac=0.3, method=:weiss, nsim=1_000)

Compute a discrete diagnostic, where method can be one of :weiss, :hangartner, :DARBOOT, :MCBOOT, :billingsley, and :billingsleyBOOT.
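
For illustration, a minimal sketch with synthetic discrete draws; the array shape is assumed to match the other diagnostics, i.e. (draws, parameters, chains):

julia> chains = rand(1:4, 100, 2, 3);

julia> result = discretediag(chains; frac=0.3, method=:weiss);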

MCMCDiagnosticTools.gelmandiagFunction
gelmandiag(chains::AbstractArray{<:Real,3}; alpha::Real=0.95)

Compute the Gelman, Rubin and Brooks diagnostics.
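
For example, a sketch with synthetic chains (shape assumed to be (draws, parameters, chains), as for ess_rhat):

julia> chains = randn(1_000, 2, 4);

julia> result = gelmandiag(chains);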

MCMCDiagnosticTools.gewekediagFunction
gewekediag(x::AbstractVector{<:Real}; first::Real=0.1, last::Real=0.5, kwargs...)

Compute the Geweke diagnostic from the first and last proportion of samples x.
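
For illustration, a minimal sketch with synthetic draws; with the defaults the diagnostic compares the first 10% of x against the last 50%:

julia> x = randn(1_000);

julia> result = gewekediag(x);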

MCMCDiagnosticTools.heideldiagFunction
heideldiag(
    x::AbstractVector{<:Real}; alpha::Real=0.05, eps::Real=0.1, start::Int=1, kwargs...
)

Compute the Heidelberger and Welch diagnostic.
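
A minimal usage sketch with synthetic draws:

julia> x = randn(1_000);

julia> result = heideldiag(x; alpha=0.05);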

References

Glynn, P. W., & Whitt, W. (1991). Estimating the asymptotic variance with batch means. Operations Research Letters, 10(8), 431-435.

Geyer, C. J. (1992). Practical Markov Chain Monte Carlo. Statistical Science, 473-483.