ParetoSmooth

Documentation for ParetoSmooth.

ParetoSmooth.Psis
ParetoSmooth._assume_one_chain
ParetoSmooth._check_input_validity_psis
ParetoSmooth._check_tail
ParetoSmooth._convert_to_array
ParetoSmooth._generate_r_eff
ParetoSmooth.def_tail_length
ParetoSmooth.do_psis_i!
ParetoSmooth.gpd_quantile
ParetoSmooth.gpdfit
ParetoSmooth.loo
ParetoSmooth.psis
ParetoSmooth.psis_ess
ParetoSmooth.psis_loo
ParetoSmooth.psis_smooth_tail!
ParetoSmooth.relative_eff

ParetoSmooth.Psis — Type

Psis{V<:AbstractVector{F},I<:Integer} where {F<:AbstractFloat}

A struct containing the results of Pareto-smoothed importance sampling.

Fields

weights: A vector of smoothed, truncated, and normalized importance sampling weights.
pareto_k: Estimates of the shape parameter $k$ of the generalized Pareto distribution.
ess: Estimated effective sample size for each LOO evaluation.
tail_len: Vector of tail lengths used for smoothing the generalized Pareto distribution.
dims: Named tuple of length 2 containing s (posterior sample size) and n (number of

observations).

ParetoSmooth._assume_one_chain — Method

Assume that all objects belong to a single chain if chain index is missing. Inform user.

ParetoSmooth._check_input_validity_psis — Method

Make sure all inputs to psis are valid.

ParetoSmooth._check_tail — Method

Check the tail to make sure a GPD fit is possible.

ParetoSmooth._convert_to_array — Method

Convert a matrix+chain_index representation to a 3d array representation to pass it off to the method for arrays.

ParetoSmooth._generate_r_eff — Method

Generate the relative effective sample size if not provided by the user.

ParetoSmooth.def_tail_length — Method

def_tail_length(log_ratios::AbstractVector, r_eff::AbstractFloat) -> tail_len::Integer

Define the tail length as in Vehtari et al. (2019).

ParetoSmooth.do_psis_i! — Method

do_psis_i!(is_ratios::AbstractVector{AbstractFloat}, tail_length::Integer) -> T

Do PSIS on a single vector, smoothing its tail values.

Arguments

is_ratios::AbstractVector{AbstractFloat}: A vector of importance sampling ratios,

scaled to have a maximum of 1.

Returns

T<:AbstractFloat: ξ, the shape parameter for the GPD; big numbers indicate thick tails.

Extended help

Additional information can be found in the LOO package from R.

ParetoSmooth.gpd_quantile — Method

gpd_quantile(p::T, k::T, sigma::T) where {T<:AbstractFloat} -> T

Compute the p quantile of the Generalized Pareto Distribution (GPD).

Arguments

p: A scalar between 0 and 1.
ξ: A scalar shape parameter.
σ: A scalar scale parameter.

Returns

A quantile of the Generalized Pareto Distribution.

ParetoSmooth.gpdfit — Method

gpdfit(
    sample::AbstractVector{T<:AbstractFloat}; 
    wip::Bool=true, 
    min_grid_pts::Integer=30, 
    sort_sample::Bool=false
) -> (ξ::T, σ::T)

Return a named list of estimates for the parameters ξ (shape) and σ (scale) of the generalized Pareto distribution (GPD), assuming the location parameter is 0.

Arguments

sample::AbstractVector: A numeric vector. The sample from which to estimate the parameters.
wip::Bool = true: Logical indicating whether to adjust ξ based on a weakly informative Gaussian prior centered on 0.5. Defaults to true.
min_grid_pts::Integer = 30: The minimum number of grid points used in the fitting algorithm. The actual number used is min_grid_pts + ⌊sqrt(length(sample))⌋.

Note

Estimation method taken from Zhang, J. and Stephens, M.A. (2009). The parameter ξ is the negative of $k$.

ParetoSmooth.loo — Method

function loo(args...; method=PsisLooMethod(), kwargs...) -> PsisLoo

Compute the approximate leave-one-out cross-validation score using the specified method.

Currently, this function only serves to call psis_loo, but this could change in the future. The default methods or return type may change without warning; thus, we recommend using psis_loo instead if reproducibility is required.

See also: psis_loo, PsisLoo.

ParetoSmooth.psis — Method

psis(
    log_ratios::AbstractArray{T:>AbstractFloat}, 
    r_eff; 
    source::String="mcmc", 
    log_weights::Bool=false
) -> Psis

Implements Pareto-smoothed importance sampling (PSIS).

Arguments

Positional Arguments

log_ratios::AbstractArray{T}: An array of importance ratios on the log scale (for

PSIS-LOO these are negative log-likelihood values). Indices must be ordered as [data, draw, chain]: log_ratios[1, 2, 3] should be the log-likelihood of the first data point, evaluated at the second iteration of the third chain. Chain indices can be left off if there is only a single chain, or if keyword argument chain_index is provided.

r_eff::AbstractArray{T}: An (optional) vector of relative effective sample sizes used

in ESS calculations. If left empty, calculated automatically using the FFTESS method from InferenceDiagnostics.jl. See relative_eff to calculate these values.

Keyword Arguments

chain_index::Vector{Integer}: An (optional) vector of integers indicating which chain

each sample belongs to.

source::String="mcmc": A string or symbol describing the source of the sample being

used. If "mcmc", adjusts ESS for autocorrelation. Otherwise, samples are assumed to be independent. Currently permitted values are ["mcmc", "vi", "other"].

log_weights::Bool=false: Return log weights, rather than the PSIS weights.

ParetoSmooth.psis_ess — Method

function psis_ess(
    weights::AbstractVector{T},
    r_eff::AbstractVector{T}
) -> AbstractVector{T}

Calculate the (approximate) effective sample size of a PSIS sample, using the correction in Vehtari et al. 2019.

Arguments

weights: A set of importance sampling weights derived from PSIS.
r_eff: The relative efficiency of the MCMC chains from which PSIS samples were derived. See ?relative_eff to calculate r_eff.

ParetoSmooth.psis_loo — Method

function psis_loo(log_likelihood::Array{Float} [, args...];
    source::String="mcmc" [, chain_index::Vector{Int}, kwargs...]
) -> PsisLoo

Use Pareto-Smoothed Importance Sampling to calculate the leave-one-out cross validation score.

Arguments

log_likelihood::Array: An array or matrix of log-likelihood values indexed as [data, step, chain]. The chain argument can be left off if chain_index is provided or if all posterior samples were drawn from a single chain.
args...: Positional arguments to be passed to psis.
chain_index::Vector: A vector of integers specifying which chain each iteration belongs to. For instance, chain_index[iteration] should return 2 if log_likelihood[:, step] belongs to the second chain.
kwargs...: Keyword arguments to be passed to psis.