The ExponentialFamilyDistribution Interface

This page describes the philosophy and design concepts behind the ExponentialFamilyDistribution interface. In a nutshell, the primary purpose of the ExponentialFamily package is to provide a generic interface for an ExponentialFamilyDistribution. It is beneficial to become familiar with the Wikipedia article on the exponential family before delving into the implementation details of this package.

Notation

In the context of the package, exponential family distributions are represented in the form:

\[f_X(x\mid\eta) = h(x) \cdot \exp\left[ \eta \cdot T(x) - A(\eta) \right]\]

Here:

  • h(x) is the base measure.
  • T(x) represents sufficient statistics.
  • A(η) stands for the log partition.
  • η denotes the natural parameters.

The discussion below also uses the following convention:

  • η corresponds to the distribution's natural parameters in the natural parameter space.
  • θ corresponds to the distribution's mean parameters in the mean parameter space.
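
The notation above can be made concrete with a small, package-independent sketch. It writes the Bernoulli distribution in exponential-family form by hand; all function names below are illustrative and are not part of the ExponentialFamily API.

```julia
# Hand-written exponential-family form of a Bernoulli(θ) distribution.
# Every name here is illustrative; none of them belongs to the package.
h(x) = 1.0              # base measure
T(x) = x                # sufficient statistic
A(η) = log1p(exp(η))    # log partition, log(1 + exp(η))

# Mean parameter θ (success probability) -> natural parameter η (log-odds)
mean_to_natural(θ) = log(θ / (1 - θ))

# log f(x | η) = log h(x) + η * T(x) - A(η)
ef_logpdf(η, x) = log(h(x)) + η * T(x) - A(η)

θ = 0.25
η = mean_to_natural(θ)

# The hand-written form agrees with the direct definitions log(θ) and log(1 - θ)
@assert isapprox(ef_logpdf(η, 1), log(θ))
@assert isapprox(ef_logpdf(η, 0), log(1 - θ))
```

The log partition A(η) is what makes the two probabilities sum to one, which is the normalization role described above.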

ExponentialFamilyDistribution structure

ExponentialFamily.ExponentialFamilyDistribution - Type
ExponentialFamilyDistribution(::Type{T}, naturalparameters, conditioner, attributes)

The ExponentialFamilyDistribution structure represents a generic exponential family distribution in natural parameterization. The type T can be either a distribution type (e.g. from the Distributions.jl package) or a variate type (e.g. Univariate). In the context of the package, exponential family distributions are represented in the form:

\[p_X(x\mid\eta) = h(x) \cdot \exp\left[ \eta \cdot T(x) - A(\eta) \right]\]

Here:

  • h(x) is the base measure.
  • T(x) represents sufficient statistics.
  • A(η) stands for the log partition.
  • η denotes the natural parameters.

For a given member of the exponential family:

  • getattributes returns either nothing or ExponentialFamilyDistributionAttributes.
  • getbasemeasure returns a positive-valued function.
  • getsufficientstatistics returns an iterable of functions such as [x, x^2] or [x, log(x)].
  • getnaturalparameters returns an iterable holding the values of the natural parameters.
  • getlogpartition returns a function that depends on the natural parameters and ensures that the distribution is normalized to 1.
  • getsupport returns the set that the distribution is defined over, e.g. the real numbers, the positive integers, or a 3D cube. Use either the ∈ operator or the insupport() function to check if a value belongs to the support.

Note

The attributes can be nothing, in which case the package will try to derive the corresponding attributes from the type T.

julia> ef = convert(ExponentialFamilyDistribution, Bernoulli(0.5))
ExponentialFamily(Bernoulli)

julia> getsufficientstatistics(ef)
(identity,)

julia> ef = convert(ExponentialFamilyDistribution, Laplace(1.0, 0.5))
ExponentialFamily(Laplace, conditioned on 1.0)

julia> logpdf(ef, 4.0)
-6.0

See also: getbasemeasure, getsufficientstatistics, getnaturalparameters, getlogpartition, getsupport

ExponentialFamily.ExponentialFamilyDistributionAttributes - Type
ExponentialFamilyDistributionAttributes(basemeasure, sufficientstatistics, logpartition, support)

A structure to represent the attributes of an exponential family member.

Fields

  • basemeasure::B: The base measure of the exponential family member.
  • sufficientstatistics::S: The sufficient statistics of the exponential family member.
  • logpartition::L: The log-partition (cumulant) of the exponential family member.
  • support::P: The support of the exponential family member.

See also: ExponentialFamilyDistribution, getbasemeasure, getsufficientstatistics, getlogpartition, getsupport

Distributions.logpdf - Method
logpdf(ef::ExponentialFamilyDistribution, x)

Evaluates and returns the log-density of the exponential family distribution for the input x.

Distributions.pdf - Method
pdf(ef::ExponentialFamilyDistribution, x)

Evaluates and returns the probability density function of the exponential family distribution for the input x.

Distributions.cdf - Method
cdf(ef::ExponentialFamilyDistribution{D}, x) where { D <: Distribution }

Evaluates and returns the cumulative distribution function of the exponential family distribution for the input x.

ExponentialFamily.getconditioner - Function
getconditioner(::ExponentialFamilyDistribution)

Returns either the conditioner of the exponential family distribution or nothing. The conditioner is a fixed parameter that is used to ensure that the distribution belongs to the exponential family.

ExponentialFamily.isproper - Function
isproper(::ExponentialFamilyDistribution)

Checks if the object of type ExponentialFamilyDistribution is a proper distribution.

isproper([ space = NaturalParametersSpace() ], ::Type{T}, parameters, conditioner = nothing) where { T <: Distribution }

A specific version of isproper defined for distribution types from the Distributions.jl package. It does not require an instance of ExponentialFamilyDistribution and can be called directly with a specific distribution type instead. Optionally, it accepts the space parameter, which defines the parameters space. Conditional exponential family distributions require an extra conditioner argument.

See also: NaturalParametersSpace, MeanParametersSpace

ExponentialFamily.getbasemeasure - Function
getbasemeasure(::ExponentialFamilyDistribution)
getbasemeasure(::Type{ <: Distribution }, [ conditioner ])

Returns the base measure function of the exponential family distribution.

ExponentialFamily.getsufficientstatistics - Function
getsufficientstatistics(::ExponentialFamilyDistribution)
getsufficientstatistics(::Type{ <: Distribution }, [ conditioner ])

Returns the list of sufficient statistics of the exponential family distribution.

ExponentialFamily.getlogpartition - Function
getlogpartition(::ExponentialFamilyDistribution)
getlogpartition([ space ], ::Type{ <: Distribution }, [ conditioner ])

Returns the log partition function of the exponential family distribution.

ExponentialFamily.getgradlogpartition - Function
getgradlogpartition([ space = NaturalParametersSpace() ], ::Type{T}, [ conditioner ]) where { T <: Distribution }

A specific version of getgradlogpartition defined for distribution types from the Distributions.jl package. It does not require an instance of ExponentialFamilyDistribution and can be called directly with a specific distribution type instead. Optionally, it accepts the space parameter, which defines the parameters space. Conditional exponential family distributions require an extra conditioner argument.

ExponentialFamily.getfisherinformation - Function
getfisherinformation(::ExponentialFamilyDistribution)
getfisherinformation([ space ], ::Type{ <: Distribution }, [ conditioner ])

Returns the function that computes the Fisher information matrix of the exponential family distribution.
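
As a reminder of what this function computes, here is a standard identity worked out for the Bernoulli case. It is a textbook derivation and is not taken from the package source. In the natural parameters space the Fisher information is the Hessian of the log partition:

\[\mathcal{I}(\eta) = \nabla^2_\eta A(\eta), \qquad A(\eta) = \log\left(1 + e^{\eta}\right) \;\Rightarrow\; \mathcal{I}(\eta) = \sigma(\eta)\left(1 - \sigma(\eta)\right), \quad \sigma(\eta) = \frac{1}{1 + e^{-\eta}}\]

In the mean parameters space, with \(\theta = \sigma(\eta)\), the same quantity changes by the squared Jacobian of the mapping:

\[\mathcal{I}(\theta) = \left(\frac{d\eta}{d\theta}\right)^2 \mathcal{I}(\eta) = \frac{1}{\theta^2(1-\theta)^2}\cdot\theta(1-\theta) = \frac{1}{\theta(1-\theta)}\]

For θ = 0.25 this gives 0.1875 in the natural space and 16/3 ≈ 5.33 in the mean space, which illustrates why the space argument matters.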

ExponentialFamily.sufficientstatistics - Function
sufficientstatistics(::ExponentialFamilyDistribution, x)

Returns the computed values of the sufficient statistics of the exponential family distribution at the point x.

ExponentialFamily.logpartition - Function
logpartition(::ExponentialFamilyDistribution, η)

Returns the computed value of logpartition of the exponential family distribution at the point η. By default, η = getnaturalparameters(ef).

See also: getlogpartition

ExponentialFamily.gradlogpartition - Function
gradlogpartition(::ExponentialFamilyDistribution, η)

Returns the computed value of gradlogpartition of the exponential family distribution at the point η. By default, η = getnaturalparameters(ef).

See also: getgradlogpartition
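
The gradient of the log partition has a useful interpretation: it equals the expected value of the sufficient statistics. The sketch below checks this numerically for a Bernoulli distribution using hand-written functions only. The names `A` and `gradA` are illustrative and do not call into the package.

```julia
# Bernoulli log partition and its analytic gradient (the logistic function).
A(η) = log1p(exp(η))
gradA(η) = 1 / (1 + exp(-η))

η = log(0.25 / 0.75)   # natural parameter of Bernoulli(0.25)
step = 1e-6

# A central finite difference approximates dA/dη
finite_difference = (A(η + step) - A(η - step)) / (2 * step)

# The gradient matches the finite difference and equals the mean E[T(x)] = θ
@assert isapprox(finite_difference, gradA(η); atol = 1e-8)
@assert isapprox(gradA(η), 0.25; atol = 1e-12)
```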

ExponentialFamily.isbasemeasureconstant - Function
isbasemeasureconstant(something)

Returns either NonConstantBaseMeasure() or ConstantBaseMeasure() depending on whether the base measure is constant with respect to the natural parameters of something. By default, the package assumes that any base measure given as a Function is not constant. This is not true, however, for a base measure that simply returns a constant. In such cases, isbasemeasureconstant must have a specific method.

See also: getbasemeasure, basemeasure
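
The trait-based default described above can be sketched in plain Julia. The types and functions below (`ToyConstant`, `ToyNonConstant`, `toy_isconstant`, and so on) are hypothetical stand-ins chosen for illustration; they are not the package's own definitions.

```julia
# Hypothetical trait types; the package defines its own
# ConstantBaseMeasure and NonConstantBaseMeasure.
struct ToyConstant end
struct ToyNonConstant end

# Conservative default: any function is assumed to be non-constant.
toy_isconstant(::Function) = ToyNonConstant()

# A base measure that ignores its argument gets a dedicated method.
unit_measure(x) = 1.0
toy_isconstant(::typeof(unit_measure)) = ToyConstant()

# A base measure that depends on x keeps the default.
decaying_measure(x) = exp(-abs(x))

# Code that consumes the trait can avoid repeated evaluations when the
# base measure is known to be constant.
sum_log_measure(h, xs) = sum_log_measure(toy_isconstant(h), h, xs)
sum_log_measure(::ToyConstant, h, xs) = length(xs) * log(h(first(xs)))
sum_log_measure(::ToyNonConstant, h, xs) = sum(x -> log(h(x)), xs)
```

Without the dedicated method, `unit_measure` would fall back to the non-constant branch. The result would still be correct, only computed less efficiently.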

ExponentialFamily._logpdf - Function
_logpdf(ef::ExponentialFamilyDistribution, x)

Evaluates and returns the log-density of the exponential family distribution for the input x.

This inner function dispatches to the appropriate version of _logpdf based on the types of x and ef, utilizing the check_logpdf function. The dispatch mechanism ensures that _logpdf correctly handles the input x, whether it is a single point or a container of points, according to the nature of the exponential family distribution and x.

For instance, with a Univariate distribution, _logpdf evaluates the log-density for a single point if x is a Number, and for a container of points if x is an AbstractVector.

Examples

Evaluate the log-density of a Gamma distribution at a single point:

using ExponentialFamily, Distributions;
gamma = convert(ExponentialFamilyDistribution, Gamma(1, 1))
ExponentialFamily._logpdf(gamma, 1.0)
# output
-1.0

Evaluate the log-density of a Gamma distribution at multiple points:

using ExponentialFamily, Distributions
gamma = convert(ExponentialFamilyDistribution, Gamma(1, 1))
ExponentialFamily._logpdf(gamma, [1, 2, 3])
# output
3-element Vector{Float64}:
 -1.0
 -2.0
 -3.0

For details on the dispatch mechanism of _logpdf, refer to the check_logpdf function.

See also: check_logpdf

ExponentialFamily.check_logpdf - Function
check_logpdf(variate_form, typeof(x), eltype(x), ef, x)

Determines the appropriate strategy for evaluating _logpdf (PointBasedLogpdfCall or MapBasedLogpdfCall) based on the types of x and ef. The dispatch adapts to whether x is a single point or a container of points, in accordance with the characteristics of the exponential family distribution (ef) and the variate form.

Strategies

  • For a Univariate distribution:

    • If x is a Number, _logpdf is invoked with PointBasedLogpdfCall().
    • If x is an AbstractVector containing Numbers, _logpdf is invoked with MapBasedLogpdfCall().
  • For a Multivariate distribution:

    • If x is an AbstractVector containing Numbers, _logpdf is invoked with PointBasedLogpdfCall().
    • If x is an AbstractVector containing AbstractVectors, _logpdf is invoked with MapBasedLogpdfCall().
    • If x is an AbstractMatrix containing Numbers, _logpdf is invoked with MapBasedLogpdfCall(), transforming x to eachcol(x).
  • For a Matrixvariate distribution:

    • If x is an AbstractMatrix containing Numbers, _logpdf is invoked with PointBasedLogpdfCall().
    • If x is an AbstractVector containing AbstractMatrix elements, _logpdf is invoked with MapBasedLogpdfCall().
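
The strategy table above can be mimicked with ordinary multiple dispatch. The following is a toy reconstruction for the univariate and multivariate cases only. All names (`ToyUnivariate`, `PointCall`, `choose_call`, ...) are hypothetical, and the density used is simply that of independent Gamma(1, 1) components, so this is not the package's implementation.

```julia
# Hypothetical variate forms and call strategies (illustration only).
struct ToyUnivariate end
struct ToyMultivariate end
struct PointCall end
struct MapCall end

# Pick a strategy from the variate form and the type of x.
choose_call(::ToyUnivariate, x::Number) = (PointCall(), x)
choose_call(::ToyUnivariate, x::AbstractVector{<:Number}) = (MapCall(), x)
choose_call(::ToyMultivariate, x::AbstractVector{<:Number}) = (PointCall(), x)
choose_call(::ToyMultivariate, x::AbstractVector{<:AbstractVector}) = (MapCall(), x)
choose_call(::ToyMultivariate, x::AbstractMatrix{<:Number}) = (MapCall(), eachcol(x))

# Log-density of a single point: Gamma(1, 1) has log-density -x for x >= 0.
point_logpdf(::ToyUnivariate, x) = -x
point_logpdf(::ToyMultivariate, x) = -sum(x)

# A point call evaluates once; a map call evaluates every element.
evaluate(::PointCall, form, x) = point_logpdf(form, x)
evaluate(::MapCall, form, xs) = map(x -> point_logpdf(form, x), xs)

function toy_logpdf(form, x)
    call, data = choose_call(form, x)
    return evaluate(call, form, data)
end
```

With these definitions, `toy_logpdf(ToyUnivariate(), [1.0, 2.0, 3.0])` maps over the vector, whereas `toy_logpdf(ToyMultivariate(), [1.0, 2.0])` treats the same kind of container as a single point.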

Examples

using ExponentialFamily, Distributions
ExponentialFamily.check_logpdf(Univariate, typeof(1.0), eltype(1.0), Gamma(1, 1), 1.0)
# output
(ExponentialFamily.PointBasedLogpdfCall(), 1.0)

using ExponentialFamily, Distributions
ExponentialFamily.check_logpdf(Univariate, typeof([1.0, 2.0, 3.0]), eltype([1.0, 2.0, 3.0]), Gamma(1, 1), [1.0, 2.0, 3.0])
# output
(ExponentialFamily.MapBasedLogpdfCall(), [1.0, 2.0, 3.0])

See also: _logpdf, PointBasedLogpdfCall, MapBasedLogpdfCall

ExponentialFamily.MapBasedLogpdfCall - Type

A trait object signifying that the _logpdf method should treat its second argument as a container of points from the distribution domain.

Interfacing with Distributions Defined in the Distributions.jl Package

The Distributions.jl package is a comprehensive library that defines a wide collection of standard distributions. The main objective of the Distributions package is to offer a unified interface for evaluating likelihoods of various distributions, along with convenient sampling routines from these distributions. The ExponentialFamily package provides a lightweight interface for a subset of the distributions defined in the Distributions package.

Conversion between Mean Parameters Space and Natural Parameters Space

The Distributions package introduces the params function, which allows the retrieval of parameters for different distributions. For example:

using Distributions, ExponentialFamily

distribution = Bernoulli(0.25)

tuple_of_θ = params(distribution)
(0.25,)

These parameters are typically defined in what's known as the mean parameters space. However, the ExponentialFamilyDistribution expects parameters to be in the natural parameters space. To facilitate conversion between these two representations, the ExponentialFamily package provides two structures: MeanToNatural and NaturalToMean.

To convert from the mean parameters space to the corresponding natural parameters space, you can use the following code:

tuple_of_η = MeanToNatural(Bernoulli)(tuple_of_θ)
(-1.0986122886681098,)

And to convert back:

tuple_of_θ = NaturalToMean(Bernoulli)(tuple_of_η)
(0.25,)
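
For the Bernoulli case the two mappings are the familiar logit and logistic functions, so the numbers above can be checked by hand:

\[\eta = \log\frac{\theta}{1-\theta} = \log\frac{0.25}{0.75} = -\log 3 \approx -1.0986, \qquad \theta = \frac{1}{1+e^{-\eta}} = \frac{1}{1+3} = 0.25\]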

Alternatively, the following API is supported:

map(MeanParametersSpace() => NaturalParametersSpace(), Bernoulli, tuple_of_θ)
(-1.0986122886681098,)

map(NaturalParametersSpace() => MeanParametersSpace(), Bernoulli, tuple_of_η)
(0.25,)

While the ExponentialFamily package employs the respective mappings where needed, it's also possible to call these functions manually. For instance, the generic implementation of the convert function between ExponentialFamilyDistribution and Distribution is built in terms of MeanToNatural and NaturalToMean. Moreover, the convert function performs checks to ensure that the provided parameters and conditioner are suitable for a specific distribution type.

Note on the conditioned distributions

For the conditioned distributions, two additional functions separate_conditioner and join_conditioner are used to separate the conditioner and actual parameters returned from the Distributions.params function.

ExponentialFamily.separate_conditioner - Function
separate_conditioner(::Type{T}, params) where {T <: Distribution}

Separates the conditioner argument from params and returns a tuple of (conditioned_params, conditioner). By default, returns (params, nothing), but this can be overridden for specific distributions.

julia> (cparams, conditioner) = ExponentialFamily.separate_conditioner(Laplace, (0.0, 1.0))
((1.0,), 0.0)

julia> params = ExponentialFamily.join_conditioner(Laplace, cparams, conditioner)
(0.0, 1.0)

julia> Laplace(params...) == Laplace(0.0, 1.0)
true

See also: ExponentialFamily.join_conditioner

ExponentialFamily.join_conditioner - Function
join_conditioner(::Type{T}, params, conditioner) where { T <: Distribution }

Joins the conditioner argument with the params and returns a tuple of joined params, such that it can be used in a constructor of the T distribution.

julia> (cparams, conditioner) = ExponentialFamily.separate_conditioner(Laplace, (0.0, 1.0))
((1.0,), 0.0)

julia> params = ExponentialFamily.join_conditioner(Laplace, cparams, conditioner)
(0.0, 1.0)

julia> Laplace(params...) == Laplace(0.0, 1.0)
true

See also: ExponentialFamily.separate_conditioner

For example, the Laplace distribution defines these functions as follows:

# `params` comes from `Distributions.params(::Laplace)`, which returns (location, scale)
# The `location`, however, is a fixed parameter in the exponential family representation of Laplace
# Hence, we return a tuple of the actual parameters (itself a tuple) and the conditioner
function separate_conditioner(::Type{Laplace}, params)
    location, scale = params
    return ((scale, ), location)
end

# The `join_conditioner` must join the actual parameters and the conditioner in such a way, that it is compatible 
# with the `Laplace` structure from the `Distributions.jl`. In Laplace, the location parameter goes first.
function join_conditioner(::Type{Laplace}, cparams, conditioner) 
    (scale, ) = cparams
    location = conditioner
    return (location, scale)
end
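
To see why the location has to be treated as a conditioner, it can help to write out the Laplace density with the location μ held fixed. The factorization below is a standard derivation and is not taken from the package source, so the package's internal choice of h, T and A may differ by an equivalent rearrangement:

\[f(x \mid \mu, b) = \frac{1}{2b}\exp\left(-\frac{|x-\mu|}{b}\right) = h(x) \cdot \exp\left[ \eta \cdot T(x) - A(\eta) \right], \quad h(x) = 1,\; T(x) = |x-\mu|,\; \eta = -\frac{1}{b},\; A(\eta) = \log\left(-\frac{2}{\eta}\right)\]

The sufficient statistic |x − μ| depends on μ, so μ cannot be absorbed into η; only the scale becomes a natural parameter. As a check against the Laplace(1.0, 0.5) example earlier on this page: η = −2 and A(η) = log 1 = 0, so the log-density at x = 4 is −2 · 3 − 0 = −6.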

In general, all functions defined for the ExponentialFamilyDistribution, such as getlogpartition or getbasemeasure, accept an optional conditioner parameter, which defaults to nothing. Conditioned distributions implement the "conditioned" versions of such functions by explicitly requiring the conditioner parameter, e.g.

getsufficientstatistics(Laplace, 1.0) # explicit `conditioner = 1.0`
(ExponentialFamily.var"#422#423"{Float64}(1.0),)

Efficient packing of the natural parameters into a vectorized form

The ExponentialFamilyDistribution type stores its natural parameters in a vectorized, or packed, format. This is done for the sake of efficiency and to enhance compatibility with autodiff packages like ForwardDiff, which anticipate a single parameter vector. As a result, the tuple of natural parameters needs to be converted to its corresponding vectorized form and vice versa. To achieve this, the package provides the flatten_parameters, pack_parameters and unpack_parameters functions.

ExponentialFamily.flatten_parameters - Function
flatten_parameters(::Type{T}, params::Tuple)

This function returns the parameters of a distribution of type T in a flattened form without actually allocating the container.

ExponentialFamily.pack_parameters - Function
pack_parameters([ space ], ::Type{T}, params::Tuple)

This function returns the parameters of a distribution of type T in a vectorized (packed) form. For most distributions, the packed versions have the same structure in any parameters space. For some distributions, however, it is necessary to indicate the space of the packed parameters.

julia> ExponentialFamily.pack_parameters((1, [2.0, 3.0], [4.0 5.0 6.0; 7.0 8.0 9.0]))
9-element Vector{Float64}:
 1.0
 2.0
 3.0
 4.0
 7.0
 5.0
 8.0
 6.0
 9.0

ExponentialFamily.unpack_parameters - Function
unpack_parameters([ space ], ::Type{T}, parameters)

This function "unpacks" the vectorized form of the parameters into a tuple. For most distributions, the packed parameters have the same structure in any parameters space. For some distributions, however, it is necessary to indicate the space of the packed parameters.

See also: MeanParametersSpace, NaturalParametersSpace

These functions are not exported by default. Note that the ExponentialFamilyDistribution type does not store the parameter tuple internally; instead, the getnaturalparameters function returns the vectorized (packed) form of the natural parameters. In general, only the ExponentialFamily.unpack_parameters function must be implemented, as the others can be implemented generically.
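
The packing idea can be sketched without the package. The helpers below (`flat`, `toy_pack`, `toy_unpack`) are hypothetical, and the unpacking layout, a length-d vector followed by a d×d matrix, is an assumption chosen only for illustration.

```julia
# Hypothetical helpers; not the package's implementation.
# Turn a scalar or an array into a flat Float64 vector (column-major order).
flat(p::Number) = [float(p)]
flat(p::AbstractArray) = vec(float.(p))

# Pack a tuple of parameters into one vector.
toy_pack(params::Tuple) = reduce(vcat, map(flat, params))

# Unpack a vector laid out as (length-d vector, d×d matrix).
function toy_unpack(packed::AbstractVector)
    n = length(packed)
    d = (isqrt(1 + 4n) - 1) ÷ 2     # positive solution of d + d^2 = n
    d + d^2 == n || throw(ArgumentError("unexpected packed length $n"))
    v = packed[1:d]
    M = reshape(packed[(d + 1):end], d, d)
    return (v, M)
end

params = ([1.0, 2.0], [3.0 5.0; 4.0 6.0])
packed = toy_pack(params)           # matrix entries appear column by column

# Packing followed by unpacking recovers the original tuple
@assert toy_unpack(packed) == params
```

A single flat vector is the shape that autodiff tools typically expect, which is the motivation given above for storing parameters in packed form.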

Attributes of the exponential family distribution based on Distribution

The ExponentialFamilyDistribution{T} where { T <: Distribution } type encompasses all fundamental attributes of the exponential family, including basemeasure, logpartition, sufficientstatistics, and fisherinformation. Furthermore, it's possible to retrieve the actual functions that compute these attributes. For instance, consider the following example:

basemeasure_of_bernoulli = getbasemeasure(Bernoulli)

basemeasure_of_bernoulli(0)
1

ExponentialFamily.getbasemeasure - Method
getbasemeasure(::Type{<:Distribution}, [ conditioner ])

A specific version of getbasemeasure defined for distribution types from the Distributions.jl package. It does not require an instance of ExponentialFamilyDistribution and can be called directly with a specific distribution type instead. Conditional exponential family distributions require an extra conditioner argument.

ExponentialFamily.getsufficientstatistics - Method
getsufficientstatistics(::Type{<:Distribution}, [ conditioner ])

A specific version of getsufficientstatistics defined for distribution types from the Distributions.jl package. It does not require an instance of ExponentialFamilyDistribution and can be called directly with a specific distribution type instead. Conditional exponential family distributions require an extra conditioner argument.

ExponentialFamily.getlogpartition - Method
getlogpartition([ space = NaturalParametersSpace() ], ::Type{T}, [ conditioner ]) where { T <: Distribution }

A specific version of getlogpartition defined for distribution types from the Distributions.jl package. It does not require an instance of ExponentialFamilyDistribution and can be called directly with a specific distribution type instead. Optionally, it accepts the space parameter, which defines the parameters space. Conditional exponential family distributions require an extra conditioner argument.

See also: NaturalParametersSpace, MeanParametersSpace

ExponentialFamily.getfisherinformation - Method
getfisherinformation([ space = NaturalParametersSpace() ], ::Type{T}) where { T <: Distribution }

A specific version of getfisherinformation defined for distribution types from the Distributions.jl package. It does not require an instance of ExponentialFamilyDistribution and can be called directly with a specific distribution type instead. Optionally, it accepts the space parameter, which defines the parameters space. Conditional exponential family distributions require an extra conditioner argument.

See also: NaturalParametersSpace, MeanParametersSpace

Certain functions require knowledge about which parameter space is being used. By default, the NaturalParametersSpace is assumed.

getlogpartition(Bernoulli) === getlogpartition(NaturalParametersSpace(), Bernoulli)
true

ExponentialFamily.NaturalParametersSpace - Type
NaturalParametersSpace

Specifies the natural parameters space η as the desired parameters space. Some functions (such as logpartition or fisherinformation) accept an additional space parameter to disambiguate the desired parameters space. Use map(NaturalParametersSpace() => MeanParametersSpace(), T, parameters, conditioner) to map the parameters and the conditioner of a distribution of type T from the natural parametrization to the corresponding mean parametrization.

See also: MeanParametersSpace, getmapping, NaturalToMean, MeanToNatural

ExponentialFamily.MeanParametersSpace - Type
MeanParametersSpace

Specifies the mean parameters space θ as the desired parameters space. Some functions (such as logpartition or fisherinformation) accept an additional space parameter to disambiguate the desired parameters space. Use map(MeanParametersSpace() => NaturalParametersSpace(), T, parameters, conditioner) to map the parameters and the conditioner of a distribution of type T from the mean parametrization to the corresponding natural parametrization.

See also: NaturalParametersSpace, getmapping, NaturalToMean, MeanToNatural

The isbasemeasureconstant function is defined for all supported distributions as well.

isbasemeasureconstant(Bernoulli)
ConstantBaseMeasure()

Extra defined distributions

The package also defines a number of extra distributions that are more efficient in certain circumstances. The list is available here.