# Kullback-Leibler Divergences
Using the `Distribution` type provided by Distributions.jl, the `KullbackLeibler` method offers a convenient way of computing the Kullback-Leibler divergence between distributions. In several cases, an analytical expression for the Kullback-Leibler divergence is known; these include the (univariate and multivariate) Normal, Cauchy, Exponential, Weibull and Gamma distributions.

Furthermore, for distributions over a one-dimensional domain for which no analytic result is known, `KullbackLeibler` rephrases the integral as an ODE and employs an efficient integration scheme from the DifferentialEquations.jl suite. For multivariate distributions, Monte Carlo integration is used.
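The one-dimensional reduction can be illustrated in plain Julia: the integral $D_\text{KL}(p\,\|\,q) = \int p(x)\log(p(x)/q(x))\,\mathrm{d}x$ is recast as the initial value problem $u'(x) = p(x)\log(p(x)/q(x))$, $u(a)=0$, whose solution at $b$ is the divergence. The following sketch (not the package's implementation; the fixed-step RK4 loop merely stands in for the adaptive scheme from DifferentialEquations.jl) checks the result against the closed form for two normal distributions.

```julia
# Sketch: the 1-D KL integral viewed as an ODE, u'(x) = p(x)*log(p(x)/q(x)),
# integrated with a fixed-step RK4 loop (stand-in for DifferentialEquations.jl).

normalpdf(x, mu, sigma) = exp(-0.5*((x - mu)/sigma)^2) / (sigma*sqrt(2pi))

p(x) = normalpdf(x, 0.0, 1.0)   # N(0, 1)
q(x) = normalpdf(x, 1.0, 2.0)   # N(1, 2)

# Integrand of the KL divergence; guard against p(x) underflowing to zero.
f(x) = (px = p(x); px > 0 ? px*log(px/q(x)) : 0.0)

function rk4_integrate(f, a, b; n=10_000)
    h = (b - a)/n
    u, x = 0.0, a
    for _ in 1:n
        k1 = f(x)
        k2 = f(x + h/2)
        k3 = k2            # f depends on x only, so the two midpoint stages coincide
        k4 = f(x + h)
        u += h*(k1 + 2k2 + 2k3 + k4)/6
        x += h
    end
    u
end

kl_numeric = rk4_integrate(f, -30.0, 30.0)

# Closed form: KL(N(μ1,σ1²) ‖ N(μ2,σ2²)) = log(σ2/σ1) + (σ1² + (μ1-μ2)²)/(2σ2²) - 1/2
kl_exact = log(2.0) + (1.0 + 1.0)/(2*4.0) - 0.5
```

Here `kl_numeric` agrees with `kl_exact` to high precision, which is the same consistency check one can perform against the analytic cases listed above.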
Examples of use:

```julia
KullbackLeibler(Cauchy(1.,2.4), Normal(-4,0.5), HyperCube([-100,100]); Carlo=false, tol=1e-12)
KullbackLeibler(MvNormal([0,2.5],diagm([1,4.])), MvTDist(1,[3,2],diagm([2.,3.])), HyperCube([[-50,50],[-50,50]]); N=Int(1e8))
```
In addition, it is of course also possible to input generic functions, whose positivity and normalization should be ensured by the user.
`InformationGeometry.KullbackLeibler` — Method

```julia
KullbackLeibler(p::Function, q::Function, Domain::HyperCube=HyperCube([[-15,15]]); tol=2e-15, N::Int=Int(3e7), Carlo::Bool=(Domain.dim!=1))
```

Computes the Kullback-Leibler divergence between two probability distributions `p` and `q` over the `Domain`. If `Carlo=true`, this is done using a Monte Carlo simulation with `N` samples. If the `Domain` is one-dimensional, the calculation is performed without Monte Carlo to a tolerance of ≈ `tol`.
For example, the Kullback-Leibler divergence between a Cauchy distribution with $\mu=1$ and $s=2$ and a normal (i.e. Gaussian) distribution with $\mu=-4$ and $\sigma=1/2$ can be calculated via:
```julia
using InformationGeometry # hide
using LinearAlgebra, Distributions
KullbackLeibler(Cauchy(1.,2.), Normal(-4.,0.5), HyperCube([-100,100]); Carlo=false, tol=1e-12)
```
Specifically, the keyword arguments used here request a numerical computation of the divergence over the domain $[-100,100]$ to an accuracy of $10^{-12}$.
The domain of the integral involved in the computation of the divergence is specified using the `HyperCube` datatype, which stores a cuboid region in $N$ dimensions as a vector of intervals.
`InformationGeometry.HyperCube` — Type

The `HyperCube` type has the fields `vals::Vector{Vector}`, which stores the intervals that define the hypercube, and `dim::Int`, which gives its dimension. Overall, it just offers a convenient and standardized way of passing domains for integration or plotting between functions without having to check that these domains are sensible every time. Examples of constructing `HyperCube`s:
```julia
HyperCube([[1,3],[pi,2pi],[-500.0,100.0]])
HyperCube([[-1,1]])
HyperCube([-1,1])
HyperCube(LowerUpper([-1,-5],[0,-4]))
HyperCube(collect([-7,7.] for i in 1:3))
```
The `HyperCube` type is closely related to the `LowerUpper` type and they can easily be converted into each other. Examples of quantities that can be computed from, and operations involving, a `HyperCube` object `X`:
```julia
CubeVol(X)
TranslateCube(X, v::Vector)
CubeWidths(X)
```
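To make the semantics of these operations concrete, here is a minimal stand-in type (a hypothetical re-implementation for illustration only; the names `Cube`, `cubevol`, `cubewidths` and `translate` are not the package's API):

```julia
# Minimal stand-in for a HyperCube-like type: a cuboid stored as a
# vector of [lower, upper] intervals, one per dimension.
struct Cube
    vals::Vector{Vector{Float64}}
end

cubewidths(X::Cube) = [v[2] - v[1] for v in X.vals]   # edge length per dimension
cubevol(X::Cube)    = prod(cubewidths(X))              # volume of the cuboid

# Shift the whole cuboid by a vector v without changing its shape.
translate(X::Cube, v::Vector) = Cube([[iv[1]+t, iv[2]+t] for (iv, t) in zip(X.vals, v)])

X = Cube([[1.0, 3.0], [0.0, 2.0], [-1.0, 1.0]])
cubevol(X)                                   # 2 * 2 * 2 = 8.0
cubewidths(translate(X, [5.0, 0.0, 0.0]))    # widths are unchanged by translation
```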
Furthermore, the Kullback-Leibler divergence between multivariate distributions can be computed for example by

```julia
KullbackLeibler(MvNormal([0,2.5],diagm([1,4.])), MvTDist(1,[3,2],diagm([2.,3.])), HyperCube([[-50,50],[-50,50]]); N=Int(5e6))
```
where it now becomes necessary to employ Monte Carlo schemes. Specifically, the keyword argument `N` now determines the number of points at which the integrand is evaluated over the domain $[-50,50] \times [-50,50]$.
So far, importance sampling has not been implemented for the Monte Carlo integration. Instead, the domain is sampled uniformly.
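The uniform-sampling scheme can be sketched as follows (a plain-Julia illustration of the estimator, not the package's code; the helper `mc_kl` and the test densities are assumptions for this example). Drawing $x_i$ uniformly from a cuboid of volume $V$ gives the estimate $D \approx \frac{V}{N}\sum_i p(x_i)\log(p(x_i)/q(x_i))$.

```julia
using Random

# Uniform Monte Carlo estimate of D_KL(p||q) over a cuboid domain:
#   D ≈ (V/N) * Σ p(x_i) log(p(x_i)/q(x_i)),  x_i ~ Uniform(domain).
function mc_kl(p, q, lo::Vector{Float64}, hi::Vector{Float64}; N=10^6, rng=MersenneTwister(1))
    dim = length(lo)
    V = prod(hi .- lo)          # volume of the sampling cuboid
    acc = 0.0
    x = zeros(dim)
    for _ in 1:N
        for k in 1:dim          # draw one uniform point in the cuboid
            x[k] = lo[k] + rand(rng)*(hi[k] - lo[k])
        end
        px = p(x)
        px > 0 && (acc += px*log(px/q(x)))
    end
    V*acc/N
end

# Two isotropic 2-D Gaussians, written as products of 1-D densities.
npdf(t, mu, s) = exp(-0.5*((t - mu)/s)^2)/(s*sqrt(2pi))
p(x) = npdf(x[1], 0.0, 1.0)*npdf(x[2], 0.0, 1.0)
q(x) = npdf(x[1], 1.0, 1.0)*npdf(x[2], 1.0, 1.0)

est = mc_kl(p, q, [-10.0, -10.0], [10.0, 10.0]; N=10^6)
# Closed form for equal unit covariances: ‖μ_p - μ_q‖²/2 = 1.0
```

Because the samples are uniform rather than drawn from a proposal concentrated where $p$ is large, the variance of this estimator grows with the volume of the domain, which is exactly what importance sampling would mitigate.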