Kullback-Leibler Divergences
Using the `Distribution` types provided by Distributions.jl, the `KullbackLeibler` method offers a convenient way of computing the Kullback-Leibler divergence between distributions. In several cases, an analytical expression for the Kullback-Leibler divergence is known; these include the (univariate and multivariate) Normal, Cauchy, Exponential, Weibull, and Gamma distributions.
Furthermore, for distributions over a one-dimensional domain for which no analytic result is known, `KullbackLeibler` rephrases the integral in terms of an ODE and employs an efficient integration scheme from the DifferentialEquations.jl suite. For multivariate distributions, Monte Carlo integration is used.
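To make the closed-form case concrete, the well-known analytic expression for the Kullback-Leibler divergence between two univariate normal distributions can be checked against direct numerical integration of the defining integral $\int p \log(p/q)$. The following sketch uses Python with SciPy purely for illustration; it does not use InformationGeometry.jl, and all function names are ours:

```python
import numpy as np
from scipy.integrate import quad

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def kl_normal_analytic(mu1, s1, mu2, s2):
    # Closed-form KL( N(mu1, s1^2) || N(mu2, s2^2) )
    return np.log(s2 / s1) + (s1**2 + (mu1 - mu2) ** 2) / (2 * s2**2) - 0.5

def kl_numeric(p, q, lo, hi):
    # Direct quadrature of the defining integral  ∫ p(x) log(p(x)/q(x)) dx
    val, _ = quad(lambda x: p(x) * np.log(p(x) / q(x)), lo, hi, limit=200)
    return val

p = lambda x: normal_pdf(x, 0.0, 1.0)
q = lambda x: normal_pdf(x, 1.0, 2.0)
analytic = kl_normal_analytic(0.0, 1.0, 1.0, 2.0)   # log(2) + 2/8 - 1/2
numeric = kl_numeric(p, q, -30.0, 30.0)             # tails beyond ±30 are negligible
print(analytic, numeric)
```

The two values agree to high precision, which is exactly why the package can skip numerical integration whenever such a closed form is available.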
Examples of use:
KullbackLeibler(Cauchy(1.,2.4), Normal(-4,0.5), HyperCube([-100,100]); tol=1e-12)
KullbackLeibler(MvNormal([0,2.5],diagm([1,4.])), MvTDist(1,[3,2],diagm([2.,3.])), HyperCube([[-50,50],[-50,50]]); Carlo=true, N=Int(1e8))
In addition, it is also possible to pass generic functions as the distributions, in which case their positivity and normalization must be ensured by the user.
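The role of these user-side responsibilities can be illustrated with a small Python sketch (again not using InformationGeometry.jl; `kl_divergence` is a hypothetical helper of ours) that checks normalization before integrating the divergence between two generic densities:

```python
import numpy as np
from scipy.integrate import quad

def kl_divergence(p, q, lo, hi):
    """KL divergence ∫ p log(p/q) between two generic densities on [lo, hi].

    The caller is responsible for p and q being positive on the domain;
    normalization is sanity-checked here before integrating.
    """
    mass_p, _ = quad(p, lo, hi)
    mass_q, _ = quad(q, lo, hi)
    assert abs(mass_p - 1.0) < 1e-8 and abs(mass_q - 1.0) < 1e-8, \
        "densities must integrate to 1 over the domain"
    val, _ = quad(lambda x: p(x) * np.log(p(x) / q(x)), lo, hi)
    return val

# Example: triangular density on [0, 2] versus the uniform density on [0, 2];
# the exact answer here is log(2) - 1/2.
p = lambda x: 1.0 - abs(x - 1.0)   # triangle peaking at x = 1
q = lambda x: 0.5                  # uniform on [0, 2]
kl = kl_divergence(p, q, 0.0, 2.0)
print(kl)
```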
InformationGeometry.KullbackLeibler — Method

KullbackLeibler(p::Function, q::Function, Domain::HyperCube=HyperCube([-15,15]); tol=2e-15, N::Int=Int(3e7), Carlo::Bool=(length(Domain)!=1))

Computes the Kullback-Leibler divergence between two probability distributions `p` and `q` over the `Domain`. If `Carlo=true`, this is done using a Monte Carlo simulation with `N` samples. If the `Domain` is one-dimensional, the calculation is performed without Monte Carlo to a tolerance of ≈ `tol`.
For example, the Kullback-Leibler divergence between a Cauchy distribution with $\mu=1$ and $s=2$ and a normal (i.e. Gaussian) distribution with $\mu=-4$ and $\sigma=1/2$ can be calculated via:
using InformationGeometry # hide
using LinearAlgebra, Distributions
KullbackLeibler(Cauchy(1.,2.), Normal(-4.,0.5), HyperCube([-100,100]); tol=1e-12)
Specifically, the keyword arguments used here numerically compute the divergence over the domain $[-100,100]$ to a tolerance of $\approx 10^{-12}$.
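The ODE reformulation mentioned above for the one-dimensional case can be sketched as follows: the running integral $D(t) = \int_a^t p \log(p/q)\,dx$ satisfies $D'(t) = p(t)\log(p(t)/q(t))$ with $D(a) = 0$, so an adaptive ODE solver yields the divergence as $D(b)$. The sketch below illustrates this idea in Python with SciPy rather than DifferentialEquations.jl, for the same Cauchy-vs-Normal pair; working with log-densities avoids overflow in the ratio $p/q$ far from the Gaussian's mean:

```python
import numpy as np
from scipy.integrate import quad, solve_ivp

def cauchy_logpdf(x, mu, s):
    return np.log(s / np.pi) - np.log(s**2 + (x - mu) ** 2)

def normal_logpdf(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

# Integrand of KL( Cauchy(1, 2) || Normal(-4, 0.5) ), via log-densities
def integrand(x):
    lp = cauchy_logpdf(x, 1.0, 2.0)
    lq = normal_logpdf(x, -4.0, 0.5)
    return np.exp(lp) * (lp - lq)

# Recast ∫_a^t integrand dx as an ODE:  D'(t) = integrand(t),  D(a) = 0
sol = solve_ivp(lambda t, D: [integrand(t)], (-100.0, 100.0), [0.0],
                rtol=1e-10, atol=1e-12)
kl_ode = sol.y[0, -1]

# Cross-check against ordinary adaptive quadrature over the same domain
kl_quad, _ = quad(integrand, -100.0, 100.0, limit=500)
print(kl_ode, kl_quad)
```

Note that this particular divergence is only finite because the domain is finite: the heavy Cauchy tails against the fast-decaying Gaussian make the integrand approach a nonzero constant for large $|x|$, which is why a finite HyperCube must be supplied.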
The domain of the integral involved in the computation of the divergence is specified using the `HyperCube` datatype, which stores a cuboid region in $N$ dimensions as a vector of intervals.
InformationGeometry.HyperCube — Type

The `HyperCube` type is used to specify a cuboid region in the form of a cartesian product of $N$ real intervals, thereby offering a convenient way of passing domains for integration or plotting between functions. A `HyperCube` object `cube` has two fields: `cube.L` and `cube.U`, two vectors which respectively store the lower and upper boundaries of the real intervals in order. Examples for constructing `HyperCube`s:
HyperCube([[1,3],[pi,2π],[-500,100]])
HyperCube([1,π,-500],[3,2π,100])
HyperCube([[-1,1]])
HyperCube([-1,1])
HyperCube(collect([-7,7.] for i in 1:3))
Examples of quantities that can be computed from, and operations involving, a `HyperCube` object `X`:
CubeVol(X)
TranslateCube(X,v::Vector)
CubeWidths(X)
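For a feel of the underlying data layout and of what these three operations compute, here is a hypothetical, minimal Python analogue of the `HyperCube` type (the actual Julia implementation may differ in details):

```python
import numpy as np

class HyperCube:
    """Cartesian product of real intervals, stored as lower/upper bound vectors."""
    def __init__(self, L, U):
        self.L = np.asarray(L, dtype=float)  # lower boundaries, in order
        self.U = np.asarray(U, dtype=float)  # upper boundaries, in order
        assert np.all(self.L <= self.U), "each interval needs L[i] <= U[i]"

def CubeWidths(X):
    return X.U - X.L                         # side length of each interval

def CubeVol(X):
    return float(np.prod(X.U - X.L))         # volume of the cuboid

def TranslateCube(X, v):
    return HyperCube(X.L + np.asarray(v), X.U + np.asarray(v))

X = HyperCube([1.0, np.pi, -500.0], [3.0, 2 * np.pi, 100.0])
print(CubeWidths(X))                         # widths 2, pi, 600
print(CubeVol(X))                            # 2 * pi * 600
Y = TranslateCube(X, [1.0, 0.0, 0.0])
print(Y.L[0], Y.U[0])
```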
Furthermore, the Kullback-Leibler divergence between multivariate distributions can be computed for example by
KullbackLeibler(MvNormal([0,2.5],diagm([1,4.])), MvTDist(1,[3,2],diagm([2.,3.])), HyperCube([[-20,50],[-20,50]]); tol=1e-8)
using adaptive integration methods from HCubature.jl. Alternatively, Monte Carlo integration can be employed by specifying `Carlo=true`:
KullbackLeibler(MvNormal([0,2.5],diagm([1,4.])), MvTDist(1,[3,2],diagm([2.,3.])), HyperCube([[-50,50],[-50,50]]); Carlo=true, N=Int(5e6))
In this case, the keyword argument `N` determines the number of points at which the integrand is evaluated over the given domain $[-50,50] \times [-50,50]$.
So far, importance sampling has not been implemented for the Monte Carlo integration. Instead, the domain is sampled uniformly.
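The uniform-sampling estimator described here is simply the domain volume times the sample mean of the integrand. As an illustration (in Python, not using the package; all names are ours), the following sketch estimates the divergence between two bivariate normals with identity covariance, where the closed form $\tfrac{1}{2}\lVert \mu_p - \mu_q \rVert^2$ is available for checking:

```python
import numpy as np

def kl_monte_carlo(logp, logq, lows, highs, N, seed=0):
    """Uniform-sampling Monte Carlo estimate of ∫ p log(p/q) over a cuboid."""
    rng = np.random.default_rng(seed)
    lows, highs = np.asarray(lows, float), np.asarray(highs, float)
    x = rng.uniform(lows, highs, size=(N, len(lows)))  # uniform samples, no importance sampling
    lp = logp(x)
    vals = np.exp(lp) * (lp - logq(x))                 # integrand p * log(p/q)
    volume = np.prod(highs - lows)                     # estimator: volume * mean(integrand)
    return volume * vals.mean()

def mvn_logpdf(x, mu, var):
    # Isotropic bivariate normal with covariance var * I
    d = x - mu
    return -0.5 * np.sum(d * d, axis=-1) / var - np.log(2 * np.pi * var)

logp = lambda x: mvn_logpdf(x, np.array([0.0, 2.5]), 1.0)
logq = lambda x: mvn_logpdf(x, np.array([1.0, 0.0]), 1.0)
est = kl_monte_carlo(logp, logq, [-10.0, -10.0], [12.0, 12.0], N=1_000_000)
# Closed form for equal covariance I:  0.5 * ||mu_p - mu_q||^2 = 0.5 * (1 + 2.5^2)
print(est)
```

Because no importance sampling is used, the statistical error scales with the domain volume, so choosing a HyperCube much larger than the support of `p` wastes samples.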