Confidence Regions
InformationGeometry.DataSet
— TypeThe DataSet
type is a container for data points. It holds 3 vectors x
, y
, sigma
where the components of sigma
quantify the standard deviation associated with each measurement. For example,
DS = DataSet([1,2,3.],[4,5,6.5],[0.5,0.45,0.6])
Its fields can be obtained via xdata(DS)
, ydata(DS)
, sigma(DS)
.
InformationGeometry.DataModel
— TypeIn addition to a DataSet
, a DataModel
contains the model as a function model(x,θ)
and its derivative dmodel(x,θ)
where x
denotes the x-value of the data and θ
is a vector of parameters on which the model depends. Crucially, dmodel
contains the derivatives of the model with respect to the parameters θ
, not the x-values. For example
DS = DataSet([1,2,3.],[4,5,6.5],[0.5,0.45,0.6])
model(x,θ::Vector) = θ[1] .* x .+ θ[2]
DM = DataModel(DS,model)
If provided like this, the gradient of the model with respect to the parameters θ
(i.e. its "Jacobian") will be calculated using automatic differentiation. Alternatively, an explicit analytic expression for the Jacobian can be specified by hand:
function dmodel(x,θ::Vector)
J = Array{Float64}(undef, length(x), length(θ))
@. J[:,1] = x # ∂(model)/∂θ₁
@. J[:,2] = 1. # ∂(model)/∂θ₂
return J
end
DM = DataModel(DS,model,dmodel)
The output of the Jacobian must be a matrix whose columns correspond to the partial derivatives with respect to different components of θ
and whose rows correspond to evaluations at different values of x
.
StatsBase.loglikelihood
— Methodloglikelihood(DM::DataModel, θ::Vector)
Calculates the logarithm of the likelihood $\ell(\mathrm{data} \, | \, \theta) \coloneqq \mathrm{ln} \big( L(\mathrm{data} \, | \, \theta) \big)$ given a DataModel
and a parameter configuration $\theta$.
Missing docstring for FindMLE(::DataModel)
. Check Documenter's build log for details.
using InformationGeometry, Plots; gr() # hide
DS = DataSet([1,2,3.],[4,5,6.5],[0.5,0.45,0.6])
model(x,θ) = θ[1] .* x .+ θ[2]
DM = DataModel(DS,model)
MLE = FindMLE(DM)
Depending on how the parameters $\theta$ enter into the model, the shapes of confidence regions associated with the model may be distorted. For the linearly parametrized model shown above, the $1 \sigma$ and $2 \sigma$ confidence regions form perfect hyperellipses as expected:
sols = GenerateMultipleIntervals(DM,1:2)
VisualizeSols(sols)
# plot(sols[1],vars=(1,2),label="1σ CR",title="Confidence Regions for linearly parametrized model", xlabel="θ[1]", ylabel="θ[2]") # hide
# plot!(sols[2],vars=(1,2),label="2σ CR") # hide
# scatter!([MLE[1]],[MLE[2]],marker=:c,label="MLE") # hide
# savefig("../assets/sols.svg"); nothing # hide
For a non-linearly parametrized model, the confidence regions are found to be non-ellipsoidal:
model2(x,θ) = θ[1]^3 .* x .+ exp(θ[1] + θ[2])
DM2 = DataModel(DS,model2)
sols2 = GenerateMultipleIntervals(DM2,1:2)
VisualizeSols(sols2)
#plot(sols2[1],vars=(1,2),label="1σ CR",title="Confidence Regions for non-linearly parametrized model", xlabel="θ[1]", ylabel="θ[2]") # hide
#plot!(sols2[2],vars=(1,2),label="2σ CR") # hide
#MLE2 = FindMLE(DM2); scatter!([MLE2[1]],[MLE2[2]],marker=:c,label="MLE") # hide
#savefig("../assets/sols2.svg"); nothing # hide
Specifically in the case of two-dimensional parameter spaces as shown here, the problem of finding the exact boundaries of the confidence regions is turned into a system of ordinary differential equations and subsequently solved using the DifferentialEquations.jl suite. As a result, the boundaries of the confidence regions are obtained in the form of ODESolution
objects, which come equipped with elaborate interpolation methods.
Since both finding and visualizing exact confidence regions for models depending on more than two parameters (i.e. $\mathrm{dim} \, \mathcal{M} > 2$) is more challenging from a technical perspective, the above methods only work for $\mathrm{dim} \, \mathcal{M} = 2$ at this point in time. However, methods which allow for visualizations of confidence regions in arbitrary three-dimensional subspaces of parameter manifolds of any dimension are close to being finished and will follow soon.
Various geometric quantities which are intrinsic to the parameter manifold $\mathcal{M}$ can be computed as a result of the Fisher metric $g$ (and subsequent choice of the Levi-Civita connection) such as the Riemann and Ricci tensors and the Ricci scalar $R$.
InformationGeometry.FisherMetric
— MethodFisherMetric(DM::DataModel, θ::Vector{<:Number})
Computes the Fisher metric $g$ given a DataModel
and a parameter configuration $\theta$ under the assumption that the likelihood $L(\mathrm{data} \, | \, \theta)$ is a multivariate normal distribution.
InformationGeometry.GeometricDensity
— MethodGeometricDensity(DM::DataModel, θ::Vector)
Computes the square root of the determinant of the Fisher metric $\sqrt{\mathrm{det}\big(g(\theta)\big)}$ at the point $\theta$.
InformationGeometry.ChristoffelSymbol
— MethodChristoffelSymbol(DM::DataModel, θ::Vector; BigCalc::Bool=false)
ChristoffelSymbol(Metric::Function, θ::Vector; BigCalc::Bool=false)
Calculates the components of the (1,2) Christoffel symbol $\Gamma$ at a point $\theta$ (i.e. the Christoffel symbol "of the second kind") through finite differencing of the Metric
. Accurate to ≈ 3e-11. BigCalc=true
increases accuracy through BigFloat calculation.
InformationGeometry.Riemann
— MethodRiemann(DM::DataModel, θ::Vector; BigCalc::Bool=false)
Riemann(Metric::Function, θ::Vector; BigCalc::Bool=false)
Calculates the components of the (1,3) Riemann tensor by finite differencing of the Metric
. BigCalc=true
increases accuracy through BigFloat calculation.
InformationGeometry.Ricci
— MethodRicci(DM::DataModel, θ::Vector; BigCalc::Bool=false)
Ricci(Metric::Function, θ::Vector; BigCalc::Bool=false)
Calculates the components of the (0,2) Ricci tensor by finite differencing of Metric
. BigCalc=true
increases accuracy through BigFloat calculation.
InformationGeometry.RicciScalar
— MethodRicciScalar(DM::DataModel, θ::Vector; BigCalc::Bool=false)
RicciScalar(Metric::Function, θ::Vector; BigCalc::Bool=false)
Calculates the Ricci scalar by finite differencing of the Metric
. BigCalc=true
increases accuracy through BigFloat
calculation.
Further, studying the geodesics / autoparallels on a manifold can yield enlightening insights about its geometry.
InformationGeometry.Score
— MethodScore(DM::DataModel, θ::Vector{<:Number}; Auto::Bool=false)
Calculates the gradient of the log-likelihood with respect to a set of parameters p
. Auto=true
uses automatic differentiation.
InformationGeometry.AIC
— MethodAIC(DM::DataModel, θ::Vector)
Calculates the Akaike Information Criterion given a parameter configuration $\theta$ defined by $\mathrm{AIC} = -2 \, \ell(\mathrm{data} \, | \, \theta) + 2 \, \mathrm{length}(\theta)$.
InformationGeometry.GeodesicDistance
— MethodGeodesicDistance(DM::DataModel,P::Vector{<:Real},Q::Vector{<:Real}; tol::Real=1e-10)
GeodesicDistance(Metric::Function,P::Vector{<:Real},Q::Vector{<:Real}; tol::Real=1e-10)
Computes the length of a geodesic connecting the points P
and Q
.
To be continued...