Docstrings · InformationGeometry.jl

InformationGeometry.CompositeDataSet — Type

The CompositeDataSet type is a more elaborate (and typically less performant) container for storing data. Essentially, it splits observed data which has multiple y-components into separate data containers (e.g. of type DataSet), each of which corresponds to one of the components of the y-data. Crucially, each of the smaller data containers still shares the same "kind" of x-data, that is, the same xdim, units and so on, although they do not need to share the exact same particular x-data.

The main advantage of this approach is that it can be applied when there are missing y-components in some observations. A typical use case for CompositeDataSets are time series where multiple quantities are tracked but not every quantity is necessarily recorded at each time step. Example:

using DataFrames
t = [1,2,3,4]
y₁ = [2.5, 6, missing, 9];      y₂ = [missing, 5, 3.1, 1.4]
σ₁ = 0.3*ones(4);               σ₂ = [missing, 0.2, 0.1, 0.5]
df = DataFrame([t y₁ σ₁ y₂ σ])

xdim = 1;   ydim = 2
CompositeDataSet(df, xdim, ydim; xerrs=false, stripedYs=true)

The boolean-valued keywords stripedXs and stripedYs can be used to indicate to the constructor whether the values and corresponding $1\sigma$ uncertainties are given in alternating order, or whether the initial block of ydim many columns are the values and the second ydim many columns are the corresponding uncertainties. Also, xerrs=true can be used to indicate that the x-values also carry uncertainties. Basically all functions which can be called on other data containers such as DataSet have been specialized to also work with CompositeDataSets.

InformationGeometry.DataModel — Type

In addition to storing a DataSet, a DataModel also contains a function model(x,θ) and its derivative dmodel(x,θ) where x denotes the x-value of the data and θ is a vector of parameters on which the model depends. Crucially, dmodel contains the derivatives of the model with respect to the parameters θ, not the x-values. For example

DS = DataSet([1,2,3,4], [4,5,6.5,7.8], [0.5,0.45,0.6,0.8])
model(x::Number, θ::AbstractVector{<:Number}) = θ[1] * x + θ[2]
DM = DataModel(DS, model)

In cases where the output of the model has more than one component (i.e. ydim > 1), it is advisable to define the model function in such a way that it outputs static vectors using StaticArrays.jl for increased performance. For ydim = 1, InformationGeometry.jl expects the model to output a number instead of a vector with one component. In contrast, the parameter configuration θ must always be supplied as a vector (even if it only has a single component).

An initial guess for the maximum likelihood parameters can optionally be passed to the DataModel as a vector via

DM = DataModel(DS, model, [1.0,2.5])

During the construction of a DataModel process which includes the search for the maximum likelihood estimate $\theta_\text{MLE}$, multiple tests are run. If necessary, these tests can be skipped by appending true as the last argument in the constructor:

DM = DataModel(DS, model, [-Inf,π,1], true)

If a DataModel is constructed as shown in the above examples, the gradient of the model with respect to the parameters θ (i.e. its "Jacobian") will be calculated using automatic differentiation. Alternatively, an explicit analytic expression for the Jacobian can be specified by hand:

using StaticArrays
function dmodel(x::Number, θ::AbstractVector{<:Number})
   @SMatrix [x  1.]     # ∂(model)/∂θ₁ and ∂(model)/∂θ₂
end
DM = DataModel(DS, model, dmodel)

The output of the Jacobian must be a matrix whose columns correspond to the partial derivatives with respect to different components of θ and whose rows correspond to evaluations at different components of x. Again, although it is not strictly required, outputting the Jacobian in form of a static matrix is typically beneficial for the overall performance.

It is also possible to specify a (logarithmized) prior distribution on the parameter space to the DataModel constructor after the initial guess for the MLE. For example:

using Distributions
Dist = MvNormal(ones(2), [1 0; 0 3.])
LogPriorFn(θ) = logpdf(Dist, θ)
DM = DataModel(DS, model, [1.0,2.5], LogPriorFn)

The DataSet contained in a DataModel named DM can be accessed via Data(DM), whereas the model and its Jacobian can be used via Predictor(DM) and dPredictor(DM) respectively. The MLE and the value of the log-likelihood at the MLE are accessible via MLE(DM) and LogLikeMLE(DM). The logarithmized prior can be accessed via LogPrior(DM).

InformationGeometry.DataSet — Type

The DataSet type is a versatile container for storing data. Typically, it is constructed by passing it three vectors x, y, sigma where the components of sigma quantify the standard deviation associated with each y-value. Alternatively, a full covariance matrix can be supplied for the ydata instead of a vector of standard deviations. The contents of a DataSet DS can later be accessed via xdata(DS), ydata(DS), ysigma(DS).

Examples:

In the simplest case, where all data points are mutually independent and have a single $x$-component and a single $y$-component each, a DataSet consisting of four points can be constructed via

DataSet([1,2,3,4], [4,5,6.5,7.8], [0.5,0.45,0.6,0.8])

or alternatively by

using LinearAlgebra
DataSet([1,2,3,4], [4,5,6.5,7.8], Diagonal([0.5,0.45,0.6,0.8].^2))

where the diagonal covariance matrix in the second line is equivalent to the vector of standard deviations supplied in the first line.

For measurements with multiple components, it is also possible to enter them as a Matrix where the columns correspond to the respective components.

DataSet([0, 0.5, 1], [1 100; 2 103; 3 108], [0.5 8; 0.4 5; 0.6 10])

Note that if the uncertainty matrix is square, it may be falsely interpreted as a covariance matrix instead of as the columnwise specification of standard deviations.

More generally, if a dataset consists of $N$ points where each $x$-value has $n$ many components and each $y$-value has $m$ many components, this can be specified to the DataSet constructor via a tuple $(N,n,m)$ in addition to the vectors x, y and the covariance matrix. For example:

X = [0.9, 1.0, 1.1, 1.9, 2.0, 2.1, 2.9, 3.0, 3.1, 3.9, 4.0, 4.1]
Y = [1.0, 5.0, 4.0, 8.0, 9.0, 13.0, 16.0, 20.0]
Cov = Diagonal([2.0, 4.0, 2.0, 4.0, 2.0, 4.0, 2.0, 4.0])
dims = (4, 3, 2)
DS = DataSet(X, Y, Cov, dims)

In this case, X is a vector consisting of the concatenated x-values (with 3 components each) for 4 different data points. The values of Y are the corresponding concatenated y-values (with 2 components each) of said 4 data points. Clearly, the covariance matrix must therefore be a positive-definite $(m \cdot N) \times (m \cdot N)$ matrix.

InformationGeometry.HyperCube — Type

The HyperCube type is used to specify a cuboid region in the form of a cartesian product of $N$ real intervals, thereby offering a convenient way of passing domains for integration or plotting between functions. A HyperCube object cube type has two fields: cube.L and cube.U which are two vectors which respectively store the lower and upper boundaries of the real intervals in order. Examples for constructing HyperCubes:

HyperCube([[1,3],[π,2π],[-500,100]])
HyperCube([1,π,-500],[3,2π,100])
HyperCube([[-1,1]])
HyperCube([-1,1])
HyperCube(collect([-7,7.] for i in 1:3))

Examples of quantities that can be computed from and operations involving a HyperCube object X:

CubeVol(X)
TranslateCube(X,v::Vector)
CubeWidths(X)

InformationGeometry.ModelMap — Type

Container for model functions which carries additional information, e.g. about the parameter domain on which it is valid.

InformationGeometry.Plane — Type

Plane(P::AbstractVector, Vx::AbstractVector, Vy::AbstractVector)

Specifies a 2D plane in the so-called parameter form using 3 vectors. Here the first argument P is a vector on the plane, the two vectors Vx and Vy are two other vectors, which span the plane and should ideally be orthogonal.

Base.in — Method

in(Cube::HyperCube, p::AbstractVector{<:Number}) -> Bool

Checks whether a point p lies inside Cube.

Base.intersect — Method

intersect(A::HyperCube, B::HyperCube) -> HyperCube
intersect(Cubes::Vector{<:HyperCube}) -> HyperCube

Returns new HyperCube which is the intersection of the given HyperCubes.

Base.union — Method

union(A::HyperCube, B::HyperCube) -> HyperCube
union(Cubes::Vector{<:HyperCube}) -> HyperCube

Returns new HyperCube which contains both given HyperCubes. That is, the returned cube is strictly speaking not the union, but a cover (which contains the union).

InformationGeometry.AIC — Function

AIC(DM::DataModel, θ::AbstractVector) -> Real

Calculates the Akaike Information Criterion given a parameter configuration $\theta$ defined by $\mathrm{AIC} = 2 \, \mathrm{length}(\theta) -2 \, \ell(\mathrm{data} \, | \, \theta)$. Lower values for the AIC indicate that the associated model function is more likely to be correct. For linearly parametrized models and small sample sizes, it is advisable to instead use the AICc which is more accurate.

InformationGeometry.AICc — Function

AICc(DM::DataModel, θ::AbstractVector) -> Real

Computes Akaike Information Criterion with an added correction term that prevents the AIC from selecting models with too many parameters (i.e. overfitting) in the case of small sample sizes. $\mathrm{AICc} = \mathrm{AIC} + \frac{2\mathrm{length}(\theta)^2 + 2 \mathrm{length}(\theta)}{N - \mathrm{length}(\theta) - 1}$ where $N$ is the number of data points. Whereas AIC constitutes a first order estimate of the information loss, the AICc constitutes a second order estimate. However, this particular correction term assumes that the model is linearly parametrized.

InformationGeometry.ApproxConfidenceBands — Function

ApproxConfidenceBands(DM::AbstractDataModel, ParameterCube::HyperCube, Xdomain=XCube(DM); N::Int=300, plot::Bool=true)

Computes confidence bands associated with the face centers of the ParameterCube. If the ParameterCube circumscribes a given confidence region, this will typically result in a gross and asymmetric overestimation of the true pointwise confidence bands associated with this confidence level.

InformationGeometry.ApproxConfidenceBands — Function

ApproxConfidenceBands(DM::AbstractDataModel, Confnum::Real, Xdomain=XCube(DM); N::Int=300, plot::Bool=true, add::Real=1.5)

InformationGeometry.ApproxInRegion — Method

ApproxInRegion(sol::ODESolution, p::AbstractVector{<:Number}) -> Bool

Blazingly fast approximative test whether a point lies within the polygon defined by the base points of a 2D ODESolution. Especially well-suited for hypothesis testing once a confidence boundary has been explicitly computed.

InformationGeometry.BIC — Function

BIC(DM::DataModel, θ::AbstractVector) -> Real

Calculates the Bayesian Information Criterion given a parameter configuration $\theta$ defined by $\mathrm{BIC} = \mathrm{ln}(N) \cdot \mathrm{length}(\theta) -2 \, \ell(\mathrm{data} \, | \, \theta)$ where $N$ is the number of data points.

InformationGeometry.BasisVector — Method

BasisVector(Slot::Int, dims::Int) -> Vector{Float64}

Computes a standard basis vector of length dims, i.e. whose components are all zero except for the component Slot, which has a value of one.

InformationGeometry.BlockMatrix — Method

BlockMatrix(A::AbstractMatrix, B::AbstractMatrix)

Constructs blockdiagonal matrix from A and B.

InformationGeometry.BlockMatrix — Method

BlockMatrix(M::AbstractMatrix, N::Int)

Returns matrix which contains N many blocks of the matrix M along its diagonal.

InformationGeometry.Center — Method

Center(Cube::HyperCube) |> Vector

Returns center of mass of Cube.

InformationGeometry.ChristoffelSymbol — Method

ChristoffelSymbol(DM::DataModel, θ::AbstractVector; BigCalc::Bool=false)
ChristoffelSymbol(Metric::Function, θ::AbstractVector; BigCalc::Bool=false)

Calculates the components of the $(1,2)$ Christoffel symbol $\Gamma$ at a point $\theta$ (i.e. the Christoffel symbol "of the second kind") through finite differencing of the Metric. Accurate to ≈ 3e-11. BigCalc=true increases accuracy through BigFloat calculation.

InformationGeometry.ComputeGeodesic — Method

ComputeGeodesic(DM::DataModel, InitialPos::Vector, InitialVel::Vector, Endtime::Number=50.;
                                Boundaries::Union{Function,Nothing}=nothing, tol::Real=1e-11, meth=Tsit5())

Constructs geodesic with given initial position and velocity. It is possible to specify a boolean-valued function Boundaries(u,t,int), which terminates the integration process when it returns true.

InformationGeometry.ConcatenateModels — Method

Only works for DataSet and DataSetExact but will output wrong order of components for CompositeDataSet!

InformationGeometry.ConfAlpha — Method

ConfAlpha(n::Real)

Probability volume outside of a confidence interval of level n⋅σ where σ is the standard deviation of a normal distribution.

InformationGeometry.ConfVol — Method

ConfVol(n::Real)

Probability volume contained in a confidence interval of level n⋅σ where σ is the standard deviation of a normal distribution.

InformationGeometry.ConfidenceBandWidth — Method

Computes width of confidence bands.

InformationGeometry.ConfidenceBands — Function

ConfidenceBands(DM::DataModel, sol::AbstractODESolution, Xdomain::HyperCube; N::Int=300, plot::Bool=true) -> Matrix

Given a confidence interval sol, the pointwise confidence band around the model prediction is computed for x values in Xdomain by evaluating the model on the boundary of the confidence region.

InformationGeometry.ConfidenceInterval1D — Function

ConfidenceInterval1D(DM::AbstractDataModel, Confnum::Real=1.; tol::Real=1e-14) -> Tuple{Number,Number}

Returns the confidence interval associated with confidence level Confnum in the case of one-dimensional parameter spaces.

InformationGeometry.ConfidenceRegion — Function

ConfidenceRegion(DM::AbstractDataModel, Confnum::Real=1.; tol::Real=1e-9, meth=Tsit5(), mfd::Bool=false, Auto::Val=Val(false), parallel::Bool=false, Dirs::Tuple{Int,Int,Int}=(1,2,3), N::Int=30)

Computes confidence region of level Confnum. For pdim(DM) > 2, the confidence region is intersected by a family of Planes in the directions specified by the keyword Dirs. The Planes and their embedded 2D confidence boundaries are returned as the respective first and second arguments in this case.

InformationGeometry.ConfidenceRegionVolume — Method

ConfidenceRegionVolume(DM::AbstractDataModel, Confnum::Real; N::Int=Int(1e5), WE::Bool=true, Approx::Bool=false, kwargs...) -> Real

Computes coordinate-invariant volume of confidence region associated with level Confnum via Monte Carlo by integrating the geometric density factor. For likelihoods which are particularly expensive to evaluate, Approx=true can improve the performance by approximating the confidence region via polygons.

InformationGeometry.ConfidenceRegions — Function

ConfidenceRegions(DM::DataModel, Range::AbstractVector)

Computes the boundaries of confidence regions for two-dimensional parameter spaces given a vector or range of confidence levels. A convenient interface which extends this to higher dimensions is currently still under development.

For example,

ConfidenceRegions(DM, 1:3; tol=1e-9)

computes the $1\sigma$, $2\sigma$ and $3\sigma$ confidence regions associated with a given DataModel using a solver tolerance of $10^{-9}$.

Keyword arguments:

IsConfVol = true can be used to specify the desired confidence level directly in terms of a probability $p \in [0,1]$ instead of in units of standard deviations $\sigma$,
tol can be used to quantify the tolerance with which the ODE which defines the confidence boundary is solved (default tol = 1e-12),
meth can be used to specify the solver algorithm (default meth = Tsit5()),
Auto = Val(true) can be chosen to compute the derivatives of the likelihood using automatic differentiation,
parallel = true parallelizes the computations of the separate confidence regions provided each process has access to the necessary objects,
dof can be used to manually specify the degrees of freedom.

InformationGeometry.ConstructCube — Method

ConstructCube(M::Matrix{<:Number}; Padding::Number=1/50) -> HyperCube

Returns a HyperCube which encloses the extrema of the columns of the input matrix.

InformationGeometry.CoordinateDistortion — Function

CoordinateDistortion(DM::AbstractDataModel, Confnum::Real=1) -> Real

For CoordinateDistortions ≪ 1, the model predictions are extremely sensitive with respect to the parameters. For CoordinateDistortion ⪎ 1, the model is comparatively insensitive towards the parameters.

This quantity is computed from the ratio of the coordinate-dependent apparent volume of a confidence region compared with the coordinate-invariant volume, which is obtained from integrating over the appropriate volume form / geometric density factor. The unit of this quantity is $[L^n]$ where $L$ is the unit of length of each of the components.

InformationGeometry.CoordinateVolume — Method

CoordinateVolume(DM::AbstractDataModel, Confnum::Real; N::Int=Int(1e5), WE::Bool=true, Approx::Bool=false, kwargs...) -> Real

Computes coordinate-dependent apparent volume of confidence region associated with level Confnum via Monte Carlo integration. For likelihoods which are particularly expensive to evaluate, Approx=true can improve the performance by approximating the confidence region via polygons.

InformationGeometry.CreateMesh — Method

CreateMesh(Planes::Vector{<:Plane}, Sols::Vector{<:AbstractODESolution}; N::Int=2*length(Sols), rectangular::Bool=true) -> (Matrix{Float64}, Matrix{Int64})

Returns a N×3 matrix whose rows correspond to the coordinates of various points in 3D space as the first argument. The second Matrix is either N×4 or N×3 depending on the value of rectangular and enumerates the points which are to be connected up to a rectangular or triangular face in counter-clockwise fashion. The indices of the points correspond to the lines in the first Matrix.

InformationGeometry.CubeVol — Method

CubeVol(Cube::HyperCube) -> Real

Computes volume of a HyperCube as the product of its sidelengths.

InformationGeometry.CubeWidths — Method

CubeWidths(H::HyperCube) -> Vector

Returns vector of widths of the HyperCube.

InformationGeometry.Deplanarize — Method

Deplanarize(PL::Plane,sol::AbstractODESolution; N::Int=500) -> Matrix
Deplanarize(PL::Plane,sol::AbstractODESolution, Ts::AbstractVector{<:Number}) -> Matrix

Converts the 2D outputs of sol from planar coordinates associated with PL to the coordinates of the ambient space of PL.

InformationGeometry.DetermineDmodel — Function

DetermineDmodel(DS::AbstractDataSet, model::Function)::Function

Returns appropriate function which constitutes the automatic derivative of the model(x,θ) with respect to the parameters θ depending on the format of the x-values and y-values of the DataSet.

InformationGeometry.DistanceAlongGeodesic — Method

DistanceAlongGeodesic(Metric::Function,sol::AbstractODESolution,L::Number; tol=1e-14)

Calculates at which parameter value of the geodesic sol the length L is reached.

InformationGeometry.EmbedModelVia — Method

EmbedModelVia(model, F::Function; Domain::HyperCube=FullDomain(GetArgLength(F))) -> Union{Function,ModelMap}

Transforms a model function via newmodel(x, θ) = oldmodel(x, F(θ)). A Domain for the new model can optionally be specified for ModelMaps.

InformationGeometry.Embedding — Function

Embedding(DM::AbstractDataModel, F::Function, start::Vector; Domain::HyperCube=FullDomain(length(start))) -> DataModel

Transforms a model function via newmodel(x, θ) = oldmodel(x, F(θ)) and returns the associated DataModel. An initial parameter configuration start as well as a Domain can optionally be passed to the DataModel constructor.

InformationGeometry.EmbeddingMap — Function

EmbeddingMap(DM::AbstractDataModel, θ::AbstractVector{<:Number}) -> Vector

Returns a vector of the collective predictions of the model as evaluated at the x-values and the parameter configuration $\theta$.

h(\theta) \coloneqq \big(y_\mathrm{model}(x_1;\theta),...,y_\mathrm{model}(x_N;\theta)\big) \in \mathcal{D}

InformationGeometry.EmbeddingMatrix — Function

EmbeddingMatrix(DM::AbstractDataModel, θ::AbstractVector{<:Number}) -> Matrix

Returns the jacobian of the embedding map as evaluated at the x-values and the parameter configuration $\theta$.

InformationGeometry.EvaluateEach — Method

EvaluateEach(geos::Vector{<:AbstractODESolution}, Ts::Vector) -> Vector

Evalues a family geos of geodesics on a set of parameters Ts. geos[1] is evaluated at Ts[1], geos[2] is evaluated at Ts[2] and so on. The second half of the values respresenting the velocities is automatically truncated.

InformationGeometry.ExponentialMap — Method

ExponentialMap(Metric::Function, point::AbstractVector{<:Number}, tangent::AbstractVector{<:Number}; tol::Real=1e-9)

Computes the differential-geometric exponential map $\mathrm{exp}_p(v)$ which returns the endpoint that is reached by a geodesic $\gamma:[0,1] \longrightarrow \mathcal{M}$ with initial direction $v \in T_p \mathcal{M}$.

InformationGeometry.FaceCenters — Method

FaceCenters(Cube::HyperCube) -> Vector{Vector}

Returns a Vector of the 2n-many face centers of a n-dimensional Cube.

InformationGeometry.FindConfBoundaryOnPlane — Function

FindConfBoundaryOnPlane(DM::AbstractDataModel,PL::Plane,Confnum::Real=1.; tol::Real=1e-8) -> Union{Vector{Number},Bool}

Computes point inside the plane PL which lies on the boundary of a confidence region of level Confnum. If such a point cannot be found (i.e. does not seem to exist), the method returns false.

InformationGeometry.FisherMetric — Function

FisherMetric(DM::DataModel, θ::AbstractVector{<:Number})

Computes the Fisher metric $g$ given a DataModel and a parameter configuration $\theta$ under the assumption that the likelihood $L(\mathrm{data} \, | \, \theta)$ is a multivariate normal distribution.

\[g_{ab}(\theta) \coloneqq -\int_{\mathcal{D}} \mathrm{d}^m y_{\mathrm{data}} \, L(y_{\mathrm{data}} \,|\, \theta) \, \frac{\partial^2 \, \mathrm{ln}(L)}{\partial \theta^a \, \partial \theta^b} = -\mathbb{E} \bigg( \frac{\partial^2 \, \mathrm{ln}(L)}{\partial \theta^a \, \partial \theta^b} \bigg)\]

InformationGeometry.GenerateBoundary — Method

GenerateBoundary(DM::DataModel, u0::AbstractVector{<:Number}; tol::Real=1e-14, meth=Tsit5(), mfd::Bool=true) -> ODESolution

Basic method for constructing a curve lying on the confidence region associated with the initial configuration u0.

InformationGeometry.GenerateInterruptedBoundary — Method

GenerateInterruptedBoundary(DM::AbstractDataModel, u0::AbstractVector{<:Number}; Boundaries::Union{Function,Nothing}=nothing, tol::Real=1e-12,
                            redo::Bool=true, meth::OrdinaryDiffEqAlgorithm=Tsit5(), mfd::Bool=true, Auto::Val=Val(false), kwargs...) -> ODESolution

Integrates along the level lines of the log-likelihood in the counter-clockwise direction until the model becomes either

structurally identifiable via det(g) < tol
the given Boundaries(u,t,int) method evaluates to true.

It then integrates from where this obstruction was met in the clockwise direction until said obstruction is hit again, resulting in a half-open confidence region.

InformationGeometry.GeodesicBetween — Method

GeodesicBetween(DM::DataModel, P::AbstractVector{<:Number}, Q::AbstractVector{<:Number}; tol::Real=1e-10, meth=Tsit5())
GeodesicBetween(Metric::Function, P::AbstractVector{<:Number}, Q::AbstractVector{<:Number}; tol::Real=1e-10, meth=Tsit5())

Computes a geodesic between two given points on the parameter manifold and an expression for the metric.

InformationGeometry.GeodesicCrossing — Function

GeodesicCrossing(DM::DataModel, sol::AbstractODESolution, Conf::Real=ConfVol(1); tol=1e-15)

Gives the parameter value of the geodesic sol at which the confidence level Conf is crossed.

InformationGeometry.GeodesicDistance — Method

GeodesicDistance(DM::DataModel, P::AbstractVector{<:Number}, Q::AbstractVector{<:Number}; tol::Real=1e-10)
GeodesicDistance(Metric::Function, P::AbstractVector{<:Number}, Q::AbstractVector{<:Number}; tol::Real=1e-10)

Computes the length of a geodesic connecting the points P and Q.

InformationGeometry.GeodesicLength — Method

GeodesicLength(DM::DataModel,sol::AbstractODESolution, Endrange::Number=sol.t[end]; FullSol::Bool=false, tol=1e-14)
GeodesicLength(Metric::Function,sol::AbstractODESolution, Endrange::Number=sol.t[end]; FullSol::Bool=false, tol=1e-14)

Calculates the length of a geodesic sol using the Metric up to parameter value Endrange.

\[L[\gamma] \coloneqq \int_a^b \mathrm{d} t \, \sqrt{g_{\gamma(t)} \big(\dot{\gamma}(t), \dot{\gamma}(t)\big)}\]

InformationGeometry.GeometricDensity — Method

GeometricDensity(DM::AbstractDataModel, θ::AbstractVector) -> Real

Computes the square root of the determinant of the Fisher metric $\sqrt{\mathrm{det}\big(g(\theta)\big)}$ at the point $\theta$.

InformationGeometry.GetArgSize — Method

GetArgSize(model::ModelOrFunction; max::Int=100)::Tuple{Int,Int}

Returns tuple (xdim,pdim) associated with the method model(x,p).

InformationGeometry.GetDomainTuple — Method

Naively computes approximate 1D domain from inverse Fisher metric at MLE.

InformationGeometry.GetModel — Method

GetModel(func::ODEFunction, u0::AbstractArray, ObservationFunction::Function; tol::Real=1e-7, meth::OrdinaryDiffEqAlgorithm=Tsit5(), Domain::Union{HyperCube,Nothing}=nothing, inplace::Bool=true)

Returns a ModelMap which evolves the given system of ODEs from the initial configuration u0 and afterwards applies the ObservationFunction to produce its predictions.

A Domain can be supplied to constrain the parameters of the model to particular ranges which can be helpful in the fitting process.

InformationGeometry.GetModel — Method

GetModel(func::AbstractODEFunction{T}, SplitterFunction::Function, PreObservationFunction::Function; tol::Real=1e-7, meth::OrdinaryDiffEqAlgorithm=Tsit5(), Domain::Union{HyperCube,Nothing}=nothing, inplace::Bool=true)

Returns a ModelMap which evolves the given system of ODEs and afterwards applies the ObservationFunction to produce its predictions. Here, the initial conditions for the ODEs are produced from the parameters θ using the SplitterFunction which for instance allows one to estimate them from data.

SplitterFunction should be of the form F(θ) -> (u0, p), i.e. the output is a Tuple whose first entry is the initial condition for the ODE model and the second entry constitutes the parameters which go on to enter the ODEFunction. Typically, a fair bit of additional performance can be gained from ensuring that SplitterFunction outputs the initial condition u0 as type MVector or MArray, if it has less than ~100 components.

ObservationFunction should either be of the form F(u) -> Vector or F(u,t) -> Vector or F(u,t,θ) -> Vector. Internally, the ObservationFunction is automatically wrapped as F(u,t,θ) if it is not already defined to accept three arguments. NOTE: The vector θ passed to ObservationFunction is the same θ that is passed to SplitterFunction, i.e. before splitting. This is because ObservationFunction might also depend on the initial conditions in general.

A Domain can be supplied to constrain the parameters of the model to particular ranges which can be helpful in the fitting process.

InformationGeometry.GetModel — Method

GetModel(func::ODEFunction, SplitterFunction::Function, observables::Union{AbstractVector{<:Int},BoolArray}=collect(1:length(u0)); tol::Real=1e-7, meth::OrdinaryDiffEqAlgorithm=Tsit5(), Domain::Union{HyperCube,Nothing}=nothing, inplace::Bool=true)

Returns a ModelMap which evolves the given system of ODEs and returns u[observables] to produce its predictions. Here, the initial conditions for the ODEs are produced from the parameters θ using the SplitterFunction which for instance allows one to estimate them from data.

SplitterFunction should be of the form F(θ) -> (u0, p), i.e. the output is a Tuple whose first entry is the initial condition for the ODE model and the second entry constitutes the parameters which go on to enter the ODEFunction. Typically, a fair bit of performance can be gained from ensuring that SplitterFunction outputs the initial condition u0 as type MVector or MArray, if it has less than ~100 components.

A Domain can be supplied to constrain the parameters of the model to particular ranges which can be helpful in the fitting process.

InformationGeometry.GetProfile — Method

GetProfile(DM::AbstractDataModel, Comp::Int, dom::Tuple{<:Real, <:Real}; N::Int=50) -> N×2 Matrix

Computes profile likelihood associated with the component Comp of the parameters over the domain dom.

InformationGeometry.HyperPlane — Method

HyperPlane(basepoint, dir1, ...) -> Function

Returns an embedding function which translates points from HyperPlane coordinates to the ambient space.

InformationGeometry.InformNames — Method

InformNames(DS::AbstractDataSet, sys::ODESystem, observables::Vector{<:Int})

Copy the state names saved in ODESystem to DS.

InformationGeometry.Inside — Method

Inside(Cube::HyperCube, p::AbstractVector{<:Number}) -> Bool

Checks whether a point p lies inside Cube.

InformationGeometry.Integrate1D — Method

Integrate1D(F::Function, Cube::HyperCube; tol::Real=1e-14, FullSol::Bool=false, meth=nothing)

Integrates F over a one-dimensional domain specified via a HyperCube by rephrasing the integral as an ODE and using DifferentialEquations.jl.

InformationGeometry.IntegrateND — Method

IntegrateND(F::Function,Cube::HyperCube; tol::Real=1e-12, WE::Bool=false, kwargs...)

Integrates the function F over Cube with the help of HCubature.jl to a tolerance of tol. If WE=true, the result is returned as a Measurement which also contains the estimated error in the result.

InformationGeometry.IntegrateOverApproxConfidenceRegion — Method

IntegrateOverApproxConfidenceRegion(DM::AbstractDataModel, Domain::HyperCube, sol::AbstractODESolution, F::Function; N::Int=Int(1e5), WE::Bool=true, kwargs...)
IntegrateOverApproxConfidenceRegion(DM::AbstractDataModel, Domain::HyperCube, Planes::Vector{<:Plane}, sols::Vector{<:AbstractODESolution}, F::Function; N::Int=Int(1e5), WE::Bool=true, kwargs...)

Integrates a function F over the intersection of Domain and the polygon defined by sol.

InformationGeometry.IntegrateOverConfidenceRegion — Method

IntegrateOverConfidenceRegion(DM::AbstractDataModel, Domain::HyperCube, Confnum::Real, F::Function; N::Int=Int(1e5), WE::Bool=true, kwargs...)

Integrates a function F over the intersection of Domain and the confidence region of level Confnum.

InformationGeometry.InterpolatedProfiles — Method

InterpolatedProfiles(M::AbstractVector{<:AbstractMatrix}) -> Vector{Function}

Interpolates the Vector{Matrix} output of ProfileLikelihood() with cubic splines.

InformationGeometry.InterruptedConfidenceRegion — Method

InterruptedConfidenceRegion(DM::AbstractDataModel, Confnum::Real; Boundaries::Union{Function,Nothing}=nothing, tol::Real=1e-12,
                            redo::Bool=true, meth::OrdinaryDiffEqAlgorithm=Tsit5(), mfd::Bool=true, Auto::Val=Val(false), kwargs...) -> ODESolution

Integrates along the level lines of the log-likelihood in the counter-clockwise direction until the model becomes either

structurally identifiable via det(g) < tol
the given Boundaries(u,t,int) method evaluates to true.

It then integrates from where this obstruction was met in the clockwise direction until said obstruction is hit again, resulting in a half-open confidence region.

InformationGeometry.IntersectCube — Function

IntersectCube(DM::AbstractDataModel,Cube::HyperCube,Confnum::Real=1.; Dirs::Tuple{Int,Int,Int}=(1,2,3), N::Int=31) -> Vector{Plane}

Returns a set of parallel 2D planes which intersect Cube. The planes span the directions corresponding to the basis vectors corresponding to the first two components of Dirs. They are separated in the direction of the basis vector associated with the third component of Dirs. The keyword N can be used to approximately control the number of planes which are returned. This depends on whether more (or fewer) planes than N are necessary to cover the whole confidence region of level Confnum.

InformationGeometry.IntersectRegion — Function

IntersectRegion(DM::AbstractDataModel,PL::Plane,v::Vector{<:Number},Confnum::Real=1.; N::Int=31) -> Vector{Plane}

Translates family of N planes which are translated approximately from -v to +v and intersect the confidence region of level Confnum. If necessary, planes are removed or more planes added such that the maximal family of planes is found.

InformationGeometry.IsLinear — Method

IsLinear(DM::DataModel) -> Bool

Checks whether the model(x,θ) function is linear with respect to all of its parameters $\theta \in \mathcal{M}$. A componentwise check can be attained via the method IsLinearParameter(DM).

InformationGeometry.IsLinearParameter — Method

IsLinearParameter(DM::DataModel) -> BitVector

Checks with respect to which parameters the model function model(x,θ) is linear and returns vector of booleans where true indicates linearity. This test is performed by comparing the Jacobians of the model for two random configurations $\theta_1, \theta_2 \in \mathcal{M}$ column by column.

InformationGeometry.KullbackLeibler — Method

KullbackLeibler(p::Function,q::Function,Domain::HyperCube=HyperCube([-15,15]); tol=2e-15, N::Int=Int(3e7), Carlo::Bool=(length(Domain)!=1))

Computes the Kullback-Leibler divergence between two probability distributions p and q over the Domain. If Carlo=true, this is done using a Monte Carlo Simulation with N samples. If the Domain is one-dimensional, the calculation is performed without Monte Carlo to a tolerance of ≈ tol.

\[D_{\text{KL}}[p,q] \coloneqq \int \mathrm{d}^m y \, p(y) \, \mathrm{ln} \bigg( \frac{p(y)}{q(y)} \bigg)\]

InformationGeometry.LeastInformativeDirection — Function

LeastInformativeDirection(DM::DataModel,θ::AbstractVector{<:Number}=MLE(DM)) -> Vector{Float64}

Returns a vector which points in the direction in which the likelihood decreases most slowly.

InformationGeometry.LeftOfLine — Method

LeftOfLine(q₁::AbstractVector, q₂::AbstractVector, p::AbstractVector) -> Bool

Checks if point p is left of the line from q₁ to q₂ via det([q₁-p q₂-p]) for 2D points.

InformationGeometry.LineSearch — Function

LineSearch(Test::Function, start::Number=0.; tol::Real=8e-15, maxiter::Int=10000) -> Number

Finds real number x where the boolean-valued Test(x::Number) goes from true to false.

InformationGeometry.LinearCuboid — Function

LinearCuboid(DM::AbstractDataModel, Confnum::Real=1.; Padding::Number=1/30, N::Int=200) -> HyperCube

Returns HyperCube which bounds the linearized confidence region of level Confnum for a DataModel.

InformationGeometry.LogLikeMLE — Method

LogLikeMLE(DM::DataModel) -> Real

Returns the value of the log-likelihood $\ell$ when evaluated at the maximum likelihood estimate, i.e. $\ell(\mathrm{data} \, | \, \theta_\text{MLE})$. For performance reasons, this value is stored as a part of the DataModel type.

InformationGeometry.LogarithmicMap — Method

LogarithmicMap(Metric::Function, P::AbstractVector{<:Number}, Q::AbstractVector{<:Number}; tol::Real=1e-9)

Computes the inverse of the differential-geometric exponential map, i.e. $\mathrm{ln}_p(q) \equiv (\mathrm{exp}^{-1})_p(q)$ which returns a (possibly non-unique!) initial direction $v \in T_p \mathcal{M}$ for a geodesic $\gamma:[0,1] \longrightarrow \mathcal{M}$ that goes from $p \in \mathcal{M}$ to $q \in \mathcal{M}$.

InformationGeometry.MBAMBoundaries — Method

Return true when integration of ODE should be terminated.

InformationGeometry.MLE — Method

MLE(DM::DataModel) -> Vector

Returns the parameter configuration $\theta_\text{MLE} \in \mathcal{M}$ which is estimated to have the highest likelihood of producing the observed data (under the assumption that the specified model captures the true relationship present in the data). For performance reasons, the maximum likelihood estimate is stored as a part of the DataModel type.

InformationGeometry.MaximalNumberOfArguments — Method

MaximalNumberOfArguments(F::Function) -> Int

Infers argument structure of given function, i.e. whether it is of the form F(x) or F(x,y) or F(x,y,z) etc. and returns maximal number of accepted arguments of all overloads of F as integer.

InformationGeometry.MincedBoundaries — Function

MincedBoundaries(DM::AbstractDataModel, Planes::Vector{<:Plane}, Confnum::Real=1.; tol::Real=1e-9, Auto::Val=Val(false), meth=Tsit5(), mfd::Bool=false)

Intersects the confidence boundary of level Confnum with Planes and computes ODESolutions which parametrize this intersection.

InformationGeometry.MinimizeOnPlane — Function

MinimizeOnPlane(PL::Plane,F::Function,initial::AbstractVector=[1,-1.]; tol::Real=1e-5)

Minimizes given function in Plane and returns the optimal point in the ambient space in which the plane lies.

InformationGeometry.ModelComparison — Method

ModelComparison(DM1::AbstractDataModel, DM2::AbstractDataModel) -> Tuple{Int,Real}

Compares the AICc values of both models at best fit and estimates probability that one model is more likely than the other. First entry of tuple returns which model is more likely to be correct (1 or 2) whereas the second entry returns the ratio of probabilities.

InformationGeometry.OrthVF — Function

OrthVF(DM::DataModel, PL::Plane, θ::Vector{<:Number}; Auto::Val=Val(false)) -> Vector

Calculates a direction (in parameter space) in which the value of the log-likelihood does not change, given a parameter configuration $\theta$. Since a 2D Plane is specified, both the input θ as well as well as the output have 2 components. Auto=Val(true) uses automatic differentiation to calculate the score.

InformationGeometry.OrthVF — Method

OrthVF(DM::DataModel, θ::AbstractVector{<:Number}; Auto::Val=Val(false)) -> Vector

Calculates a direction (in parameter space) in which the value of the log-likelihood does not change, given a parameter configuration $\theta$. Auto=Val(true) uses automatic differentiation to calculate the score.

InformationGeometry.ParallelPlanes — Method

ParallelPlanes(PL::Plane, v::AbstractVector, range) -> Vector{Plane}

Returns Vector of Planes which have been translated by a .* v for all a in range.

InformationGeometry.PlaneCoordinates — Method

PlaneCoordinates(PL::Plane, v::AbstractVector{<:Number})

Returns an n-dimensional vector from a tuple of two real numbers which correspond to the coordinates in the 2D Plane.

InformationGeometry.PlotMatrix — Function

PlotMatrix(Mat::Matrix, MLE::Vector; N::Int=400)

Plots ellipse corresponding to a given covariance matrix which may additionally be offset by a vector MLE.

Example:

PlotMatrix(inv(FisherMetric(DM,MLE)),MLE)

InformationGeometry.PracticallyIdentifiable — Function

PracticallyIdentifiable(DM::AbstractDataModel, Confnum::Real=1; plot::Bool=true, kwargs...) -> Real

Determines the maximum confidence level (in units of standard deviations σ) at which the given DataModel is still practically identifiable.

InformationGeometry.PredictionEnsemble — Function

PredictionEnsemble(DM::AbstractDataModel, pDomain::HyperCube, Xs::AbstractVector{<:Number}=DomainSamples(XCube(DM),300); N::Int=50, uniform::Bool=true, MaxConfnum::Real=3, plot::Bool=true, kwargs...)

Plots N model predictions which are randomly chosen from a confidence region of level MaxConfnum.

InformationGeometry.ProfileBox — Function

ProfileBox(DM::AbstractDataModel, Fs::AbstractVector{<:DataInterpolations.AbstractInterpolation}, Confnum::Real=1.) -> HyperCube

Constructs HyperCube which bounds the confidence region associated with the confidence level Confnum from the interpolated likelihood profiles.

InformationGeometry.ProfileLikelihood — Function

ProfileLikelihood(DM::AbstractDataModel, Confnum::Real=2; N::Int=50, ForcePositive::Bool=false, plot::Bool=true, parallel::Bool=false) -> Vector{Matrix}

Computes the profile likelihood for each component of the parameters $θ \in \mathcal{M}$ over the given Domain. Returns a vector of N×2 matrices where the first column of the n-th matrix specifies the value of the n-th component and the second column specifies the associated confidence level of the best fit configuration conditional to the n-th component being fixed at the associated value in the first column.

The domain over which the profile likelihood is computed is not (yet) adaptively chosen. Instead the size of the domain is estimated from the inverse Fisher metric. Therefore, often has to pass higher value for Confnum to this method than the confidence level one is actually interested in, to ensure that it is still covered (if the model is even practically identifiable in the first place).

InformationGeometry.ProjectOnto — Method

ProjectOnto(v::Vector, u::Vector)

Project v onto u.

InformationGeometry.Pullback — Method

Pullback(DM::DataModel, G::AbstractArray{<:Number,2}, θ::Vector) -> Matrix

Pull-back of a (0,2)-tensor G to the parameter manifold.

InformationGeometry.Pullback — Method

Pullback(DM::AbstractDataModel, ω::AbstractVector{<:Number}, θ::Vector) -> Vector

Pull-back of a covector to the parameter manifold $T*\mathcal{M} \longleftarrow T*\mathcal{D}$.

InformationGeometry.Pushforward — Method

Pushforward(DM::DataModel, X::AbstractVector, θ::AbstractVector) -> Vector

Calculates the push-forward of a vector X from the parameter manifold to the data space $T\mathcal{M} \longrightarrow T\mathcal{D}$.

InformationGeometry.RectToTriangFacets — Method

Turns Array which specifies trapezoidal faces into triangular connections.

InformationGeometry.RectangularFacetIndices — Function

Returns Array of vertex numbers where every row constitutes a trapezoid for two adjacent curves from which N samples have been drawn.

InformationGeometry.ResizeCube — Function

ResizeCube(Cube::HyperCube, factor::Real=1.) -> HyperCube

Resizes given Cube evenly in all directions but keeps center of mass fixed.

InformationGeometry.Ricci — Method

Ricci(DM::DataModel, θ::AbstractVector; BigCalc::Bool=false)
Ricci(Metric::Function, θ::AbstractVector; BigCalc::Bool=false)

Calculates the components of the $(0,2)$ Ricci tensor by finite differencing of the Metric. BigCalc=true increases accuracy through BigFloat calculation.

InformationGeometry.RicciScalar — Method

RicciScalar(DM::DataModel, θ::AbstractVector; BigCalc::Bool=false) -> Real
RicciScalar(Metric::Function, θ::AbstractVector; BigCalc::Bool=false) -<> Real

Calculates the Ricci scalar by finite differencing of the Metric. BigCalc=true increases accuracy through BigFloat calculation.

InformationGeometry.Riemann — Method

Riemann(DM::DataModel, θ::AbstractVector; BigCalc::Bool=false)
Riemann(Metric::Function, θ::AbstractVector; BigCalc::Bool=false)

Calculates the components of the $(1,3)$ Riemann tensor by finite differencing of the Metric. BigCalc=true increases accuracy through BigFloat calculation.

InformationGeometry.RobustFit — Method

RobustFit(DM::AbstractDataModel, start::Vector{<:Number}; tol::Real=1e-10, p::Real=1, kwargs...)

Uses p-Norm to judge distance on Dataspace as specified by the keyword.

InformationGeometry.Rsquared — Method

Rsquared(DM::DataModel) -> Real

Calculates the R² value associated with the maximum likelihood estimate of a DataModel. It should be noted that the R² value is only a valid measure for the goodness of a fit for linear relationships.

InformationGeometry.SaveConfidence — Function

SaveConfidence(sols::Vector{<:AbstractODESolution}, N::Int=500; sigdigits::Int=7, adaptive::Bool=true) -> Matrix
SaveConfidence(Planes::Vector{<:Plane}, sols::Vector{<:AbstractODESolution}, N::Int=500; sigdigits::Int=7, adaptive::Bool=true) -> Matrix

Returns a Matrix of with N rows corresponding to the number of evaluations of each ODESolution in sols. The colums correspond to the various components of the evaluated solutions. E.g. for an ODESolution with 3 components, the 4. column in the Matrix corresponds to the evaluated first components of sols[2].

InformationGeometry.SaveDataSet — Method

SaveDataSet(DS::DataSet; sigdigits::Int=0)

Returns a DataFrame whose columns respectively constitute the x-values, y-values and standard distributions associated with the data points. For sigdigits > 0 the values are rounded to the specified number of significant digits.

InformationGeometry.SaveGeodesics — Function

SaveGeodesics(sols::Vector{<:AbstractODESolution}, N::Int=500; sigdigits::Int=7, adaptive::Bool=true) -> Matrix

Returns a Matrix of with N rows corresponding to the number of evaluations of each ODESolution in sols. The colums correspond to the various components of the evaluated solutions. Since the solution objects for geodesics contain the velocity as the second half of the components, only the first half of the components is saved.

InformationGeometry.Score — Function

Score(DM::DataModel, θ::AbstractVector{<:Number}; Auto::Val=Val(false))

Calculates the gradient of the log-likelihood $\ell$ with respect to a set of parameters $\theta$. Auto=Val(true) uses automatic differentiation.

InformationGeometry.TotalLeastSquares — Function

TotalLeastSquares(DSE::DataSetExact, model::ModelOrFunction, initial::AbstractVector{<:Number}; tol::Real=1e-13, kwargs...) -> Vector

Experimental feature which takes into account uncertainties in x-values to improve the accuracy of the fit. Returns concatenated vector of x-values and parameters. Assumes that the uncertainties in the x-values and y-values are normal, i.e. Gaussian!

InformationGeometry.Transform — Function

Transform(DM::AbstractDataModel, F::Function, idxs=trues(pdim(DM))) -> DataModel
Transform(model::Function, idxs, F::Function) -> Function

Transforms the parameters of the model by the given scalar function F such that newmodel(x, θ) = oldmodel(x, F.(θ)). By providing idxs, one may restrict the application of the function F to specific parameter components.

InformationGeometry.TranslateCube — Method

TranslateCube(Cube::HyperCube,x::Vector{<:Number}) -> HyperCube

Returns a HyperCube object which has been translated by x.

InformationGeometry.Unpack — Method

Unpack(Z::Vector{S}) where S <: Union{Vector,Tuple} -> Matrix

Converts vector of vectors to a matrix whose n-th column corresponds to the n-th component of the inner vectors.

InformationGeometry.VFRescale — Method

VFRescale(ZeilenVecs::Array{<:Number,2},C::HyperCube;scaling=0.85)

Rescale vector to look good in 2D plot.

InformationGeometry.VisualizeSols — Method

VisualizeSols(sols::Vector{<:AbstractODESolution}; OverWrite::Bool=true)

Visualizes vectors of type ODESolution using the Plots.jl package. If OverWrite=false, the solution is displayed on top of the previous plot object.

InformationGeometry.Weyl — Method

(0,4) Weyl curvature tensor. NEEDS TESTING.

InformationGeometry.WilksCriterion — Function

Point θ lies outside confidence region of level Confvol if this function > 0.

InformationGeometry.WilksTest — Function

WilksTest(DM::DataModel, θ::AbstractVector{<:Number}, Confvol=ConfVol(1)) -> Bool

Checks whether a given parameter configuration θ is within a confidence interval of level Confvol using Wilks' theorem. This makes the assumption, that the likelihood has the form of a normal distribution, which is asymptotically correct in the limit that the number of datapoints is infinite. The keyword dof can be used to manually specify the degrees of freedom.

InformationGeometry.invert — Function

invert(F::Function, x::Number; tol::Real=GetH(x)) -> Real

Finds $z$ such that $F(z) = x$ to a tolerance of tol for continuous $F$ using Roots.jl. Ideally, F should be monotone and there should only be one correct result.

InformationGeometry.isinside — Method

isinside(p, pol; allowonedge=false) -> Bool

Is a point p inside a polygon defined by a counterclockwise list of points.

InformationGeometry.likelihood — Method

likelihood(DM::DataModel, θ::AbstractVector) -> Real

Calculates the likelihood $L(\mathrm{data} \, | \, \theta)$ a DataModel and a parameter configuration $\theta$.

InformationGeometry.minimize — Function

minimize(F::Function, start::Vector{<:Number}; tol::Real=1e-10, meth=NelderMead(), Full::Bool=false, timeout::Real=200, kwargs...) -> Vector

Minimizes the scalar input function using the given start using algorithms from Optim.jl specified via the keyword meth. Full=true returns the full solution object instead of only the minimizing result. Optionally, the search domain can be bounded by passing a suitable HyperCube object as the third argument.

InformationGeometry.pdim — Method

pdim(DS::AbstractDataSet, model::ModelOrFunction) -> Int

Infers the (minimal) number of components that the given function F accepts as input by successively testing it on vectors of increasing length.

InformationGeometry.suff — Method

suff(x) -> Type

If x stores BigFloats, suff returns BigFloat, else suff returns Float64.

StatsBase.loglikelihood — Function

loglikelihood(DM::DataModel, θ::AbstractVector) -> Real

Calculates the logarithm of the likelihood $L$, i.e. $\ell(\mathrm{data} \, | \, \theta) \coloneqq \mathrm{ln} \big( L(\mathrm{data} \, | \, \theta) \big)$ given a DataModel and a parameter configuration $\theta$.