CausalityToolsBase.CausalityEstimatorType

CausalityEstimator

An abstract type that is the supertype of all causality estimators in the CausalityTools ecosystem.

The naming convention for abstract subtypes is SomeMethodEstimator. Examples of abstract estimator type hierarchies could be:

  • TransferEntropyEstimator <: CausalityEstimator
  • CrossMappingEstimator <: CausalityEstimator

Specific estimator types are named according to the algorithm. Examples of complete type hierarchies for different estimators could be:

  • VisitationFrequency <: TransferEntropyEstimator <: CausalityEstimator.
  • TransferOperatorGrid <: TransferEntropyEstimator <: CausalityEstimator.
  • SimpleCrossMap <: CrossMappingEstimator <: CausalityEstimator.
  • ConvergentCrossMap <: CrossMappingEstimator <: CausalityEstimator.
  • JointDistanceDistribution <: JointDistanceDistributionEstimator <: CausalityEstimator.

Each estimator type, including the abstract ones, has a corresponding parameter type in which Estimator is replaced by Test (or, for names without that suffix, Test is appended), for example:

  • VisitationFrequencyTest <: TransferEntropyTest <: CausalityTest.
  • TransferOperatorGridTest <: TransferEntropyTest <: CausalityTest.
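
The naming convention can be illustrated with a small, self-contained sketch. The module below re-declares a few of the names listed above purely for illustration; it is not the package's source, and the wrapper module name HierarchySketch is hypothetical.

```julia
# Standalone illustration: these declarations mimic the documented hierarchy
# and are NOT the package's actual definitions.
module HierarchySketch

abstract type CausalityEstimator end

# Abstract subtypes follow the SomeMethodEstimator pattern.
abstract type TransferEntropyEstimator <: CausalityEstimator end
abstract type CrossMappingEstimator <: CausalityEstimator end

# Concrete estimators are named after the algorithm.
struct VisitationFrequency <: TransferEntropyEstimator end
struct ConvergentCrossMap <: CrossMappingEstimator end

end # module

# Subtyping is transitive, so every concrete estimator is a CausalityEstimator.
HierarchySketch.VisitationFrequency <: HierarchySketch.CausalityEstimator  # true
```

Because subtyping is transitive, a method written for the root abstract type accepts every estimator below it.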
CausalityToolsBase.CausalityTestType
CausalityTest

An abstract type that is the supertype of all causality tests in the CausalityTools ecosystem.

The naming convention for abstract subtypes is SomeMethodTest. Examples of the type hierarchy of abstract test types could be:

  • TransferEntropyTest <: CausalityTest
  • CrossMappingTest <: CausalityTest

Subtypes of those abstract types are named according to the specific algorithm. Examples of complete type hierarchies for specific causality test types could be:

  • VisitationFrequencyTest <: TransferEntropyTest <: CausalityTest.
  • TransferOperatorGridTest <: TransferEntropyTest <: CausalityTest.
  • CrossMappingTest <: DistanceBasedTest <: CausalityTest.
CausalityToolsBase.OptimiseDelayType
OptimiseDelay(method_delay = "ac_zero", maxdelay_frac = 0.1; kwargs...) -> OptimiseDelay

Indicates that the delay parameter for an embedding should be optimised using some estimation procedure.

Passing an instance of OptimiseDelay to certain functions triggers delay estimation based on the length of the time series, which is not necessarily known beforehand. Here, the maximum lag is expressed as a fraction of the time series length.

Fields

  • method_delay::String = "ac_zero": The delay estimation method. Uses DynamicalSystems.estimate_delay under the hood. See its documentation for more info.
  • maxdelay_frac::Number = 0.1: The maximum number of delays for which to check, expressed as a fraction of the time series length.
  • kwargs::NamedTuple: Keyword arguments to the delay estimation method. Empty by default. Keywords nbins and binwidth are propagated into DynamicalSystems.mutualinformation if method_delay = "mi_min".

Example

using CausalityToolsBase
opt_scheme = OptimiseDelay(method_delay = "mi_min", kwargs = (nbins = 10, ))
ts = sin.(diff(diff(rand(5000))))
optimal_delay(ts, opt_scheme)
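
As a rough sketch of what "a fraction of the time series length" means in practice, the fraction can be turned into a concrete range of candidate lags once the series is known. The helper candidate_lags is hypothetical (not part of the package); it mirrors the default τs = 1:1:floor(Int, length(v)/10) documented for optimal_delay.

```julia
# Hypothetical helper, not part of the package: turn a fractional maximum
# delay into a concrete range of candidate lags for a given series.
function candidate_lags(ts::AbstractVector, maxdelay_frac::Real = 0.1)
    maxlag = floor(Int, length(ts) * maxdelay_frac)
    return 1:max(maxlag, 1)   # always keep at least one candidate lag
end

ts = sin.(diff(diff(rand(5000))))   # two diffs leave 4998 samples
candidate_lags(ts)                  # 1:499
```

Deferring this computation until the series is available is what allows the same OptimiseDelay instance to be reused for series of different lengths.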
CausalityToolsBase.OptimiseDimType
OptimiseDim(method_delay::String = "ac_zero", maxdelay_frac::Number = 0.1, 
    method_dim::String = "f1nn", maxdim::Int = 6) -> OptimiseDim

Indicates that the dimension for an embedding should be optimised using some estimation procedure.

To estimate the dimension, the delay lag must also be specified. Therefore, passing an instance of OptimiseDim to certain functions triggers delay estimation based on the length of the time series, which is not necessarily known beforehand. Then, after the delay has been estimated, the dimension is estimated.

Fields

  • method_dim::String = "f1nn": The dimension estimation method.
  • maxdim::Int = 6: The maximum dimension to check for. Dimensions 1:maxdim will be checked.
  • kwargs_dim::NamedTuple: Keyword arguments to the dimension estimation method. Empty by default.
  • method_delay::String = "ac_zero": The delay estimation method.
  • maxdelay_frac::Number = 0.1: The maximum number of delays for which to check, expressed as a fraction of the time series length.
  • kwargs_delay::NamedTuple: Keyword arguments to the delay estimation method. Empty by default. Keywords nbins and binwidth are propagated into DynamicalSystems.mutualinformation if method_delay = "mi_min". See also optimal_delay.

Example

using CausalityToolsBase
opt_scheme = OptimiseDim(method_dim = "f1nn", method_delay = "ac_zero")
ts = sin.(diff(diff(rand(5000))))
optimal_dimension(ts, opt_scheme)
CausalityToolsBase.RectangularBinningType
RectangularBinning(ϵ) <: RectangularBinningScheme

Instructions for creating a rectangular box partition using the binning scheme ϵ.

Types of rectangular binning schemes

Data-dictated ranges along each axis

  1. ϵ::Int divides each axis into ϵ equal-length intervals, extending the upper bound 1/100th of a bin size to ensure all points are covered.

  2. ϵ::Float64 divides each axis into intervals of fixed size ϵ.

  3. ϵ::Vector{Int} divides the i-th axis into ϵ[i] equal-length intervals, extending the upper bound 1/100th of a bin size to ensure all points are covered.

  4. ϵ::Vector{Float64} divides the i-th axis into intervals of size ϵ[i].

In these cases, the rectangular partition is constructed by locating the minima along each coordinate axis, then constructing equal-length intervals until the data maxima are covered.

Custom ranges along each axis

Rectangular binnings may also be specified on arbitrary min-max ranges.

  1. ϵ::Tuple{Vector{Tuple{Float64,Float64}},Int64} creates intervals along each axis from ranges indicated by a vector of (min, max) tuples, then divides each axis into the same integer number of equal-length intervals.

It's probably easier to use the following constructors:

  • RectangularBinning(minmaxes::Vararg{<:AbstractRange{T}, N}; n_intervals::Int = 10) takes one range per axis indicating the (min, max) along that axis, and n_intervals indicates how many equal-length intervals those ranges should be split into.
  • RectangularBinning(minmaxes::Vector{<:AbstractRange{T}}, n_intervals::Int) does the same, but the ranges are provided as a vector.

Examples

Minimal and maximal positions of the grid determined by the data points:

  • RectangularBinning(10): find the minima along each coordinate axis of the points, then split the (extended) range into 10 equal-length intervals.
  • RectangularBinning([10, 5]): find the minima along each coordinate axis of the points, then split the (extended) range along the first coordinate axis into 10 equal-length intervals and the range along the second coordinate axis into 5 equal-length intervals.
  • RectangularBinning(0.5): find the minima along each coordinate axis of the points, then split the axis ranges into equal-length intervals of size 0.5.
  • RectangularBinning([0.3, 0.1]): find the minima along each coordinate axis of the points, then split the range along the first coordinate axis into equal-length intervals of size 0.3 and the range along the second axis into equal-length intervals of size 0.1.

Explicitly specifying data ranges (not guaranteed to cover data points):

  • RectangularBinning(-5:5, 2:2, n_intervals = 5): split the ranges -5:5 and 2:2 into n_intervals equal-length intervals.
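
The data-dictated rules above can be sketched for a single axis as follows. The helper axis_edgelength is hypothetical, and the way the upper bound is extended here (adding 1/100th of the unextended bin size to the maximum) is an assumption based on the description; the package's exact arithmetic may differ.

```julia
# Hypothetical helper illustrating the rules above for a single axis.
# The exact extension used by the package may differ in detail.
function axis_edgelength(x::AbstractVector{<:Real}, ϵ::Int)
    lo, hi = extrema(x)
    binsize = (hi - lo) / ϵ
    hi_ext = hi + binsize / 100      # extend upper bound by 1/100th of a bin
    return (hi_ext - lo) / ϵ
end

# A Float64 ϵ is used directly as the edge length.
axis_edgelength(x::AbstractVector{<:Real}, ϵ::Float64) = ϵ

x = [0.0, 0.25, 1.0]
axis_edgelength(x, 10)    # ≈ 0.1001
axis_edgelength(x, 0.5)   # 0.5

# With the extended bound, the maximum falls inside the last of the 10 bins
# (bin index 9 when counting from 0) instead of opening an 11th bin.
floor(Int, (maximum(x) - minimum(x)) / axis_edgelength(x, 10))   # 9
```

Without the small extension, the data maximum would sit exactly on the upper edge of the partition and be assigned to a bin of its own.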
CausalityToolsBase.RefinedTriangulationBinningMaxRadiusType
RefinedTriangulationBinningMaxRadius

A binning scheme for a triangulated simplex partition where some simplices have been refined (subdivided by a shape-preserving simplex subdivision algorithm).

The maximum radius bound is applied by first doing an initial triangulation, then splitting simplices whose radius is large until all simplices have radii less than the resulting radius bound.

Fields

  • max_radius_frac::Float64: The maximum radius expressed as a fraction of the radius of the largest simplex of the initial triangulation.

CausalityToolsBase.RefinedTriangulationBinningSplitFactorType
RefinedTriangulationBinningSplitFactor

A binning scheme for a triangulated simplex partition where some simplices have been refined (subdivided by a shape-preserving simplex subdivision algorithm).

The split factor bound controls how many times each simplex of the initial triangulation is to be split.

Fields

  • simplex_split_factor::Int: The number of times each simplex is split.
CausalityToolsBase.RefinedTriangulationBinningSplitQuantileType
RefinedTriangulationBinningSplitQuantile

A binning scheme for a triangulated simplex partition where some simplices have been refined (subdivided by a shape-preserving simplex subdivision algorithm).

The split quantile controls which simplices of the initial triangulation are split, and the split factor controls how many times each of those simplices is split.

Fields

  • split_quantile::Float64: All simplices with radius larger than the split_quantile-th quantile of the radii of the simplices in the initial triangulation are split with a splitting factor of simplex_split_factor.
  • simplex_split_factor::Int: The number of times each simplex is split.
CausalityToolsBase.encodeFunction
encode(point, reference_point, edgelengths)

Encode a point into its integer bin labels relative to some reference_point (always counting from lowest to highest magnitudes), given a set of box edgelengths (one for each axis). The first bin on the positive side of the reference point is indexed with 0, and the first bin on the negative side of the reference point is indexed with -1.

Example

using CausalityToolsBase

refpoint = [0, 0, 0]
steps = [0.2, 0.2, 0.3]
encode(rand(3), refpoint, steps)
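
The indexing convention described above (0 for the first bin on the positive side, -1 for the first bin on the negative side) corresponds to flooring the scaled offset from the reference point. A minimal sketch, assuming that is all the encoding does; encode_sketch is a hypothetical stand-in, not the package function:

```julia
# Hypothetical re-implementation for illustration only.
encode_sketch(point, reference_point, edgelengths) =
    floor.(Int, (point .- reference_point) ./ edgelengths)

refpoint = [0.0, 0.0, 0.0]
steps = [0.2, 0.2, 0.3]

# 0.5 lies in the third bin above the reference (index 2), -0.1 in the first
# bin below it (index -1), and 0.1 in the first bin above it (index 0).
encode_sketch([0.5, -0.1, 0.1], refpoint, steps)   # [2, -1, 0]
```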
CausalityToolsBase.generate_gridpointsFunction
generate_gridpoints(points, binning_scheme::RectangularBinning, 
    grid::GridType = OnGrid())

Return a set of points forming a rectangular grid covering a hyperrectangular box specified by the binning_scheme and grid type. Provided a suitable binning scheme is given, this grid will provide a covering of points. See the documentation for RectangularBinning for more details.

Arguments

  • points: A vector of points or a Dataset instance.

  • binning_scheme: A RectangularBinning instance. See docs for RectangularBinning for more details.

  • grid: A GridType instance. The grid follows the same convention as in Interpolations.jl. Valid choices are OnGrid() (uses the bin origins as the grid points), and OnCell(), which adds an additional interval along each axis, shifts the grid half a bin outside the extrema along each axis and returns the centers of the resulting grid cells.

Examples

For example,

using CausalityToolsBase, DelayEmbeddings

pts = Dataset([rand(3) for i = 1:100])
generate_gridpoints(pts, RectangularBinning(10), OnGrid())

generates a rectangular grid covering the range of pts constructed by subdividing each coordinate axis into 10 equal-length intervals. Next,

using CausalityToolsBase, DelayEmbeddings

pts = Dataset([rand(3) for i = 1:100])
generate_gridpoints(pts, RectangularBinning(10), OnCell())

will do the same, but adds another interval (11 in total), shifts the entire hypercube so that the minima and maxima along each axis lie half a bin outside the original extrema, then returns the centers of the grid cells.
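
The difference between the two grid types can be sketched along a single axis. The helpers bin_origins and cell_centers are hypothetical, and the sketch assumes that "bin origins" means the lower edge of each of the n bins; whether the package's OnGrid() output also includes the upper corner is not stated here.

```julia
# Hypothetical 1D helpers; `lo` is the axis minimum, `step` the bin width
# and `n` the number of intervals.
bin_origins(lo, step, n) = [lo + k * step for k in 0:(n - 1)]

function cell_centers(lo, step, n)
    shifted_lo = lo - step / 2                            # shift half a bin outwards
    return [shifted_lo + (k + 0.5) * step for k in 0:n]   # centers of n + 1 cells
end

bin_origins(0.0, 0.5, 4)    # [0.0, 0.5, 1.0, 1.5]
cell_centers(0.0, 0.5, 4)   # [0.0, 0.5, 1.0, 1.5, 2.0]
```

Under these assumptions the cell centers coincide with the original bin edges, with one extra point at the upper extremum.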

CausalityToolsBase.generate_gridpointsFunction
generate_gridpoints(axisminima, stepsizes, n_intervals_eachaxis, 
    grid::GridType = OnGrid())

Return a set of points forming a grid over the hyperrectangular box spanned by

  • (axisminima, axisminima .+ (n_intervals_eachaxis .* stepsizes)) if grid = OnGrid(), and

  • (axisminima, axisminima .+ ((n_intervals_eachaxis .+ 1) .* stepsizes)) if grid = OnCell(),

where axisminima gives the minima along each coordinate axis, stepsizes gives the step size along each axis, and n_intervals_eachaxis indicates how many equal-length intervals each axis should be divided into.

If grid = OnGrid(), then the bin origins are taken as the grid points. If grid = OnCell(), then one additional interval is added and the grid is shifted half a bin outside the extrema along each axis, so that the grid points lie at the center of the grid cells.

CausalityToolsBase.get_edgelengthsFunction
get_edgelengths(pts, binning_scheme::RectangularBinning) -> Vector{Float}

Return the box edge length along each axis resulting from discretizing pts on a rectangular grid specified by binning_scheme.

Example

using DynamicalSystems, CausalityToolsBase
pts = Dataset([rand(5) for i = 1:1000])

get_edgelengths(pts, RectangularBinning(0.6))
get_edgelengths(pts, RectangularBinning([0.5, 0.3, 0.3, 0.4, 0.4]))
get_edgelengths(pts, RectangularBinning(8))
get_edgelengths(pts, RectangularBinning([10, 8, 5, 4, 22]))
CausalityToolsBase.get_minima_and_edgelengthsMethod
get_minima_and_edgelengths(points, 
    binning_scheme::RectangularBinning) -> (Vector{Float}, Vector{Float})

Find the minima along each axis of the embedding, and compute appropriate edge lengths given a rectangular binning_scheme, which provides instructions on how to grid the space. Assumes the input is a vector of points.

See documentation for RectangularBinning for details on the binning scheme.

Example

using DynamicalSystems, CausalityToolsBase
pts = Dataset([rand(4) for i = 1:1000])

get_minima_and_edgelengths(pts, RectangularBinning(0.6))
get_minima_and_edgelengths(pts, RectangularBinning([0.5, 0.3, 0.4, 0.4]))
get_minima_and_edgelengths(pts, RectangularBinning(10))
get_minima_and_edgelengths(pts, RectangularBinning([10, 8, 5, 4]))
CausalityToolsBase.get_minmaxesFunction
get_minmaxes(pts) -> Tuple{Vector{Float}, Vector{Float}}

Return a vector of tuples containing axis-wise (minimum, maximum) values.

CausalityToolsBase.joint_visitsMethod
joint_visits(points, binning_scheme::RectangularBinning)

Determine which bins are visited by points given the rectangular binning_scheme. Bins are referenced relative to the axis minimum.

Example

using DynamicalSystems, CausalityToolsBase

pts = Dataset([rand(5) for i = 1:100]);
joint_visits(pts, RectangularBinning(0.2))
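
To make "referenced relative to the axis minimum" concrete, here is a minimal sketch for the fixed-edge-length case. The function joint_visits_sketch is hypothetical and works on a plain vector of vectors rather than a Dataset; it only illustrates the idea and is not the package's implementation.

```julia
# Hypothetical sketch: bin every point relative to the axis-wise minima,
# using a fixed edge length ϵ along all axes.
function joint_visits_sketch(points::Vector{<:AbstractVector{<:Real}}, ϵ::Float64)
    dim = length(first(points))
    minima = [minimum(p[i] for p in points) for i in 1:dim]
    return [floor.(Int, (p .- minima) ./ ϵ) for p in points]
end

pts_small = [[0.0, 0.0], [0.25, 0.9], [0.5, 0.1]]
joint_visits_sketch(pts_small, 0.2)   # [[0, 0], [1, 4], [2, 0]]
```

Each point is mapped to one integer vector, so the result has the same length as the input and may contain repeated bin labels.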
CausalityToolsBase.kerneldensityMethod
kerneldensity(pts, gridpts, kernel::BoxKernel; 
    h = silverman_rule(pts), 
    metric::Metric = Chebyshev(), 
    normalise = true) -> Vector{Float64}

Naive box kernel density estimator from [1].

Arguments

  • pts: The points for which to evaluate the density.

  • gridpts: A set of grid points on which to evaluate the density.

  • kernel: A Kernel type. Defaults to BoxKernel. Can also be GaussianKernel.

Keyword arguments

  • h: The bandwidth. Defaults to the value from Silverman's rule, which gives an approximately optimal bandwidth under the assumption of a Gaussian density (note: the kernel used here is not Gaussian, so this default may be suboptimal).

  • normalise: Normalise the density so that it sums to 1.

  • metric: An instance of a valid metric from Distances.jl that is nonnegative, symmetric and satisfies the triangle inequality. Defaults to metric = Chebyshev().

Returns

A density estimate for each grid point.

References

[1] Steuer, R., Kurths, J., Daub, C.O., Weise, J. and Selbig, J., 2002. The mutual information: detecting and evaluating dependencies between variables. Bioinformatics, 18(suppl_2), pp.S231-S240.

Example

using DynamicalSystems, CausalityToolsBase, Distributions 

# Create some example points from a multivariate normal distribution
d = MvNormal(rand(Uniform(-1, 1), 2), rand(Uniform(0.1, 0.9), 2))
pts = Dataset([rand(d) for i = 1:500])

# Evaluate the density at a subset of those points given all the points
gridpts = Dataset([SVector{2, Float64}(pt) for pt in pts[1:5:end]])

# Get normalised density 
kd_norm = kerneldensity(pts, gridpts, BoxKernel(), normalise = true);
kd_nonnorm = kerneldensity(pts, gridpts, BoxKernel(), normalise = false);

# Make sure the result sums to one if normalised and that it doesn't when not normalising
sum(kd_norm) ≈ 1
!(sum(kd_nonnorm) ≈ 1)
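
The idea behind the naive box kernel estimate can be sketched without any dependencies: for each grid point, count the points lying within distance h under the Chebyshev metric. The names chebyshev and box_density_sketch are hypothetical, and the sketch deliberately omits the kernel scaling factor that the package applies, so the unnormalised values are raw counts rather than densities.

```julia
# Hypothetical, unoptimised sketch of a box kernel density estimate.
chebyshev(a, b) = maximum(abs.(a .- b))

function box_density_sketch(pts, gridpts, h; normalise = true)
    # For every grid point, count the points within Chebyshev distance h
    counts = [count(p -> chebyshev(p, g) <= h, pts) for g in gridpts]
    density = Float64.(counts)
    # Normalising assumes at least one point falls near some grid point
    return normalise ? density ./ sum(density) : density
end

demo_pts = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0]]
demo_grid = [[0.0, 0.0], [1.0, 1.0]]
box_density_sketch(demo_pts, demo_grid, 0.2; normalise = false)   # [2.0, 1.0]
sum(box_density_sketch(demo_pts, demo_grid, 0.2)) ≈ 1             # true
```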
CausalityToolsBase.marginal_visitsMethod
marginal_visits(joint_visits, dims)

Given a set of precomputed joint visited bins over some binning, return the marginal along dimensions dims.

Example

using DynamicalSystems, CausalityToolsBase

pts = Dataset([rand(5) for i = 1:100]);

# First compute joint visits, then marginal visits along dimensions 1 and 4
jv = joint_visits(pts, RectangularBinning(0.2))
marginal_visits(jv, [1, 4])
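
Conceptually, taking a marginal of precomputed joint visits amounts to keeping only the requested components of each bin label. A minimal sketch under that assumption, with the hypothetical name marginal_visits_sketch and plain integer vectors as bin labels:

```julia
# Hypothetical sketch: the marginal keeps only the requested components
# of each precomputed bin label.
marginal_visits_sketch(joint_visits, dims) = [bin[dims] for bin in joint_visits]

jv_small = [[0, 3, 1], [2, 3, 0], [0, 1, 1]]
marginal_visits_sketch(jv_small, [1, 3])   # [[0, 1], [2, 0], [0, 1]]
```

Distinct joint bins can collapse onto the same marginal bin, as the first and last entries do here.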
CausalityToolsBase.marginal_visitsMethod
marginal_visits(points, binning_scheme::RectangularBinning, dims)

Determine which bins are visited by points given the rectangular binning_scheme, only along the desired dimensions dims. Bins are referenced relative to the axis minimum.

Example

using DynamicalSystems, CausalityToolsBase

pts = Dataset([rand(5) for i = 1:100]);

# Marginal visits along dimension 3 and 5
marginal_visits(pts, RectangularBinning(0.3), [3, 5])

# Marginal visits along dimension 2 through 5
marginal_visits(pts, RectangularBinning(0.3), 2:5)
CausalityToolsBase.optimal_delayMethod
optimal_delay(x, p::OptimiseDelay)

Estimate the optimal delay reconstruction lag for x using the instructions given by the OptimiseDelay instance p.

Example

using CausalityToolsBase
opt_scheme = OptimiseDelay(method_delay = "ac_zero", kwargs = (nbins = 10, ))
ts = sin.(diff(diff(rand(5000))))
optimal_delay(ts, opt_scheme)
CausalityToolsBase.optimal_delayMethod
optimal_delay(v; method = "mi_min", τs = 1:1:floor(Int, length(v)/10), kwargs...)

Estimate the optimal embedding lag for v among the delays τs.

Keyword arguments

  • method::String = "mi_min": The delay estimation method. Uses DynamicalSystems.estimate_delay under the hood. See its documentation for more info.
  • τs: The lags over which to estimate the embedding lag. Defaults to all lags from 1 up to 10% of the length of the time series.
  • kwargs::NamedTuple: Keyword arguments to the delay estimation methods. Empty by default. Keywords nbins, binwidth are propagated into DynamicalSystems.mutualinformation.

Example

using CausalityToolsBase 

ts = diff(rand(100))
optimal_delay(ts)
optimal_delay(ts, method = "ac_zero")
optimal_delay(ts, method = "mi_min", τs = 1:10)
CausalityToolsBase.optimal_dimensionFunction
optimal_dimension(v, τ; dims = 2:8, method = "fnn", kwargs...)

Estimate the optimal embedding dimension for v.

Arguments

  • v: The data series for which to estimate the embedding dimension.

  • τ: The embedding lag.

  • dims: Dimensions to probe for the optimal dimension.

Keyword arguments

  • method: Either "fnn" (Kennel's false nearest neighbors method), "afnn" (Cao's average false nearest neighbors method) or "f1nn" (Krakovská's false first nearest neighbors method). See the source code for DelayEmbeddings.estimate_dimension for more details.

  • rtol: Tolerance rtol in Kennel's algorithms. See DelayEmbeddings.fnn source code for more details.

  • atol: Tolerance atol in Kennel's algorithms. See DelayEmbeddings.fnn source code for more details.

Example

using CausalityToolsBase 
        
ts = diff(rand(1000))
optimal_dimension(ts)
optimal_dimension(ts, dims = 3:5)
optimal_dimension(ts, method = "afnn")
optimal_dimension(ts, method = "fnn")
optimal_dimension(ts, method = "f1nn")
CausalityToolsBase.optimal_dimensionMethod
optimal_dimension(v; dims = 2:8,
    method_dimension = "fnn", method_delay = "ac_zero")

Estimate the optimal embedding dimension for v by first estimating the optimal lag, then using that lag to estimate the dimension.

Arguments

  • v: The data series for which to estimate the embedding dimension.
  • dims: The dimensions to try.
  • method_dimension: The method for determining the optimal dimension.
  • method_delay: The method for determining the optimal lag.
CausalityToolsBase.optimal_dimensionMethod
optimal_dimension(x, p::OptimiseDim)

Estimate the optimal reconstruction dimension for x using the instructions given by the OptimiseDim instance p.

Example

using CausalityToolsBase
opt_scheme = OptimiseDim(method_dim = "f1nn", method_delay = "mi_min", kwargs_delay = (nbins = 10, ))
ts = sin.(diff(diff(rand(5000))))
optimal_dimension(ts, opt_scheme)
CausalityToolsBase.silverman_ruleMethod
silverman_rule(pts)

Find the approximately optimal bandwidth for a kernel density estimate, assuming the density is Gaussian (Silverman, 1996).
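
For orientation, the commonly quoted multivariate form of Silverman's rule of thumb can be sketched as below. This is an assumption about the formula, not a copy of the package's code: silverman_sketch is a hypothetical name, and taking σ as the average of the marginal standard deviations is a choice made for this illustration.

```julia
using Statistics

# Hypothetical sketch of the multivariate Silverman rule of thumb:
#   h = (4 / (d + 2))^(1 / (d + 4)) * n^(-1 / (d + 4)) * σ
# where σ is taken here as the average of the marginal standard deviations.
function silverman_sketch(pts::Vector{<:AbstractVector{<:Real}})
    n, d = length(pts), length(first(pts))
    σ = mean(std([p[i] for p in pts]) for i in 1:d)
    return (4 / (d + 2))^(1 / (d + 4)) * n^(-1 / (d + 4)) * σ
end

sample = [randn(2) for _ in 1:500]
h = silverman_sketch(sample)
```

In one dimension the prefactor (4/3)^(1/5) is approximately 1.06, which gives the familiar univariate form 1.06 σ n^(-1/5).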

ChaosTools.non0histMethod
non0hist(points, binning_scheme::RectangularBinning, dims)

Determine which bins are visited by points given the rectangular binning_scheme, considering only the marginal along dimensions dims. Bins are referenced relative to the axis minima.

Returns the unordered histogram (visitation frequency) over the array of bin visits.

This method extends ChaosTools.non0hist.

Example

using DelayEmbeddings, CausalityToolsBase
pts = Dataset([rand(5) for i = 1:100]);

# Histograms directly from points given a rectangular binning scheme
h1 = non0hist(pts, RectangularBinning(0.2), 1:3) 
h2 = non0hist(pts, RectangularBinning(0.2), [1, 2])

# Test that we're actually getting normalised histograms 
sum(h1) ≈ 1.0, sum(h2) ≈ 1.0
ChaosTools.non0histMethod
non0hist(bin_visits)

Return the unordered histogram (visitation frequency) over the array of bin_visits, which is a vector containing bin encodings (each point encoded by an integer vector).

This method extends ChaosTools.non0hist.

Example

using DynamicalSystems, CausalityToolsBase
pts = Dataset([rand(5) for i = 1:100]);

# Histograms from precomputed joint/marginal visitations 
jv = joint_visits(pts, RectangularBinning(10))
mv = marginal_visits(pts, RectangularBinning(10), 1:3)

h1 = non0hist(jv)
h2 = non0hist(mv)

# Test that we're actually getting normalised histograms
sum(h1) ≈ 1.0, sum(h2) ≈ 1.0
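
The visitation-frequency histogram can be sketched as counting how often each distinct bin label occurs and dividing by the number of visits. The function visitation_frequencies is a hypothetical illustration, not the ChaosTools or CausalityToolsBase implementation.

```julia
# Hypothetical sketch: relative visitation frequency of each distinct bin.
function visitation_frequencies(bin_visits)
    counts = Dict{eltype(bin_visits), Int}()
    for bin in bin_visits
        counts[bin] = get(counts, bin, 0) + 1
    end
    # Dict iteration order is arbitrary, hence an "unordered" histogram
    return collect(values(counts)) ./ length(bin_visits)
end

freqs = visitation_frequencies([[0, 1], [0, 1], [2, 0], [1, 1]])
sort(freqs)        # [0.25, 0.25, 0.5]
sum(freqs) ≈ 1.0   # true
```

Only visited bins appear in the result, which is why empty bins never contribute zero entries.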
CausalityToolsBase.evaluate_kernelFunction
evaluate_kernel(kerneltype::Kernel, args...)

Evaluate the kernel function of type kerneltype with the provided args.

Example

  • evaluate_kernel(GaussianKernel(), d, σ) evaluates the Gaussian kernel for the distance d and average marginal standard deviation σ.

CausalityToolsBase.evaluate_kernelMethod
evaluate_kernel(BoxKernel(), idxs_pts_within_range)

Evaluate the box kernel by counting the number of points that fall within the range of a query point (the points falling inside a radius of h have been precomputed).

CausalityToolsBase.scalingFunction
scaling(kernel::Kernel, n_pts, h, dim)

Return the scaling factor for kernel given the number of points n_pts, the bandwidth h, and the dimension dim.