CausalityTools.CausalityToolsModule

CausalityTools

CausalityTools.jl is a package for quantifying associations and dynamical coupling between datasets, for independence testing, and for causal inference.

All further information is provided in the documentation, which you can either find online or build locally by running the docs/make.jl file.

Key features

  • Association measures from conventional statistics, information theory and dynamical systems theory, for example distance correlation, mutual information, transfer entropy, convergent cross mapping and a lot more!
  • A dedicated API for independence testing, which comes with automatic compatibility with every measure-estimator combination you can think of. For example, we offer the generic SurrogateTest, which is fully compatible with TimeseriesSurrogates.jl, and the LocalPermutationTest for conditional independence testing.
  • A dedicated API for causal network inference based on these measures and independence tests.

Installation

To install the package, run import Pkg; Pkg.add("CausalityTools").
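For example, from the Julia REPL:

import Pkg
Pkg.add("CausalityTools")
using CausalityTools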

CausalityTools.AR1BidirType
AR1Bidir <: DiscreteDefinition
AR1Bidir(;xi = [0.2, 0.3], a₁ = 0.5, b₁ = 0.7, c_xy = 0.1, c_yx = 0.2,
    nx = Normal(0, 0.3), ny = Normal(0, 0.3),
    rng::R = Random.default_rng())

A system consisting of two mutually coupled first order autoregressive processes.

Equations of motion

\[\begin{aligned} x(t+1) &= a_{1}x(t) + c_{yx}y(t) + \epsilon_{x} \\ y(t+1) &= b_{1}y(t) + c_{xy}x(t) + \epsilon_{y} \end{aligned}\]

where at each time step, $\epsilon_{x}$ and $\epsilon_{y}$ are drawn from independent normal distributions nx and ny, respectively.
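A minimal sketch of generating data from this definition (assuming the system/trajectory workflow described for discrete system definitions elsewhere in these docs, that trajectory returns the orbit together with the time vector, and that columns is available from the re-exported StateSpaceSets.jl; the coupling values are illustrative):

using CausalityTools
def = AR1Bidir(c_xy = 0.5, c_yx = 0.1)  # hypothetical coupling strengths
sys = system(def)                       # build a DiscreteDynamicalSystem from the definition
X, t = trajectory(sys, 1000)            # X is a 2-column StateSpaceSet holding x and y
x, y = columns(X)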

CausalityTools.AR1UnidirType
AR1Unidir <: DiscreteDefinition
AR1Unidir(; ui = [0.2, 0.3], a₁ = 0.90693, b₁ = 0.40693, c_xy = 0.5,
    nx = Normal(0, 0.40662), ny = Normal(0, 0.40662),
    rng::R = Random.default_rng())

A bivariate, order one autoregressive model, where $x \to y$ (Paluš et al, 2018)[Paluš2018].

Equations of motion

\[\begin{aligned} x(t+1) &= a_1 x(t) + \xi_{1} \\ y(t+1) &= b_1 y(t) - c_{xy} x(t) + \xi_{2}, \end{aligned}\]

where $\xi_{1}$ and $\xi_{2}$ are drawn from normal distributions nx and ny at each iteration.

CausalityTools.AmplitudeType
Amplitude <: InstantaneousSignalProperty

Indicates that the instantaneous amplitudes of a signal should be used.

CausalityTools.AnishchenkoType
Anishchenko <: DiscreteDefinition
Anishchenko(; u₀ = rand(2), α = 3.277, s = 0.1, ω = 0.5*(sqrt(5)-1)) → DiscreteDynamicalSystem

Initialise the system defined by eq. 13 in Anishchenko1998, which can give strange nonchaotic attractors.

Equations of motion

\[\begin{aligned} dx &= \alpha (1-s \cos (2 \pi \phi )) \cdot x(1-x) \\ d\phi &= (\phi + \omega ) \mod{1} \end{aligned}\]

CausalityTools.CEShannonType
CEShannon <: ConditionalEntropy
CEShannon(; base = 2)

The Shannon conditional entropy measure.

Discrete definition

Sum formulation

The conditional entropy between discrete random variables $X$ and $Y$ with finite ranges $\mathcal{X}$ and $\mathcal{Y}$ is defined as

\[H^{S}(X | Y) = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} p(x, y) \log(p(x | y)).\]

This is the definition used when calling entropy_conditional with a ContingencyMatrix.

Two-entropies formulation

Equivalently, the following difference of entropies holds

\[H^S(X | Y) = H^S(X, Y) - H^S(Y),\]

where $H^S(\cdot)$ and $H^S(\cdot, \cdot)$ are the Shannon entropy and Shannon joint entropy, respectively. This is the definition used when calling entropy_conditional with a ProbabilitiesEstimator.

Differential definition

The differential conditional Shannon entropy is analogously defined as

\[H^S(X | Y) = h^S(X, Y) - h^S(Y),\]

where $h^S(\cdot)$ and $h^S(\cdot, \cdot)$ are the Shannon differential entropy and Shannon joint differential entropy, respectively. This is the definition used when calling entropy_conditional with a DifferentialEntropyEstimator.
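A hedged sketch of the two-entropies formulation in practice, using a ProbabilitiesEstimator (here ValueHistogram from the re-exported ComplexityMeasures.jl; the entropy_conditional(measure, estimator, x, y) argument order is assumed):

using CausalityTools
x, y = rand(1000), rand(1000)
est = ValueHistogram(RectangularBinning(4))   # discretize with 4 bins per dimension
entropy_conditional(CEShannon(base = 2), est, x, y)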

CausalityTools.CETsallisAbeType
CETsallisAbe <: ConditionalEntropy
CETsallisAbe(; base = 2, q = 1.5)

Abe2001's discrete Tsallis conditional entropy measure.

Definition

Abe & Rajagopal's Tsallis conditional entropy between discrete random variables $X$ and $Y$ with finite ranges $\mathcal{X}$ and $\mathcal{Y}$ is defined as

\[H_q^{T_A}(X | Y) = \dfrac{H_q^T(X, Y) - H_q^T(Y)}{1 + (1-q)H_q^T(Y)},\]

where $H_q^T(\cdot)$ and $H_q^T(\cdot, \cdot)$ are the Tsallis entropy and the joint Tsallis entropy, respectively.

CausalityTools.CETsallisFuruichiType
CETsallisFuruichi <: ConditionalEntropy
CETsallisFuruichi(; base = 2, q = 1.5)

Furuichi (2006)'s discrete Tsallis conditional entropy measure.

Definition

Furuichi's Tsallis conditional entropy between discrete random variables $X$ and $Y$ with finite ranges $\mathcal{X}$ and $\mathcal{Y}$ is defined as

\[H_q^T(X | Y) = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} p(x, y)^q \log_q(p(x | y)),\]

when $q \neq 1$. For $q = 1$, $H_q^T(X | Y)$ reduces to the Shannon conditional entropy:

\[H_{q=1}^T(X | Y) = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} p(x, y) \log(p(x | y))\]

CausalityTools.CMIRenyiJizbaType
CMIRenyiJizba <: ConditionalMutualInformation

The Rényi conditional mutual information $I_q^{R_{J}}(X; Y | Z)$ defined in Jizba2012.

Usage

  • Use with independence to perform a formal hypothesis test for conditional dependence.
  • Use with condmutualinfo to compute the raw conditional mutual information.

Definition

\[I_q^{R_{J}}(X; Y | Z) = I_q^{R_{J}}(X; Y, Z) - I_q^{R_{J}}(X; Z),\]

where $I_q^{R_{J}}(X; Z)$ is the MIRenyiJizba mutual information.

CausalityTools.CMIRenyiPoczosType
CMIRenyiPoczos <: ConditionalMutualInformation

The differential Rényi conditional mutual information $I_q^{R_{P}}(X; Y | Z)$ defined in (Póczos & Schneider, 2012)[Póczos2012].

Usage

  • Use with independence to perform a formal hypothesis test for conditional dependence.
  • Use with condmutualinfo to compute the raw conditional mutual information.

Definition

\[I_q^{R_{P}}(X; Y | Z) = \dfrac{1}{q-1} \log \mathbb{E}_{(X, Y, Z) \sim p_{X, Y, Z}} \left[ \dfrac{p_{X, Z}^{1-q}(X, Z) p_{Y, Z}^{1-q}(Y, Z) }{p_{X, Y, Z}^{1-q}(X, Y, Z) p_Z^{1-q}(Z)} \right]\]

CausalityTools.CMIRenyiSarbuType
CMIRenyiSarbu <: ConditionalMutualInformation
CMIRenyiSarbu(; base = 2, definition = CMIRenyiSarbuSarbu())

The Rényi conditional mutual information from Sarbu2014.

Usage

  • Use with independence to perform a formal hypothesis test for conditional dependence.
  • Use with condmutualinfo to compute the raw conditional mutual information.

Discrete description

Assume we observe three discrete random variables $X$, $Y$ and $Z$. Sarbu (2014) defines discrete conditional Rényi mutual information as the conditional Rényi $\alpha$-divergence between the conditional joint probability mass function $p(x, y | z)$ and the product of the conditional marginals, $p(x |z) \cdot p(y|z)$:

\[I(X, Y; Z)^R_q = \dfrac{1}{q-1} \sum_{z \in Z} p(Z = z) \log \left( \sum_{x \in X}\sum_{y \in Y} \dfrac{p(x, y|z)^q}{\left( p(x|z)\cdot p(y|z) \right)^{q-1}} \right)\]

See also: condmutualinfo.

CausalityTools.CMIShannonType
CMIShannon <: ConditionalMutualInformation
CMIShannon(; base = 2)

The Shannon conditional mutual information (CMI) $I^S(X; Y | Z)$.

Usage

  • Use with independence to perform a formal hypothesis test for conditional dependence.
  • Use with condmutualinfo to compute the raw conditional mutual information.

Supported definitions

Consider random variables $X \in \mathbb{R}^{d_X}$ and $Y \in \mathbb{R}^{d_Y}$, given $Z \in \mathbb{R}^{d_Z}$. The Shannon conditional mutual information is defined as

\[\begin{align*} I(X; Y | Z) &= H^S(X, Z) + H^S(Y, Z) - H^S(X, Y, Z) - H^S(Z) \\ &= I^S(X; Y, Z) - I^S(X; Z) \end{align*},\]

where $I^S(\cdot; \cdot)$ is the Shannon mutual information MIShannon, and $H^S(\cdot)$ is the Shannon entropy.

Differential Shannon CMI is obtained by replacing the entropies by differential entropies.

See also: condmutualinfo.
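A minimal sketch (assuming the condmutualinfo(measure, estimator, x, y, z) argument order, with the FPVP nearest-neighbor estimator documented further down; the toy data are constructed so that x and y are related only through z):

using CausalityTools
x = randn(1000)
z = x .+ randn(1000)          # z depends on x
y = z .+ randn(1000)          # y depends on x only through z
condmutualinfo(CMIShannon(base = 2), FPVP(k = 5), x, y, z)   # should be close to zero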

CausalityTools.ChaoticMaps3Type
ChaoticMaps3() <: DiscreteDefinition
ChaoticMaps3(; ui = [0.2, 0.1, 0.3], r = 3.4, c_xy = 0.5, c_xz = 0.5, c_yz = 0.3)

A model consisting of three coupled 1D maps, where $x \to y$ and $x \to z$ Chen2004.

Equations of motion

\[\begin{aligned} x(t) &= r x(t-1)( 1 - x(t-1)^2 ) e^{-x(t-1)^2} \\ y(t) &= r y(t-1)( 1 - y(t-1)^2 ) e^{-y(t-1)^2} + c_{xy} x(t-1) \\ z(t) &= r z(t-1)( 1 - z(t-1)^2 ) e^{-z(t-1)^2} + c_{xz} x(t-1) + c_{yz} y(t-1) \end{aligned}\]

The parameters r, c_xy and c_yz do not appear in the original paper, but are added here for explorative purposes.

CausalityTools.ChaoticNoisyLinear2Type
ChaoticNoisyLinear2 <: DiscreteDefinition
ChaoticNoisyLinear2(; xi = [0.1, 0.2], c = 0.5,
    nx = Normal(0, 0.05), ny = Normal(0, 0.05),
    rng = Random.default_rng())

A bivariate system of two chaotic maps that are linearly coupled from x → y with coupling strength c.

Definition

\[\begin{align*} x(t+1) &= 3.4 x(t) (1 - x(t)^2) e^{-x(t)^2} + 0.8x(t-1) + \xi_x \\ y(t+1) &= 3.4 y(t) (1 - y(t)^2) e^{-y(t)^2} + 0.8y(t-1) + \xi_y + c x(t-2) \end{align*}\]

Process noise $\xi_x$ and $\xi_y$ are drawn at each iteration from nx and ny.

CausalityTools.ChuaCircuitsBidir6Type
ChuaCircuitsBidir6 <: ContinuousDefinition
ChuaCircuitsBidir6(;u₀ = [0.1, 0.1, 0.2, 0.15, 0.15, 0.22],
    α₁ = 7.0, α₂ = 7.0, β₁ = 14.286, β₂ = 14.286,
    F₁ = 1.5, F₂ = 1.5, ω₁ = 3.0, ω₂ = 3.0,
    n1 = Normal(0, 0.1),
    n2 = Normal(0, 0.1),
    c12 = 0.1, c21 = 0.1, m₀ = -1/7, m₁ = 2/7)

Initialize a bidirectionally coupled system consisting of two driven Chua circuits, X₁ and X₂ Murali1993.

Description

The subsystems are mutually coupled by a linear resistor, where c12 controls the influence of X₁ on X₂, and c21 controls the influence of X₂ on X₁. The parameters for the subsystems are set equal to each other, as in the original paper, but can here be tuned individually for each subsystem.

\[\begin{align*} \dfrac{dx_1}{dt} &= \alpha_1(y_1 - h(x_1)) - \alpha_1 c_{21}(x_1 - x_2) \\ \dfrac{dy_1}{dt} &= x_1 - y_1 + z_1 \\ \dfrac{dz_1}{dt} &= -\beta_1 y_1 + F_1 \sin(\omega_1 t) + \epsilon_1 \\ \dfrac{dx_2}{dt} &= \alpha_2 (y_2 - h(x_2)) - \alpha_2 c_{12}(x_1 - x_2) \\ \dfrac{dy_2}{dt} &= x_2 - y_2 + z_2 \\ \dfrac{dz_2}{dt} &= -\beta_2 y_2 + F_2 \sin(\omega_2 t) + \epsilon_2, \end{align*}\]

where $h(x) = M_1x + 0.5(M_0 - M_1)(|x+1| - |x - 1|)$ and $\epsilon_1, \epsilon_2$ are noise terms that, at each integration step, are drawn independently from the normal distributions n1 and n2, respectively.

CausalityTools.ChuaScrollSine3Type
ChuaScrollSine3 <: ContinuousDefinition
ChuaScrollSine3(; xi = [0.1, 0.2, 0.3],
    α = 10.814, β = 14, γ = 0, a = 1.3, b = 0.11, c = 2,
    nx = Normal(0.0, 0.01),
    ny = Normal(0.0, 0.01),
    nz = Normal(0.0, 0.01))

An adjusted Chua system giving rise to n-scroll attractors Tang2001.

Description

The dynamics is generated by the following vector field

\[\begin{align*} \dot{x} &= \alpha (y - fx) + \eta_x \\ \dot{y} &= x - y + z + \eta_y \\ \dot{z} &= -\beta y - \gamma z + \eta_z \end{align*}\]

where $\eta_x$, $\eta_y$, and $\eta_z$ are drawn independently from the normal distributions nx, ny and nz at each iteration.

$fx$ is given by the following conditions:

n::Int = c + 1

if x >= 2*a*c
    fx = (b*pi/(2*a))*(x - 2*a*c)
elseif -2*a*c < x < 2*a*c
    d = ifelse(isodd(n), pi, 0)
    fx = -b*sin(pi*x/(2*a) + d)
elseif x <= -2*a*c
    fx = (b*pi/(2*a))*(x + 2*a*c)
end
CausalityTools.ContingencyType
Contingency <: ProbabilitiesEstimator
Contingency(est::Union{ProbabilitiesEstimator, Nothing} = nothing)

Contingency is a probabilities estimator that transforms input data to a multidimensional probability mass function (internally represented as a ContingencyMatrix).

It works directly on raw discrete/categorical data. Alternatively, if given a ProbabilitiesEstimator est for which marginal_encodings is implemented, the input data are first discretized before the contingency matrix is constructed.

Note

The Contingency estimator differs from other ProbabilitiesEstimators in that it's not compatible with probabilities and other methods. Instead, it is used to construct ContingencyMatrix, from which probabilities can be computed.

CausalityTools.ContingencyMatrixType
ContingencyMatrix{T, N} <: Probabilities{T, N}
ContingencyMatrix(frequencies::AbstractArray{Int, N})

A contingency matrix is essentially a multivariate analogue of Probabilities that also keeps track of raw frequencies.

The contingency matrix can be constructed directly from an N-dimensional frequencies array. Alternatively, the contingency_matrix function performs the counting for you; this works on raw categorical data, or by first discretizing the data using a ProbabilitiesEstimator.

Description

A ContingencyMatrix c is just a simple wrapper around an AbstractArray{T, N}. Indexing c with multiple indices i, j, … returns the (i, j, …)th element of the empirical probability mass function (pmf). The following convenience methods are defined:

  • frequencies(c; dims) returns the multivariate raw counts along the given dims (defaults to all available dimensions).
  • probabilities(c; dims) returns a multidimensional empirical probability mass function (pmf) along the given dims (defaults to all available dimensions), i.e. the normalized counts.
  • probabilities(c, i::Int) returns the marginal probabilities for the i-th dimension.
  • outcomes(c, i::Int) returns the marginal outcomes for the i-th dimension.

Ordering

The ordering of outcomes is internally consistent, but we make no promise about the ordering of outcomes relative to the input data. This means that if your input data are x = rand(["yes", "no"], 100); y = rand(["small", "medium", "large"], 100), you'll get a 2-by-3 contingency matrix, but there is currently no easy way to determine which outcome the (i, j)-th row/column of this matrix corresponds to.

Since ContingencyMatrix is intended for use in information theoretic methods that don't care about ordering, as long as the ordering is internally consistent, this is not an issue for practical applications in this package. This may change in future releases.

Usage

Contingency matrices are used in the computation of discrete versions of many of the information measures in this package.
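For example, a sketch of constructing a contingency matrix from categorical data and extracting counts and marginals (function names as documented above):

using CausalityTools
x = rand(["yes", "no"], 100)
y = rand(["small", "medium", "large"], 100)
c = contingency_matrix(x, y)    # 2-by-3 empirical pmf
frequencies(c)                  # raw joint counts
probabilities(c, 1)             # marginal probabilities for the first variable (x)
outcomes(c, 2)                  # marginal outcomes for the second variable (y)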

CausalityTools.ConvergentCrossMappingType
ConvergentCrossMapping <: CrossmapMeasure
ConvergentCrossMapping(; d::Int = 2, τ::Int = -1, w::Int = 0,
    f = Statistics.cor, embed_warn = true)

The convergent cross mapping (CCM) measure Sugihara2012.

Specifies the embedding dimension d and embedding lag τ to be used, as described below, with predict or crossmap. The Theiler window w controls how many temporal neighbors are excluded during neighbor searches (w = 0 means that only the point itself is excluded). f is a function that computes the agreement between observations and predictions (the default, f = Statistics.cor, gives the Pearson correlation coefficient).

Embedding

Let S(i) be the source time series variable and T(i) be the target time series variable. This version produces regular embeddings with fixed dimension d and embedding lag τ as follows:

\[( S(i), S(i+\tau), S(i+2\tau), \ldots, S(i+(d-1)\tau), T(i))_{i=1}^{N-(d-1)\tau}.\]

In this joint embedding, neighbor searches are performed in the subspace spanned by the first d variables (the source coordinates), while the last variable, T(i), is to be predicted.

With this convention, τ < 0 implies "past/present values of source used to predict target", and τ > 0 implies "future/present values of source used to predict target". The latter case may not be meaningful for many applications, so by default, a warning will be given if τ > 0 (embed_warn = false turns off warnings).
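A hedged sketch of cross mapping with this measure over increasing library sizes (the crossmap(measure, estimator, x, y) argument order is an assumption, and the RandomVectors estimator is the one described in the cross-map estimator docs below; the toy coupling is illustrative):

using CausalityTools
x = rand(500)
y = 0.8 .* circshift(x, 1) .+ 0.2 .* rand(500)   # toy series where x drives y
measure = ConvergentCrossMapping(d = 3, τ = -1)
est = RandomVectors(libsizes = 100:100:400)      # four library sizes
ρs = crossmap(measure, est, x, y)                # cross-map skill, one entry per library size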

CausalityTools.CopulaType
Copula <: MutualInformationEstimator
Copula(; est = Kraskov(k = 5), exact = false)

A non-parametric copula-based mutual information estimator.

It is typically many times faster to compute mutual information using Copula than with other MutualInformationEstimators, DifferentialEntropyEstimators, or ProbabilitiesEstimators, because Copula only needs to compute the entropy of a single (multivariate) variable, whereas the other methods explicitly compute the entropy of several variables.

If exact == true, then the exact empirical cumulative distribution function (ECDF) is used to compute the empirical copula. If exact == false, then a fast sorting-based approximation to the exact ECDF is computed instead (this breaks ties arbitrarily, so be careful when applying it to categorical/integer-valued data).

Description

Assume we have a Dx-dimensional input StateSpaceSet x and a Dy-dimensional input StateSpaceSet y, both containing N observations. We can define the (Dx + Dy)-dimensional joint StateSpaceSet D = [x y]. Copula returns the negative copula entropy of D, which is equal to the mutual information between x and y (Ma2011; Pál2010).

CausalityTools.CorrTestType
CorrTest <: IndependenceTest
CorrTest()

An independence test based on correlation (for two variables) and partial correlation (for three variables) (Levy1978; as described in Schmidt2018).

Uses PearsonCorrelation and PartialCorrelation internally.

Assumes that the input data are (multivariate) normally distributed. Then ρ(X, Y) = 0 implies X ⫫ Y and ρ(X, Y | 𝐙) = 0 implies X ⫫ Y | 𝐙.

Description

The null hypothesis is H₀ := ρ(X, Y | 𝐙) = 0. We use the approach in Levy & Narula (1978)Levy1978 and compute the Z-transformation of the observed (partial) correlation coefficient $\hat{\rho}_{XY|\bf{Z}}$:

\[Z(\hat{\rho}_{XY|\bf{Z}}) = \log\dfrac{1 + \hat{\rho}_{XY|\bf{Z}}}{1 - \hat{\rho}_{XY|\bf{Z}}}.\]

To test the null hypothesis against the alternative hypothesis H₁ := ρ(X, Y | 𝐙) > 0, calculate

\[\hat{Z} = \dfrac{1}{2}\dfrac{Z(\hat{\rho}_{XY|\bf{Z}}) - Z(0)}{\sqrt{1/(n - d - 3)}},\]

and compute the two-sided p-value (Schmidt et al., 2018)

\[p(X, Y | \bf{Z}) = 2(1 - \Phi(\sqrt{n - d - 3}Z(\hat{\rho}_{XY|\bf{Z}}))),\]

where $d$ is the dimension of $\bf{Z}$ and $n$ is the number of samples. For the pairwise case, the procedure is identical, but set $\bf{Z} = \emptyset$.

Examples
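A minimal sketch of using CorrTest with independence (the toy chain x → y → z is purely illustrative):

using CausalityTools
x = randn(1000)
y = x .+ randn(1000)
z = y .+ randn(1000)
independence(CorrTest(), x, y)       # pairwise test: expect a small p-value
independence(CorrTest(), x, z, y)    # conditional on y: x ⫫ z | y, expect a large p-value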

CausalityTools.CorrTestResultType
CorrTestResult(pvalue, ρ, z)

A simple struct that holds the results of a CorrTest test: the (partial) correlation coefficient ρ, Fisher's z, and pvalue - the two-sided p-value for the test.

CausalityTools.CrossmapEstimatorType
CrossmapEstimator{LIBSIZES, RNG}

A parametric supertype for all cross-map estimators, which are used with predict and crossmap.

Because the type of the library may differ between estimators, and because RNGs from different packages may be used, subtypes must implement the LIBSIZES and RNG type parameters.

For efficiency purposes, subtypes may contain mutable containers that can be re-used for ensemble analysis (see Ensemble).

Libraries

A cross-map estimator uses the concept of "libraries". A library is essentially just a reference to a set of points, and usually, a library refers to indices of points, not the actual points themselves.

For example, for timeseries, RandomVectors(libsizes = 50:25:100) produces three separate libraries, where the first contains 50 randomly selected time indices, the second contains 75 randomly selected time indices, and the third contains 100 randomly selected time indices. This of course assumes that all quantities involved can be indexed using the same time indices, meaning that the concept of "library" only makes sense after relevant quantities have been jointly embedded, so that they can be jointly indexed. For non-instantaneous prediction, the maximum possible library size shrinks with the magnitude of the index/time-offset for the prediction.

For spatial analyses (not yet implemented), indices could be more complex and involve multi-indices.

CausalityTools.DefinitionType

A generic supertype for definitions of information measures (one measure may have multiple definitions).

CausalityTools.DiksFangType
DiksFang <: BandwidthRule
DiksFang(c = 4.8)

A rule of thumb giving the bandwidth $h$ for kernel density estimation of entropy as $h = cn^{-\dfrac{2}{7}}$, where $n$ is the number of data points (Diks & Fang, 2017)[DiksFang2017].

[DiksFang2017]: Diks, C., & Fang, H. (2017). Detecting Granger Causality with a Nonparametric Information-Based Statistic (No. 17-03). CeNDEF Working Paper.

CausalityTools.DistanceCorrelationType
DistanceCorrelation

The distance correlation (Székely et al., 2007)[Székely2007] measure quantifies potentially nonlinear associations between pairs of variables. If applied to three variables, the partial distance correlation (Székely and Rizzo, 2014)[Székely2014] is computed.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with distance_correlation to compute the raw distance correlation coefficient.
Warn

A partial distance correlation distance_correlation(X, Y, Z) = 0 doesn't always guarantee conditional independence X ⫫ Y | Z. See Székely and Rizzo (2014) for in-depth discussion.
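A hedged sketch (distance_correlation as documented above; pairing the measure with SurrogateTest, without an estimator, is an assumption):

using CausalityTools
x = randn(1000)
y = x .^ 2 .+ 0.5 .* randn(1000)                  # nonlinear association
distance_correlation(x, y)                        # raw distance correlation coefficient
independence(SurrogateTest(DistanceCorrelation()), x, y)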

CausalityTools.EmbeddingTEType
EmbeddingTE(; dS = 1, dT = 1, dTf = 1, dC = 1, τS = -1, τT = -1, ηTf = 1, τC = -1)
EmbeddingTE(opt::OptimiseTraditional, s, t, [c])

EmbeddingTE provides embedding parameters for transfer entropy analysis using either TEShannon, TERenyi, or in general any subtype of TransferEntropy, which in turn dictates the embedding used with transferentropy.

The second method finds parameters using the "traditional" optimised embedding techniques from DynamicalSystems.jl.

Convention for generalized delay reconstruction

We use the following convention. Let $s(i)$ be time series for the source variable, $t(i)$ be the time series for the target variable and $c(i)$ the time series for the conditional variable. To compute transfer entropy, we need the following marginals:

\[\begin{aligned} T^{+} &= \{ (t(i+\eta^1), t(i+\eta^2), \ldots, t(i+\eta^{d_{T^{+}}})) \} \\ T^{-} &= \{ (t(i+\tau^0_{T}), t(i+\tau^1_{T}), t(i+\tau^2_{T}), \ldots, t(i + \tau^{d_{T} - 1}_{T})) \} \\ S^{-} &= \{ (s(i+\tau^0_{S}), s(i+\tau^1_{S}), s(i+\tau^2_{S}), \ldots, s(i + \tau^{d_{S} - 1}_{S})) \} \\ C^{-} &= \{ (c(i+\tau^0_{C}), c(i+\tau^1_{C}), c(i+\tau^2_{C}), \ldots, c(i + \tau^{d_{C} - 1}_{C})) \} \end{aligned}\]

Depending on the application, the delay reconstruction lags $\tau^k_{T} \leq 0$, $\tau^k_{S} \leq 0$, and $\tau^k_{C} \leq 0$ may be equally spaced or non-equally spaced. The same applies to the prediction lag(s), but typically only a single prediction lag $\eta^k$ is used (so that $d_{T^{+}} = 1$).

For transfer entropy, traditionally at least one $\tau^k_{T}$, one $\tau^k_{S}$ and one $\tau^k_{C}$ equals zero. This way, the $T^{-}$, $S^{-}$ and $C^{-}$ marginals always contain present/past states, while the $T^{+}$ marginal contains future states relative to the other marginals. However, this is not a strict requirement, and modern approaches that search for optimal embeddings may return embeddings without the instantaneous lag.

Combined, we get the generalized delay reconstruction $\mathbb{E} = (T^{+}_{(d_{T^{+}})}, T^{-}_{(d_{T})}, S^{-}_{(d_{S})}, C^{-}_{(d_{C})})$. Transfer entropy is then computed as

\[\begin{aligned} TE_{S \rightarrow T | C} = \int_{\mathbb{E}} P(T^{+}, T^-, S^-, C^-) \log_{b}{\left(\frac{P(T^{+} | T^-, S^-, C^-)}{P(T^{+} | T^-, C^-)}\right)}, \end{aligned}\]

or, if conditionals are not relevant,

\[\begin{aligned} TE_{S \rightarrow T} = \int_{\mathbb{E}} P(T^{+}, T^-, S^-) \log_{b}{\left(\frac{P(T^{+} | T^-, S^-)}{P(T^{+} | T^-)}\right)}, \end{aligned}\]

Here,

  • $T^{+}$ denotes the $d_{T^{+}}$-dimensional set of vectors furnishing the future states of $T$ (almost always equal to 1 in practical applications),
  • $T^{-}$ denotes the $d_{T}$-dimensional set of vectors furnishing the past and present states of $T$,
  • $S^{-}$ denotes the $d_{S}$-dimensional set of vectors furnishing the past and present of $S$, and
  • $C^{-}$ denotes the $d_{C}$-dimensional set of vectors furnishing the past and present of $C$.

Keyword arguments

  • dS, dT, dC, dTf (f for future) are the dimensions of the $S^{-}$, $T^{-}$, $C^{-}$ and $T^{+}$ marginals. The parameters dS, dT, dC and dTf must each be a positive integer.
  • τS, τT, τC are the embedding lags for $S^{-}$, $T^{-}$, $C^{-}$. Each parameter is an integer ∈ 𝒩⁰⁻, or a vector of integers ∈ 𝒩⁰⁻, so that $S^{-}$, $T^{-}$, $C^{-}$ always represent present/past values. If e.g. τT is an integer, then the $T^-$ marginal is constructed using lags $\tau_{T} = \{0, \tau, 2\tau, \ldots, (d_{T}- 1)\tau_T \}$. If it is a vector, e.g. τΤ = [-1, -5, -7], then the dimension dT must match the lags, and precisely those lags are used: $\tau_{T} = \{-1, -5, -7 \}$.
  • The prediction lag(s) ηTf is a positive integer. Combined with the requirement that the other delay parameters are zero or negative, this ensures that we're always predicting from past/present to future. In typical applications, ηTf = 1 is used for transfer entropy.

Examples

Say we wanted to compute the Shannon transfer entropy $TE^S(S \to T) = I^S(T^+; S^- | T^-)$. Using some modern procedure from DynamicalSystems.jl for determining optimal embedding parameters, we find that the optimal embedding of $T^{-}$ is three-dimensional and is given by the lags [0, -5, -8]. Using the same procedure, we find that the optimal embedding of $S^{-}$ is two-dimensional with lags $[-1, -8]$. We want to predict a univariate version of the target variable one time step into the future (ηTf = 1). The total embedding is then the set of embedding vectors

$E_{TE} = \{ (T(i+1), S(i-1), S(i-8), T(i), T(i-5), T(i-8)) \}$. Translating this to code, we get:

using CausalityTools
EmbeddingTE(dT = 3, τT = [0, -5, -8], dS = 2, τS = [-1, -8], ηTf = 1)

# output
EmbeddingTE(dS=2, dT=3, dC=1, dTf=1, τS=[-1, -8], τT=[0, -5, -8], τC=-1, ηTf=1)
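These parameters would then typically be passed on to a transfer entropy measure. A hypothetical sketch (assuming TEShannon accepts an embedding keyword and that transferentropy takes the measure, an estimator, the source and the target, in that order; FPVP is a lower-level estimator documented further down):

using CausalityTools
s, t = randn(500), randn(500)
em = EmbeddingTE(dT = 3, τT = [0, -5, -8], dS = 2, τS = [-1, -8], ηTf = 1)
te = TEShannon(embedding = em)
transferentropy(te, FPVP(k = 3), s, t)
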
CausalityTools.ExpandingSegmentType
ExpandingSegment <: CrossmapEstimator
ExpandingSegment(; libsizes::Int, rng = Random.default_rng())

Indicates that cross mapping is performed on a contiguous time series segment/window, starting from the first available data point and up to the Lth data point.

If used in an ensemble setting, the estimator is applied to time indices Lmin:step:Lmax of the joint embedding.

CausalityTools.FPVPType
FPVP <: ConditionalMutualInformationEstimator
FPVP(k = 1, w = 0)

The Frenzel-Pompe-Vejmelka-Paluš (or FPVP for short) estimator is used to estimate the differential conditional mutual information using a k-th nearest neighbor approach that is analogous to that of the KraskovStögbauerGrassberger1 mutual information estimator (Frenzel2007; Vejmelka2008).

w is the Theiler window, which controls the number of temporal neighbors that are excluded during neighbor searches.

CausalityTools.GaoKannanOhViswanathType
GaoKannanOhViswanath <: MutualInformationEstimator
GaoKannanOhViswanath(; k = 1, w = 0)

The GaoKannanOhViswanath (Shannon) estimator is designed for estimating mutual information between variables that may be either discrete, continuous or a mixture of both GaoKannanOhViswanath2017.

Explicitly convert your discrete data to floats

Even though the GaoKannanOhViswanath estimator is designed to handle discrete data, our implementation demands that all input data are StateSpaceSets whose data points are floats. If you have discrete data, such as strings or symbols, encode them using integers and convert those integers to floats before passing them to mutualinfo.

Description

The estimator starts by expressing mutual information in terms of the Radon-Nikodym derivative, and then estimates these derivatives using k-nearest neighbor distances from empirical samples.

The estimator avoids the common issue of having to add noise to data before analysis due to tied points, which may bias other estimators. Citing their paper, the estimator "strongly outperforms natural baselines of discretizing the mixed random variables (by quantization) or making it continuous by adding a small Gaussian noise."

Implementation note

In GaoKannanOhViswanath2017, they claim (roughly speaking) that the estimator reduces to the KraskovStögbauerGrassberger1 estimator for continuous-valued data. However, KraskovStögbauerGrassberger1 uses the digamma function, while GaoKannanOhViswanath uses the logarithm instead, so the estimators are not exactly equivalent for continuous data.

Moreover, in their algorithm 1, it is clearly not the case that the method falls back on the KSG1 approach. The KSG1 estimator uses k-th neighbor distances in the joint space, while the GaoKannanOhViswanath algorithm selects the maximum k-th nearest distances among the two marginal spaces, which are in general not the same as the k-th neighbor distance in the joint space (unless both marginals are univariate). Therefore, our implementation here differs slightly from algorithm 1 in GaoKannanOhViswanath. We have modified it in a way that mimics KraskovStögbauerGrassberger1 for continuous data. Note that because of using the log function instead of digamma, there will be slight differences between the methods. See the source code for more details.

See also: mutualinfo.
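Following the note above about converting discrete data to floats, a hypothetical sketch (the mutualinfo(measure, estimator, x, y) argument order is assumed; the encoding dictionary is purely illustrative):

using CausalityTools
x = rand(["a", "b", "c"], 1000)                    # categorical data
y = randn(1000)                                    # continuous data
codes = Dict("a" => 1.0, "b" => 2.0, "c" => 3.0)   # encode categories as floats
xf = [codes[xᵢ] for xᵢ in x]
mutualinfo(MIShannon(), GaoKannanOhViswanath(k = 5), xf, y)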

CausalityTools.GaoOhViswanathType
GaoOhViswanath <: MutualInformationEstimator

The GaoOhViswanath mutual information estimator, also called the bias-improved-KSG estimator, or BI-KSG, by Gao2018, is given by

\[\begin{align*} \hat{H}_{GAO}(X, Y) &= \hat{H}_{KSG}(X) + \hat{H}_{KSG}(Y) - \hat{H}_{KZL}(X, Y) \\ &= \psi{(k)} + \log{(N)} + \log{ \left( \dfrac{c_{d_{x}, 2} c_{d_{y}, 2}}{c_{d_{x} + d_{y}, 2}} \right) } - \\ & \dfrac{1}{N} \sum_{i=1}^N \left( \log{(n_{x, i, 2})} + \log{(n_{y, i, 2})} \right) \end{align*},\]

where $c_{d, 2} = \dfrac{\pi^{\frac{d}{2}}}{\Gamma{(\dfrac{d}{2} + 1)}}$ is the volume of a $d$-dimensional unit $\ell_2$-ball.

CausalityTools.GaussianCMIType
GaussianCMI <: MutualInformationEstimator
GaussianCMI(; normalize::Bool = false)

GaussianCMI is a parametric estimator for Shannon conditional mutual information (CMI) Vejmelka2008.

Description

GaussianCMI estimates Shannon CMI through a sum of two mutual information terms that each are estimated using GaussianMI (the normalize keyword is the same as for GaussianMI):

\[\hat{I}_{Gaussian}(X; Y | Z) = \hat{I}_{Gaussian}(X; Y, Z) - \hat{I}_{Gaussian}(X; Z)\]

CausalityTools.GaussianMIType
GaussianMI <: MutualInformationEstimator
GaussianMI(; normalize::Bool = false)

GaussianMI is a parametric estimator for Shannon mutual information.

Description

Given $d_x$-dimensional and $d_y$-dimensional input data X and Y, GaussianMI first constructs the $d_x + d_y$-dimensional joint StateSpaceSet XY. If normalize == true, then we follow the approach in Vejmelka & Palus (2008)Vejmelka2008 and transform each column in XY to have zero mean and unit standard deviation. If normalize == false, then the algorithm proceeds without normalization.

Next, the correlation matrix $\Sigma$ of the (normalized) joint data XY is computed. The GaussianMI estimate assumes the input variables are distributed according to normal distributions with zero means and unit standard deviations.

The mutual information (for normalize == false) is then estimated as

\[\hat{I}^S_{Gaussian}(X; Y) = \dfrac{1}{2} \log \left( \dfrac{ \det(\Sigma_X) \det(\Sigma_Y) }{\det(\Sigma)} \right),\]

where $\Sigma_X$ and $\Sigma_Y$ appear in $\Sigma$ as

\[\Sigma = \begin{bmatrix} \Sigma_{X} & \Sigma^{'}\\ \Sigma^{'} & \Sigma_{Y} \end{bmatrix}.\]

If normalize == true, then the mutual information is estimated as

\[\hat{I}^S_{Gaussian}(X; Y) = -\dfrac{1}{2} \sum_{i = 1}^{d_x + d_y} \log(\sigma_i),\]

where $\sigma_i$ are the eigenvalues for $\Sigma$.
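A hedged sketch for correlated Gaussian data (the mutualinfo(measure, estimator, x, y) argument order is assumed):

using CausalityTools
x = randn(2000)
y = 0.7 .* x .+ sqrt(1 - 0.7^2) .* randn(2000)   # correlation ≈ 0.7
mutualinfo(MIShannon(), GaussianMI(), x, y)       # compare with -0.5*log(1 - 0.7^2), converted to the chosen base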

CausalityTools.GenericKernelType
GenericKernel <: DifferentialEntropyEstimator
GenericKernel(bandwidth = Silverman(), kernel::MultivariateKernel = NormalIsotropic())

A generic, multivariate plug-in estimator for entropies based on kernel density estimation (KDE) that can in principle be used to compute any differential entropy.

Data should be standardized to zero mean and unit variance before applying GenericKernel.

The bandwidth may be set manually, or to one of the rules of thumb listed below.

Description

Assume we have samples $\{x_1, x_2, \ldots, x_N \}$ from a continuous random variable $X \in \mathbb{R}^d$ with support $\mathcal{X}$ and density function $f : \mathbb{R}^d \to \mathbb{R}$.

GenericKernel estimates, for each $x_i$ in the sample, the point-wise densities $\hat{f}(x_i)$ using the given kernel and bandwidth, then computes a resubstitution estimate for the entropy. We support the following resubstitution estimates.

Shannon differential entropy

\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))]\]

is estimated by replacing the expectation with the sample average (Diks2017)

\[\hat{H}(X) = -\dfrac{1}{N}\sum_{i = 1}^N \log \hat{f}(x_i).\]

Compatible kernels

Bandwidth rule-of-thumbs

Diks2017. Diks, C., & Fang, H. (2017). Transfer entropy for nonparametric granger causality detection: an evaluation of different resampling methods. Entropy, 19(7), 372.

CausalityTools.GraphAlgorithmType
GraphAlgorithm

The supertype of all causal graph inference algorithms.

Concrete implementations

  • OCE. The optimal causation entropy algorithm for time series graphs.
CausalityTools.HMeasureType
HMeasure <: AssociationMeasure
HMeasure(; K::Int = 2, dx = 2, dy = 2, τx = -1, τy = -1, w = 0)

The HMeasure Arnhold1999 is a pairwise association measure. It quantifies the probability with which close states of a target timeseries/embedding are mapped to close states of a source timeseries/embedding.

Note that τx and τy are negative by convention. See docstring for SMeasure for an explanation.

Usage

  • Use with independence to perform a formal hypothesis test for directional dependence.
  • Use with h_measure to compute the raw h-measure statistic.

Description

The HMeasure Arnhold1999 is similar to the SMeasure, but the numerator of the formula is replaced by $R_i(x)$, the mean squared Euclidean distance to all other points, and there is a $\log$-term inside the sum:

\[H^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \log \left( \dfrac{R_i(x)}{R_i^{(k)}(x|y)} \right).\]

Parameters are the same and $R_i^{(k)}(x|y)$ is computed as for SMeasure.
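A minimal sketch of computing the raw statistic (the h_measure(measure, x, y) argument order is assumed):

using CausalityTools
x, y = randn(500), randn(500)
m = HMeasure(K = 3, dx = 2, dy = 2, τx = -1, τy = -1)
h_measure(m, x, y)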

CausalityTools.Henon2Type
Henon2() <: DiscreteDefinition
Henon2(;u₀ = [0.1, 0.2, 0.2, 0.3], c_xy = 2.0)

A bivariate system consisting of two identical 1D Henon maps with unidirectional forcing $X \to Y$ (Krakovská et al., 2018)[Krakovská2018].

Equations of motion

The equations of motion are

\[\begin{align*} x_1(t+1) &= 1.4 - x_1^2(t) + 0.3x_2(t) \\ x_2(t+1) &= x_1(t) \\ y_1(t+1) &= 1.4 - [c_{xy} x_1(t) y_1(t) + (1-c_{xy}) y_1^2(t)] + 0.3 y_2(t) \\ y_2(t+1) &= y_1(t) \end{align*}\]

CausalityTools.Henon3Type
Henon3() <: DiscreteDefinition
Henon3(; a = 0.1, b = 0.3, c = 0.1, xi = [0.1, 0.2, 0.3])

Henon3 is a DiscreteDefinition definition for a lagged discrete dynamical system consisting of three coupled 1D Henon maps Papana2013.

Equations of motion

\[\begin{align*} x_1(t+1) &= a - x_1(t)^2 + b x_1(t-2) \\ x_2(t+1) &= a - c x_1(t) x_2(t)- (1 - c) x_2(t)^2 + b x_2(t-1) \\ x_3(t+1) &= c x_2(t) x_3(t) - (1 - c) x_3(t)^2 + b x_3(t-1) \end{align*}\]

Here $c$ is the coupling constant. The system becomes completely synchronized for $c \geq 0.7$. The initial condition xi is repeated over the first two time steps before iteration starts.

CausalityTools.HilbertType
Hilbert(est;
    source::InstantaneousSignalProperty = Phase(),
    target::InstantaneousSignalProperty = Phase(),
    cond::InstantaneousSignalProperty = Phase()) <: TransferDifferentialEntropyEstimator

Compute transfer entropy on instantaneous phases/amplitudes of relevant signals, which are obtained by first applying the Hilbert transform to each signal, then extracting the phases/amplitudes of the resulting complex numbers Palus2014. Original time series are thus transformed to instantaneous phase/amplitude time series. Transfer entropy is then estimated using the provided est on those phases/amplitudes (use e.g. VisitationFrequency, or SymbolicPermutation).

Info

Details on estimation of the transfer entropy (conditional mutual information) following the phase/amplitude extraction step is not given in Palus (2014). Here, after instantaneous phases/amplitudes have been obtained, these are treated as regular time series, from which transfer entropy is then computed as usual.

See also: Phase, Amplitude.

CausalityTools.HindmarshRose3Type
HindmarshRose3 <: ContinuousDefinition
HindmarshRose3(; xi = [0.1, 0.2, 0.3], p)

Initialise a Hindmarsh-Rose system, which is a model of neuronal spiking.

Description

\[\begin{align*} \dfrac{dx}{dt} &= y + \phi(x) - z + I \\ \dfrac{dy}{dt} &= \psi(x) - y \\ \dfrac{dz}{dt} &= r[s(x - x_R) - z], \end{align*}\]

where

\[\begin{aligned} \phi(x) &= -ax^3+bx^2 \\ \psi(x) &= c - dx^2 \end{aligned}\]

If parameters other than the defaults are to be used, they must be provided as a vector [a, b, c, d, r, s, xᵣ, I].

CausalityTools.Ikeda2Type
Ikeda2 <: DiscreteDefinition
Ikeda2(; xi = [0.19, 0.21], c_xy = 1.0, c_yx = 1.0, a = 0.8, b = 12, c = 0.9,
    r₁ = 0.2, r₂ = 0.15, σ = 0.05, rng = Random.default_rng())

Initialise a discrete two-dimensional Ikeda map system, adapted from Cao1997, by adding a noise term and allowing the influences from $x \to y$ ($c_{xy}$) and from $y \to x$ ($c_{yx}$) to be adjusted.

The difference equations are

\[\begin{align*} x(t+1) &= 1 + \mu(x \cos{(\theta)} - c_{yx} y \sin{(\theta)}) - min( \dfrac{\sigma \xi_{t}^{(1)}}{(1-x)}, \xi_{t}^{(2)}) \\ y(t+1) &= \mu(y \cos{(\theta)} - c_{xy} x \sin{(\theta)}) - min(\dfrac{\sigma \zeta_{t}^{(1)}}{(1-y)}, \zeta_{t}^{(2)}) \end{align*}\]

CausalityTools.InformationMeasureType
InformationMeasure <: AssociationMeasure

The supertype for all definitions of information-based measures.

Why use definitions?

Several information measures, such as mutual information, come in several forms depending on what type of generalized entropy they are defined with respect to. For example, there are at least three forms of Rényi mutual informations.

In CausalityTools.jl, each unique variant of a measure is a subtype of InformationMeasure. For example, MITsallisFuruichi gives the formula for Furuichi (2006)'s Tsallis-based mutual information.

Implemented measures

Mutual informations

  • MIShannon. Discrete Shannon mutual information.
  • MITsallisFuruichi. Discrete Tsallis mutual information, as defined by Furuichi (2006).

Conditional mutual information (CMI)

CausalityTools.JDDTestResultType
JDDTestResult(Δjdd, hypothetical_μ, pvalue)

Holds the results of JointDistanceDistributionTest. Δjdd is the Δ-distribution, hypothetical_μ is the hypothetical mean of the Δ-distribution under the null, and pvalue is the p-value for the one-sided t-test.

CausalityTools.JointDistanceDistributionType
JointDistanceDistribution <: AssociationMeasure
JointDistanceDistribution(; metric = Euclidean(), B = 10, D = 2, τ = -1, μ = 0.0)

The joint distance distribution (JDD) measure Amigo2018.

Usage

  • Use with independence to perform a formal hypothesis test for directional dependence.
  • Use with jdd to compute the joint distance distribution Δ from Amigo2018

Keyword arguments

  • metric::Metric: An instance of a valid distance metric from Distances.jl. Defaults to Euclidean().
  • B::Int: The number of equidistant subintervals to divide the interval [0, 1] into when comparing the normalised distances.
  • D::Int: Embedding dimension.
  • τ::Int: Embedding delay. By convention, τ is negative.
  • μ: The hypothetical mean value of the joint distance distribution if there is no coupling between x and y (default is μ = 0.0).

Description

From input time series $x(t)$ and $y(t)$, we first construct the delay embeddings (note the positive sign in the embedding lags; therefore the input parameter τ is by convention negative).

\[\begin{align*} \{\bf{x}_i \} &= \{(x_i, x_{i+\tau}, \ldots, x_{i+(d_x - 1)\tau}) \} \\ \{\bf{y}_i \} &= \{(y_i, y_{i+\tau}, \ldots, y_{i+(d_y - 1)\tau}) \} \\ \end{align*}\]

The algorithm then proceeds to analyze the distribution of distances between points of these embeddings, as described in Amigo2018.

Examples

CausalityTools.JointDistanceDistributionTestType
JointDistanceDistributionTest <: IndependenceTest
JointDistanceDistributionTest(measure::JointDistanceDistribution; rng = Random.default_rng())

An independence test for two variables based on the JointDistanceDistribution Amigo2018.

When used with independence, a JDDTestResult is returned.

Description

The joint distance distribution (labelled Δ in their paper) is used by Amigó & Hirata (2018) to detect directional couplings of the form $X \to Y$ or $Y \to X$. JointDistanceDistributionTest formulates their method as an independence test.

Formally, we test the hypothesis $H_0$ (the variables are independent) against $H_1$ (there is directional coupling between the variables). To do so, we use a right-sided/upper-tailed t-test to check whether the mean of Δ is skewed towards positive values, i.e.

  • $H_0 := \mu(\Delta) = 0$
  • $H_1 := \mu(\Delta) > 0$.

When used with independence, a JDDTestResult is returned, which contains the joint distance distribution and a p-value. If you only need Δ, use jdd directly.

Examples

This example shows how the JointDistanceDistributionTest can be used in practice.
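A hedged sketch (the toy coupling below is purely illustrative):

using CausalityTools
x = randn(1000)
y = 0.7 .* circshift(x, 2) .+ 0.3 .* randn(1000)   # y is driven by past values of x
measure = JointDistanceDistribution(B = 10, D = 3, τ = -1)
test = JointDistanceDistributionTest(measure)
independence(test, x, y)    # returns a JDDTestResult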

CausalityTools.KraskovStögbauerGrassberger1Type
KSG1 <: MutualInformationEstimator
KraskovStögbauerGrassberger1 <: MutualInformationEstimator
KraskovStögbauerGrassberger1(; k::Int = 1, w = 0, metric_marginals = Chebyshev())

The KraskovStögbauerGrassberger1 mutual information estimator (you can use KSG1 for short) is the $I^{(1)}$ k-th nearest neighbor estimator from Kraskov2004.

Keyword arguments

  • k::Int: The number of nearest neighbors to consider. Only information about the k-th nearest neighbor is actually used.
  • metric_marginals: The distance metric for the marginals can be any metric from Distances.jl. It defaults to metric_marginals = Chebyshev(), which is the same as in Kraskov et al. (2004).
  • w::Int: The Theiler window, which determines if temporal neighbors are excluded during neighbor searches in the joint space. Defaults to 0, meaning that only the point itself is excluded.

Description

Let the joint StateSpaceSet $X := \{\bf{X}_1, \bf{X_2}, \ldots, \bf{X}_m \}$ be defined by the concatenation of the marginal StateSpaceSets $\{ \bf{X}_k \}_{k=1}^m$, where each $\bf{X}_k$ is potentially multivariate. Let $\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N$ be the points in the joint space $X$.

CausalityTools.KraskovStögbauerGrassberger2Type
KSG2 <: MutualInformationEstimator
KraskovStögbauerGrassberger2 <: MutualInformationEstimator
KraskovStögbauerGrassberger2(; k::Int = 1, w = 0, metric_marginals = Chebyshev())

The KraskovStögbauerGrassberger2 mutual information estimator (you can use KSG2 for short) is the $I^{(2)}$ k-th nearest neighbor estimator from Kraskov2004.

Keyword arguments

  • k::Int: The number of nearest neighbors to consider. Only information about the k-th nearest neighbor is actually used.
  • metric_marginals: The distance metric for the marginals can be any metric from Distances.jl. It defaults to metric_marginals = Chebyshev(), which is the same as in Kraskov et al. (2004).
  • w::Int: The Theiler window, which determines if temporal neighbors are excluded during neighbor searches in the joint space. Defaults to 0, meaning that only the point itself is excluded.

Description

Let the joint StateSpaceSet $X := \{\bf{X}_1, \bf{X_2}, \ldots, \bf{X}_m \}$ be defined by the concatenation of the marginal StateSpaceSets $\{ \bf{X}_k \}_{k=1}^m$, where each $\bf{X}_k$ is potentially multivariate. Let $\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N$ be the points in the joint space $X$.

The KraskovStögbauerGrassberger2 estimator first locates, for each $\bf{x}_i \in X$, the point $\bf{n}_i \in X$, the k-th nearest neighbor to $\bf{x}_i$, according to the maximum norm (Chebyshev metric). Let $\epsilon_i$ be the distance $d(\bf{x}_i, \bf{n}_i)$.

Consider $x_i^m \in \bf{X}_m$, the $i$-th point in the marginal space $\bf{X}_m$. For each $\bf{x}_i^m$, we determine $\theta_i^m$ := the number of points $\bf{x}_k^m \in \bf{X}_m$ that are a distance less than $\epsilon_i$ away from $\bf{x}_i^m$. That is, we use the distance from a query point $\bf{x}_i \in X$ (in the joint space) to count neighbors of $x_i^m \in \bf{X}_m$ (in the marginal space).

Mutual information between the variables $\bf{X}_1, \bf{X_2}, \ldots, \bf{X}_m$ is then estimated as

\[\hat{I}_{KSG2}(\bf{X}) = \psi{(k)} - \dfrac{m - 1}{k} + (m - 1)\psi{(N)} - \dfrac{1}{N} \sum_{i = 1}^N \sum_{j = 1}^m \psi{(\theta_i^j + 1)}\]
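A minimal sketch (the mutualinfo(measure, estimator, x, y) argument order is assumed):

using CausalityTools
x = randn(2000)
y = x .+ randn(2000)
mutualinfo(MIShannon(), KSG2(k = 5), x, y)
mutualinfo(MIShannon(), KSG1(k = 5), x, y)   # compare with the KSG1 variant documented above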

CausalityTools.LMeasureType
LMeasure <: AssociationMeasure
LMeasure(; K::Int = 2, dx = 2, dy = 2, τx = -1, τy = -1, w = 0)

The LMeasure Chicharro2009 is a pairwise association measure. It quantifies the probability with which close states of a target timeseries/embedding are mapped to close states of a source timeseries/embedding.

Note that τx and τy are negative by convention. See docstring for SMeasure for an explanation.

Usage

  • Use with independence to perform a formal hypothesis test for directional dependence.
  • Use with l_measure to compute the raw l-measure statistic.

Description

LMeasure is similar to MMeasure, but uses distance ranks instead of the raw distances.

Let $\bf{x_i}$ be an embedding vector, and let $g_{i,j}$ denote the rank of the distance between $\bf{x_i}$ and some other vector $\bf{x_j}$ in an ascending, sorted list of distances between $\bf{x_i}$ and all other vectors $\bf{x_{j \neq i}}$. In other words, $g_{i,j}$ ranks the $N-1$ distances from $\bf{x_i}$ to the remaining vectors.

LMeasure is then defined as

\[L^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \log \left( \dfrac{G_i(x) - G_i^{(k)}(x|y)}{G_i(x) - G_i^k(x)} \right),\]

where $G_i(x) = \frac{N}{2}$ and $G_i^K(x) = \frac{k+1}{2}$ are the mean and minimal rank, respectively.

The $y$-conditioned mean rank is defined as

\[G_i^{(k)}(x|y) = \dfrac{1}{K}\sum_{j=1}^{K} g_{i,w_{i, j}},\]

where $w_{i,j}$ is the index of the $j$-th nearest neighbor of $\bf{y_i}$.

CausalityTools.LaggedDiscreteDefinitionType
LaggedDiscreteDefinition <: SystemDefinition

The supertype of definitions for discrete systems with lag larger than 1.

Why is this type needed? Ideally, an additional definition shouldn't be needed, because we should in principle be able to use DiscreteDynamicalSystem directly for all systems. However, DiscreteDynamicalSystem doesn't work for systems with memory beyond a single time lag. For example, autoregressive systems of order larger than one are not representable using DiscreteDynamicalSystem.

Concrete subtypes of DiscreteDefinition are parameter containers that are passed on to DiscreteDynamicalSystem. They allocate mutable containers that keep track of past states of state variables that require it. Use system to generate a DiscreteDynamicalSystem that can be used with trajectory.

Implementation details

Concrete implementations must fulfill the below criteria.

  • Subtypes must implement a past_states field, which is a SVector{N, MVector{L, Int}}, where N is the number of variables. For type stability, L states are tracked for all variables, even though the maximum lag may only occur for one of the variables.
  • The first type parameter of subtypes must be P, which keeps track of the type of past_states.

For an example, see the source code for Peguin2.

CausalityTools.LeonenkoProzantoSavaniType
LeonenkoProzantoSavani <: DifferentialEntropyEstimator
LeonenkoProzantoSavani(k = 1, w = 0)

The LeonenkoProzantoSavani estimator computes the Shannon, Rényi, or Tsallis entropy using the k-th nearest-neighbor approach from LeonenkoProsantoSavani2008.

w is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to 0, meaning that only the point itself is excluded when searching for neighbours).

CausalityTools.LindnerType
Lindner <: TransferEntropyEstimator
Lindner(k = 1, w = 0, base = 2)

The Lindner transfer entropy estimator Lindner2011, which is also used in the Trentool MATLAB toolbox, and is based on nearest neighbor searches.

w is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to 0, meaning that only the point itself is excluded when searching for neighbours).

The estimator can be used both for pairwise and conditional transfer entropy estimation.

Description

For a given point jᵢ in the joint embedding space, this estimator first computes the distance dᵢ from jᵢ to its k-th nearest neighbor. Then, for each point mₖ[i] in the k-th marginal space, it counts the number of points within radius dᵢ.

The transfer entropy is then computed as

\[TE(X \to Y) = \psi(k) + \dfrac{1}{N} \sum_{i}^n \left[ \sum_{k=1}^3 \left( \psi(m_k[i] + 1) \right) \right],\]

where the index k references the three marginal subspaces T, TTf and ST for which neighbor searches are performed. Here this estimator has been modified to allow for conditioning too (a simple modification to Lindner2011's equation 5 and 6).
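A hedged sketch of pairwise and conditional estimation (the transferentropy(measure, estimator, source, target[, cond]) argument order is assumed):

using CausalityTools
s, t, c = randn(2000), randn(2000), randn(2000)
est = Lindner(k = 5, w = 1)
transferentropy(TEShannon(), est, s, t)       # pairwise TE(s → t)
transferentropy(TEShannon(), est, s, t, c)    # conditional TE(s → t | c)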

CausalityTools.LocalLikelihoodType
LocalLikelihood <: ProbabilitiesEstimator
LocalLikelihood(k = 5, w = 0, metric = Euclidean())

The LocalLikelihood estimator estimates the density around a given query point by a Gaussian kernel informed by the local mean and covariance.

To form probabilities from the pointwise density estimates, the densities are simply sum-normalized to 1.

Outcome space

The outcome_space for LocalLikelihood is the set of input points.

CausalityTools.LocalPermutationTestType
LocalPermutationTest <: IndependenceTest
LocalPermutationTest(measure, [est];
    kperm::Int = 5,
    nshuffles::Int = 100,
    rng = Random.default_rng(),
    replace = true,
    w::Int = 0,
    show_progress = false)

LocalPermutationTest is a generic conditional independence test Runge2018LocalPerm for assessing whether two variables X and Y are conditionally independent given a third variable Z (all of which may be multivariate).

When used with independence, a LocalPermutationTestResult is returned.

Description

This is a generic one-sided hypothesis test that checks whether X and Y are independent (given Z, if provided) based on resampling from a null distribution assumed to represent independence between the variables. The null distribution is generated by repeatedly shuffling the input data in some way that is intended to break any dependence between x and y, but preserve dependencies between x and z.

The algorithm is as follows:

  1. Compute the original conditional independence statistic I(X; Y | Z).
  2. Allocate a scalar valued vector with space for nshuffles elements.
  3. For k ∈ [1, 2, …, nshuffles], repeat
    • For each zᵢ ∈ Z, let nᵢ be the time indices of the kperm nearest neighbors of zᵢ, excluding the w nearest neighbors of zᵢ from the neighbor query (i.e. w is the Theiler window).
    • Let xᵢ⋆ = X[j], where j is randomly sampled from nᵢ with replacement. This way, xᵢ is replaced with xⱼ only if zᵢ ≈ zⱼ (zᵢ and zⱼ are close). Repeat for i = 1, 2, …, n and obtain the shuffled X̂ = [x̂₁, x̂₂, …, x̂ₙ].
    • Compute the conditional independence statistic Iₖ(X̂; Y | Z).
    • Let Î[k] = Iₖ(X̂; Y | Z).
  4. Compute the p-value as count(Î[k] .>= I) / nshuffles.

In addition to the conditional variant from Runge (2018), we also provide a pairwise version, where the shuffling procedure is identical, except that neighbors in Y are used instead of Z, and we compute I(X; Y) and Iₖ(X̂; Y) instead of I(X; Y | Z) and Iₖ(X̂; Y | Z).

Compatible measures

| Measure             | Pairwise | Conditional | Requires est | Note |
|---------------------|----------|-------------|--------------|------|
| PartialCorrelation  | ✗        | ✓           | No           |      |
| DistanceCorrelation | ✗        | ✓           | No           |      |
| CMIShannon          | ✗        | ✓           | Yes          |      |
| TEShannon           | ✓        | ✓           | Yes          | Pairwise tests not possible with TransferEntropyEstimators, only lower-level estimators, e.g. FPVP, GaussianMI or Kraskov |
| PMI                 | ✗        | ✓           | Yes          |      |

The LocalPermutationTest is only defined for conditional independence testing. Exceptions are for measures like TEShannon, which use conditional measures under the hood even for their pairwise variants, and are therefore compatible with LocalPermutationTest.

The nearest-neighbor approach in Runge (2018) can be reproduced by using the CMIShannon measure with the FPVP estimator.

Examples
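A minimal sketch (the toy data are constructed so that x ⫫ y | z; constructors as documented above):

using CausalityTools
x = randn(1000)
z = x .+ randn(1000)
y = z .+ randn(1000)
test = LocalPermutationTest(CMIShannon(), FPVP(k = 10); kperm = 10, nshuffles = 100)
independence(test, x, y, z)    # expect a large p-value, since x and y are independent given z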

CausalityTools.LocalPermutationTestResultType
LocalPermutationTestResult(m, m_surr, pvalue)

Holds the result of a LocalPermutationTest. m is the measure computed on the original data. m_surr is a vector of the measure computed on permuted data, where m_surr[i] is the measure computed on the i-th permutation. pvalue is the one-sided p-value for the test.

CausalityTools.LoftsGaardenType
Loftsgaarden <: ProbabilitiesEstimator

The Loftsgaarden probabilities estimator is based on the k-th nearest neighbor density estimation from Loftsgaarden & Quesenberry (1965).

It estimates probabilities by first estimating the density locally at each sample point xᵢ using the distance from xᵢ to its k-th nearest neighbor. The density distribution over the sample points is then normalized to form probabilities.

Outcome space

The outcome space Ω for LoftsGaarden is the indices of the input data, 1:length(x). The reason to not return the data points themselves is because duplicate data points may not have same probabilities (due to having different neighbors).

CausalityTools.Logistic2BidirType
Logistic2Bidir() <: DiscreteDefinition
Logistic2Bidir(; ui = [0.5, 0.5], c_xy = 0.1, c_yx = 0.1, r₁ = 3.78, r₂ = 3.66,
    σ_xy = 0.05, σ_yx = 0.05,
    rng = Random.default_rng())

A bivariate system consisting of two 1D noisy logistic maps which are bidirectionally interacting Diego2019.

Equations of motion

\[\begin{align*} x(t+1) &= r_1 f_{yx}^{t}(1 - f_{yx}^{t}) \\ y(t+1) &= r_2 f_{xy}^{t}(1 - f_{xy}^{t}) \\ f_{xy}^t &= \dfrac{y(t) + c_{xy}(x(t) + \sigma_{xy} \xi_{xy}^t )}{1 + c_{xy} (1 + \sigma_{xy} )} \\ f_{yx}^t &= \dfrac{x(t) + c_{yx}(y(t) + \sigma_{yx} \xi_{yx}^t )}{1 + c_{yx} (1 + \sigma_{yx} )}, \end{align*}\]

Here, the coupling strength $c_{xy}$ controls how strongly species $x$ influences species $y$, and vice versa for $c_{yx}$. To simulate time-varying influence of unobserved processes, we use the dynamical noise terms $\xi_{xy}^t$ and $\xi_{yx}^t$, drawn from a uniform distribution with support on $[0, 1]$. If $\sigma_{xy} > 0$, then the influence of $x$ on $y$ is masked by dynamical noise equivalent to $\sigma_{xy} \xi_{xy}^{t}$ at the $t$-th iteration of the map, and vice versa for $\sigma_{yx}$.

CausalityTools.Logistic2UnidirType
Logistic2Unidir <: DiscreteDefinition
Logistic2Unidir(; xi = [0.5, 0.5], c_xy = 0.1, σ_xy = 0.05, r₁ = 3.78, r₂ = 3.66,
    rng = Random.default_rng())

A bivariate system consisting of two 1D noisy logistic maps which are unidirectionally coupled x → y Diego2019.

Equations of motion

The equations of motion are

\[\begin{align*} x(t+1) &= r_1 x(t)(1 - x(t)) \\ y(t+1) &= r_2 f(x,y)(1 - f(x,y)), \end{align*}\]

with

\[\begin{aligned} f(x,y) = \dfrac{y + \frac{c_{xy}(x + \sigma_{xy} \xi )}{2}}{1 + \frac{c_{xy}}{2}(1+ \sigma_{xy} )} \end{aligned}\]

The parameter c_xy controls how strong the dynamical forcing is. If σ_xy > 0, dynamical noise masking the influence of x on y, equivalent to $\sigma_{xy} \cdot \xi$, is added at each iteration. Here, $\xi$ is a draw from a flat distribution on $[0, 1]$. Thus, setting σ_xy = 0.05 is equivalent to adding dynamical noise corresponding to a maximum of $5 \%$ of the possible range of values of the logistic map.

CausalityTools.Logistic3CommonDriverType
Logistic3CommonDriver() <: DiscreteDefinition
Logistic3CommonDriver(; u₀ = [0.1, 0.2, 0.3],
    r = 4.0, σx = 0.05, σy = 0.05, σz = 0.05,
    rng = Random.default_rng())

A discrete dynamical system consisting of three coupled 1D logistic maps representing the response of two independent dynamical variables to the forcing from a common driver Runge2018. The dynamical influence goes in the directions $Z \to X$ and $Z \to Y$.

Equations of motion

The equations of motion are

\[\begin{align*} x(t+1) &= (x(t)(r - r x(t) - z(t) + \sigma_x \eta_x)) \mod 1 \\ y(t+1) &= (y(t)(r - r y(t) - z(t) + \sigma_y \eta_y)) \mod 1 \\ z(t+1) &= (z(t)(r - r z(t) + \sigma_z \eta_z)) \mod 1 \end{align*}\]

Dynamical noise may be added to each of the dynamical variables by tuning the parameters σx, σy and σz. The default value of the parameter r is set such that the system exhibits chaotic behaviour, with r = 4.0.

CausalityTools.Logistic4ChainType
Logistic4Chain <: DiscreteDefinition
Logistic4Chain(; xi = rand(4),
    rx = 3.9, ry = 3.6, rz = 3.6, rw = 3.8,
    cxy = 0.4, cyz = 0.4, cyw = 0.35,
    rng = Random.default_rng())

A variant of Logistic2Bidir where four variables X, Y, Z, W are coupled in a chain X → Y → Z → W with dynamical noise.

Description

The equations of motion are

\[\begin{align*} x(t+1) &= r_x x(t)(1 - x(t)) \\ y(t+1) &= r_y f_{xy}^{t}(1 - f_{xy}^{t}) \\ z(t+1) &= r_z f_{yz}^{t}(1 - f_{yz}^{t}) \\ w(t+1) &= r_w f_{zw}^{t}(1 - f_{zw}^{t}) \\ f_{xy}^t &= \dfrac{y(t) + c_{xy}(x(t) + \sigma_{xy} \xi_{xy}^t )}{1 + c_{xy} (1 + \sigma_{xy} )} \\ f_{yz}^t &= \dfrac{z(t) + c_{yz}(y(t) + \sigma_{yz} \xi_{yz}^t )}{1 + c_{yz} (1 + \sigma_{yz} )}, \\ f_{zw}^t &= \dfrac{w(t) + c_{zw}(z(t) + \sigma_{zw} \xi_{zw}^t )}{1 + c_{zw} (1 + \sigma_{zw} )}, \end{align*}\]

where c_{xy}, c_{yz} and c_{zw} control the coupling strengths from x to y, y to z, and z to w, respectively. The magnitude of dynamical noise in these couplings is controlled by $\sigma_{xy}$, $\sigma_{yz}$, and $\sigma_{zw}$, respectively. $\xi_{xy}$, $\xi_{yz}$, and $\xi_{zw}$ are noise terms that are drawn from independent uniform distributions over the unit interval at each iteration.

With default parameters and all dynamical noise terms set to zero, this is the system from Ye2015 (but note that for some initial conditions, this system wanders off to $\pm \infty$ for some of the variables).

CausalityTools.LorenzBidir6Type
LorenzBidir6 <: ContinuousDefinition
LorenzBidir6(; xi = [0.1, 0.05, 0.2, 0.2, 0.25, 0.3],
    c_xy = 0.2, c_yx = 0.2,
    a₁ = 10, a₂ = 28, a₃ = 8/3,
    b₁ = 10, b₂ = 28, b₃ = 9/3)

A bidirectionally coupled Lorenz-Lorenz system, where each subsystem is a 3D Lorenz system (Amigo & Hirata, 2018)[Amigó2018].

Description

The dynamics is generated by the following vector field

\[\begin{align*} \dot{x_1} &= -a_1 (x_1 - x_2) + c_{yx}(y_1 - x_1) \\ \dot{x_2} &= -x_1 x_3 + a_2 x_1 - x_2 \\ \dot{x_3} &= x_1 x_2 - a_3 x_3 \\ \dot{y_1} &= -b_1 (y_1 - y_2) + c_{xy} (x_1 - y_1) \\ \dot{y_2} &= -y_1 y_3 + b_2 y_1 - y_2 \\ \dot{y_3} &= y_1 y_2 - b_3 y_3 \end{align*}\]

Default values for the parameters a₁, a₂, a₃, b₁, b₂, b₃ are as in [Amigó2018].

CausalityTools.LorenzForced9Type
LorenzForced9{V} <: ContinuousDefinition
LorenzForced9(; xi = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
    c_xy = 0.1, c_yx = 0.1,
    c_zx = 0.05, c_zy = 0.05,
    a₁ = 10, a₂ = 28, a₃ = 8/3,
    b₁ = 10, b₂ = 28, b₃ = 8/3,
    c₁ = 10, c₂ = 28, c₃ = 8/3)

A system consisting of two bidirectionally coupled 3D Lorenz systems forced by an external 3D Lorenz system (Amigó & Hirata, 2018).

Description

The dynamics is generated by the following vector field

\[\begin{align*} \dot{x_1} &= - a_1 (x_1 - x_2) + c_{yx}(y_1 - x_1) + c_{zx}(z_1 - x_1) \\ \dot{x_2} &= - x_1 x_3 + a_2 x_1 - x_2 \\ \dot{x_3} &= x_1 x_2 - a_3 x_3 \\ \dot{y_1} &= -b_1 (y_1 - y_2) + c_{xy} (x_1 - y_1) + c_{zy}(z_1 - y_1) \\ \dot{y_2} &= - y_1 y_3 + b_2 y_1 - y_2 \\ \dot{y_3} &= y_1 y_2 - b_3 y_3 \\ \dot{z_1} &= - c_1 (z_1 - z_2) \\ \dot{z_2} &= - z_1 z_3 + c_2 z_1 - z_2 \\ \dot{z_3} &= z_1 z_2 - c_3 z_3 \end{align*}\]

CausalityTools.LorenzTransitive9Type
LorenzTransitive9 <: ContinuousDefinition
LorenzTransitive9(; xi = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9, 0.9],
    σ₁ = 10.0, σ₂ = 10.0, σ₃ = 10.0,
    ρ₁ = 28.0, ρ₂ = 28.0, ρ₃ = 28.0,
    β₁ = 8/3,  β₂ = 8/3,  β₃ = 8.3,
    c₁₂ = 1.0, c₂₃ = 1.0)

Initialise a dynamical system consisting of three coupled Lorenz attractors with a transitive causality chain, where X₁ → X₂ and X₂ → X₃. In total, the three 3D subsystems form a 9-dimensional dynamical system.

The strength of the forcing X₁ → X₂ is controlled by the parameter c₁₂, and the forcing X₂ → X₃ by c₂₃. The remaining parameters are the usual Lorenz-system parameters, where the subscript i refers to the subsystem Xᵢ.

Equations of motion

The dynamics is generated by the following vector field

\[\begin{aligned} \dot{x_1} &= \sigma_1(y_1 - x_1) \\ \dot{y_1} &= \rho_1 x_1 - y_1 - x_1 z_1 \\ \dot{z_1} &= x_1 y_1 - \beta_1 z_1 \\ \dot{x_2} &= \sigma_2 (y_2 - x_2) + c_{12}(x_1 - x_2) \\ \dot{y_2} &= \rho_2 x_2 - y_2 - x_2 z_2 \\ \dot{z_2} &= x_2 y_2 - \beta_2 z_2 \\ \dot{x_3} &= \sigma_3 (y_3 - x_3) + c_{23} (x_2 - x_3) \\ \dot{y_3} &= \rho_3 x_3 - y_3 - x_3 z_3 \\ \dot{z_3} &= x_3 y_3 - \beta_3 z_3 \end{aligned}\]

Usage in literature

This system was studied by Papana et al. (2013) for coupling strengths $c_{12} = 0, 1, 3, 5$ and $c_{23} = 0, 1, 3, 5$.

CausalityTools.MCRType
MCR <: AssociationMeasure
MCR(; r, metric = Euclidean())

An association measure based on mean conditional probabilities of recurrence (MCR) introduced by Romano2007.

r is a mandatory keyword which specifies the recurrence threshold when constructing recurrence matrices. It can be an instance of any subtype of AbstractRecurrenceType from RecurrenceAnalysis.jl. To use any r that is not a real number, you have to do using RecurrenceAnalysis first. The metric is any valid metric from Distances.jl.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise association.
  • Use with mcr to compute the raw MCR for pairwise association.

Description

For input variables X and Y, the conditional probability of recurrence is defined as

\[M(X | Y) = \dfrac{1}{N} \sum_{i=1}^N p(\bf{y_i} | \bf{x_i}) = \dfrac{1}{N} \sum_{i=1}^N \dfrac{\sum_{j=1}^N J_{R_{i, j}}^{X, Y}}{\sum_{j=1}^N R_{i, j}^X},\]

where $R_{i, j}^X$ is the recurrence matrix and $J_{R_{i, j}}^{X, Y}$ is the joint recurrence matrix, constructed using the given metric. The measure $M(Y | X)$ is defined analogously.

Romano2007's interpretation of this quantity is that if X drives Y, then M(X|Y) > M(Y|X), if Y drives X, then M(Y|X) > M(X|Y), and if coupling is symmetric, then M(Y|X) = M(X|Y).

Input data

X and Y can be either both univariate timeseries, or both multivariate StateSpaceSets.
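
A minimal sketch of a typical call (assuming mcr follows the same (measure, x, y) calling convention as the other raw-measure functions in this package; the data below are placeholders):

```julia
using CausalityTools
# Mean conditional recurrence between two noisy, linearly related time series.
x = randn(500)
y = 0.8 .* x .+ 0.2 .* randn(500)
m = MCR(r = 0.5)   # the recurrence threshold r is a mandatory keyword
mcr(m, x, y)       # raw MCR statistic for the pair (x, y)
mcr(m, y, x)       # swap the arguments to probe the other direction
```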

CausalityTools.MIH3Type

The supertype for all H3-type (three-entropies) decompositions of mutual information.

CausalityTools.MIRenyiJizbaType
MIRenyiJizba <: MutualInformation

The Rényi mutual information $I_q^{R_{J}}(X; Y)$ defined in Jizba2012.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with mutualinfo to compute the raw mutual information.

Definition

\[I_q^{R_{J}}(X; Y) = S_q^{R}(X) + S_q^{R}(Y) - S_q^{R}(X, Y),\]

where $S_q^{R}(\cdot)$ and $S_q^{R}(\cdot, \cdot)$ are the Rényi entropy and the joint Rényi entropy, respectively.

CausalityTools.MIRenyiSarbuType
MIRenyiSarbu <: MutualInformation
MIRenyiSarbu(; base = 2, q = 1.5)

The discrete Rényi mutual information from Sarbu2014.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with mutualinfo to compute the raw mutual information.

Description

Sarbu (2014) defines discrete Rényi mutual information as the Rényi $\alpha$-divergence between the joint probability mass function $p(x, y)$ and the product of the marginals, $p(x) \cdot p(y)$:

\[I(X, Y)^R_q = \dfrac{1}{q-1} \log \left( \sum_{x \in X, y \in Y} \dfrac{p(x, y)^q}{\left( p(x)\cdot p(y) \right)^{q-1}} \right)\]

See also: mutualinfo.

CausalityTools.MIShannonType
MIShannon <: MutualInformation
MIShannon(; base = 2)

The Shannon mutual information $I^S(X; Y)$.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with mutualinfo to compute the raw mutual information.

Discrete definition

There are many equivalent formulations of discrete Shannon mutual information. In this package, we currently use the double-sum and the three-entropies formulations.

Double sum formulation

Assume we observe samples $\bar{\bf{X}}_{1:N_x} = \{\bar{\bf{X}}_1, \ldots, \bar{\bf{X}}_{N_x} \}$ and $\bar{\bf{Y}}_{1:N_y} = \{\bar{\bf{Y}}_1, \ldots, \bar{\bf{Y}}_{N_y} \}$ from two discrete random variables $X$ and $Y$ with finite supports $\mathcal{X} = \{ x_1, x_2, \ldots, x_{M_x} \}$ and $\mathcal{Y} = \{ y_1, y_2, \ldots, y_{M_y} \}$. The double-sum estimate is

\[\hat{I}_{DS}(X; Y) = \sum_{x_i \in \mathcal{X}, y_j \in \mathcal{Y}} \hat{p}(x_i, y_j) \log \left( \dfrac{\hat{p}(x_i, y_j)}{\hat{p}(x_i)\hat{p}(y_j)} \right),\]

where $\hat{p}(x_i) = \frac{n(x_i)}{N_x}$, $\hat{p}(y_j) = \frac{n(y_j)}{N_y}$, and $\hat{p}(x_i, y_j) = \frac{n(x_i, y_j)}{N}$, where $N = N_x N_y$. This definition is used by mutualinfo when called with a ContingencyMatrix.

Three-entropies formulation

An equivalent formulation of discrete Shannon mutual information is

\[I^S(X; Y) = H^S(X) + H^S(Y) - H^S(X, Y),\]

where $H^S(\cdot)$ and $H^S(\cdot, \cdot)$ are the marginal and joint discrete Shannon entropies. This definition is used by mutualinfo when called with a ProbabilitiesEstimator.

Differential mutual information

One possible formulation of differential Shannon mutual information is

\[I^S(X; Y) = h^S(X) + h^S(Y) - h^S(X, Y),\]

where $h^S(\cdot)$ and $h^S(\cdot, \cdot)$ are the marginal and joint differential Shannon entropies. This definition is used by mutualinfo when called with a DifferentialEntropyEstimator.

See also: mutualinfo.
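
As a quick illustration (a sketch, not from the original docs), the discrete and differential definitions above correspond to different estimator types passed to mutualinfo:

```julia
using CausalityTools
# Shannon mutual information between two dependent variables.
x = randn(1000)
y = x .+ randn(1000)
# Discrete (three-entropies) estimate with a histogram ProbabilitiesEstimator:
mutualinfo(MIShannon(base = 2), ValueHistogram(RectangularBinning(5)), x, y)
# Estimate with a dedicated nearest-neighbor MutualInformationEstimator:
mutualinfo(MIShannon(base = 2), KSG1(k = 5), x, y)
```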

CausalityTools.MITsallisFuruichiType
MITsallisFuruichi <: MutualInformation
MITsallisFuruichi(; base = 2, q = 1.5)

The discrete Tsallis mutual information from Furuichi (2006)Furuichi2006, which in that paper is called the mutual entropy.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with mutualinfo to compute the raw mutual information.

Description

Furuichi's Tsallis mutual entropy between variables $X \in \mathbb{R}^{d_X}$ and $Y \in \mathbb{R}^{d_Y}$ is defined as

\[I_q^T(X; Y) = H_q^T(X) - H_q^T(X | Y) = H_q^T(X) + H_q^T(Y) - H_q^T(X, Y),\]

where $H_q^T(\cdot)$ and $H_q^T(\cdot, \cdot)$ are the marginal and joint Tsallis entropies, and q is the Tsallis parameter.

See also: mutualinfo.

CausalityTools.MITsallisMartinType
MITsallisMartin <: MutualInformation
MITsallisMartin(; base = 2, q = 1.5)

The discrete Tsallis mutual information from Martin2004.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with mutualinfo to compute the raw mutual information.

Description

Martin et al.'s Tsallis mutual information between variables $X \in \mathbb{R}^{d_X}$ and $Y \in \mathbb{R}^{d_Y}$ is defined as

\[I_{\text{Martin}}^T(X, Y, q) := H_q^T(X) + H_q^T(Y) - (1 - q) H_q^T(X) H_q^T(Y) - H_q^T(X, Y),\]

where $H_q^T(\cdot)$ and $H_q^T(\cdot, \cdot)$ are the marginal and joint Tsallis entropies, and q is the Tsallis parameter.

See also: mutualinfo.

CausalityTools.MMeasureType
MMeasure <: AssociationMeasure
MMeasure(; K::Int = 2, dx = 2, dy = 2, τx = - 1, τy = -1, w = 0)

The MMeasure Andrzejak2003 is a pairwise association measure. It quantifies the probability with which close states of a target timeseries/embedding are mapped to close states of a source timeseries/embedding.

Note that τx and τy are negative by convention. See docstring for SMeasure for an explanation.

Usage

  • Use with independence to perform a formal hypothesis test for directional dependence.
  • Use with m_measure to compute the raw m-measure statistic.

Description

The MMeasure is based on SMeasure and HMeasure. It is given by

\[M^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \log \left( \dfrac{R_i(x) - R_i^{(k)}(x|y)}{R_i(x) - R_i^k(x)} \right),\]

where $R_i(x)$ is computed as for HMeasure, while $R_i^k(x)$ and $R_i^{(k)}(x|y)$ is computed as for SMeasure. Parameters also have the same meaning as for SMeasure/HMeasure.

CausalityTools.MediatedLink9Type
MediatedLink9 <: ContinuousDefinition
MediatedLink9(; xi = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
    ωx = 1.0, ωy = 1.015, ωz = 0.985,
    k = 0.15, l = 0.2, m = 10.0,
    c = 0.06) → ContinuousDynamicalSystem

A three-subsystem dynamical system where X and Y are driven by Z (Krakovská, 2018)[Krakovská2018].

Description

The dynamics is generated by the following vector field

\[\begin{aligned} dx_1 &= -\omega_x x_2 - x_3 + c(z_1 - x_1) \\ dx_2 &= \omega_x x_1 + k x_2 \\ dx_3 &= l + x_3(x_1 - m) \\ dy_1 &= -\omega_y y_2 - y_3 + c(z_1 - y_1) \\ dy_2 &= \omega_y y_1 + k y_2 \\ dy_3 &= l + y_3(y_1 - m) \\ dz_1 &= -\omega_z z_2 - z_3 \\ dz_2 &= \omega_z z_1 + k z_2 \\ dz_3 &= l + z_3(z_1 - m) \end{aligned}\]

At the default value of the coupling constant c = 0.06, the responses X and Y are already synchronized to the driver Z.

CausalityTools.MesnerShaliziType
MesnerShalizi <: ConditionalMutualInformationEstimator
MesnerShalizi(k = 1, w = 0)

The MesnerShalizi estimator is an estimator for conditional mutual information for data that can be mixtures of discrete and continuous data MesnerShalizi2020.

CausalityTools.Nonlinear3Type
Nonlinear3 <: DiscreteDefinition
Nonlinear3(; xi = rand(3),
    σ₁ = 1.0, σ₂ = 1.0, σ₃ = 1.0,
    a₁ = 3.4, a₂ = 3.4, a₃ = 3.4,
    b₁ = 0.4, b₂ = 0.4, b₃ = 0.4,
    c₁₂ = 0.5, c₂₃ = 0.3, c₁₃ = 0.5,
    rng = Random.default_rng())

A 3d nonlinear system with nonlinear couplings $x_1 \to x_2$, $x_2 \to x_3$ and $x_1 \to x_3$. Modified from Gourévitch et al. (2006)[Gourévitch2006].

Equations of motion

\[\begin{aligned} x_1(t+1) &= a_1 x_1 (1-x_1(t))^2 e^{-x_2(t)^2} + 0.4 \xi_{1}(t) \\ x_2(t+1) &= a_1 x_2 (1-x_2(t))^2 e^{-x_2(t)^2} + 0.4 \xi_{2}(t) + b x_1 x_2 \\ x_3(t+1) &= a_3 x_3 (1-x_3(t))^2 e^{-x_3(t)^2} + 0.4 \xi_{3}(t) + c x_{2}(t) + d x_{1}(t)^2. \end{aligned}\]

[Gourévitch2006]: Gourévitch, B., Le Bouquin-Jeannès, R., & Faucon, G. (2006). Linear and nonlinear causality between signals: methods, examples and neurophysiological applications. Biological Cybernetics, 95(4), 349–369.

CausalityTools.OCEType
OCE <: GraphAlgorithm
OCE(; utest::IndependenceTest = SurrogateTest(MIShannon(), KSG2(k = 3, w = 3)),
      ctest::C = LocalPermutationTest(CMIShannon(), MesnerShalizi(k = 3, w = 3)),
      τmax::T = 1, α = 0.05)

The optimal causation entropy (OCE) algorithm for causal discovery Sun2015.

Description

The OCE algorithm has three steps to determine the parents of a variable xᵢ.

  1. Perform pairwise independence tests using utest and select the variable xⱼ(-τ) that has the highest significant (i.e. with associated p-value below α) association with xᵢ(0). Assign it to the set of selected parents P.
  2. Perform conditional independence tests using ctest, finding the parent Pₖ that has the highest association with xᵢ given the already selected parents, and add it to P. Repeat until no more variables with significant association are found.
  3. Backwards elimination of parents Pₖ of xᵢ(0) for which xᵢ(0) ⫫ Pₖ | P - {Pₖ}, where P is the set of parent nodes found in the previous steps.

τmax indicates the maximum lag τ between the target variable xᵢ(0) and its potential parents xⱼ(-τ). Sun et al. 2015's method is based on τmax = 1.

Returns

When used with infer_graph, it returns a vector p, where p[i] are the parents for each input variable. This result can be converted to a SimpleDiGraph from Graphs.jl (see example).

Usage

OCE is used with infer_graph to infer the parents of the input data. Input data must either be a Vector{Vector{<:Real}}, or a StateSpaceSet.

Examples
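
A minimal sketch of the intended workflow (the data here are just placeholders, not from the original docs):

```julia
using CausalityTools
# Infer the lagged parents of each variable with optimal causation entropy.
# With real data, replace these independent noise series with coupled time series.
x, y, z = randn(300), randn(300), randn(300)
alg = OCE(; τmax = 2)                  # default pairwise/conditional tests, lags up to 2
parents = infer_graph(alg, [x, y, z])  # Vector{OCESelectedParents}, one per variable
```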

CausalityTools.OCESelectedParentsType
OCESelectedParents

A simple struct for storing the parents of a single variable xᵢ inferred by the OCE algorithm. When using OCE with infer_graph, a Vector{OCESelectedParents} is returned - one per variable in the input data.

Assumptions and notation

Assumes the input x is a Vector{Vector{<:Real}} or a StateSpaceSet (for which each column is treated as a variable). It contains the following fields, where we use the notation xₖ(τ) to indicate the k-th variable lagged by time-lag τ. For example, x₂(-3) is the variable x[2] lagged by 3 time steps.

Fields

  • i: The index of the target variable (i.e. xᵢ(0) is the target).
  • all_idxs: The possible variable indices of parent variables (i.e. 1:M, where M is the number of input variables).
  • parents_js: The variable indices of the selected parent variables –- one per selected parent.
  • parents_τs: The lags for the selected parent variables –- one per selected parent.
  • parents: A vector containing the raw, time-lagged data for each selected parent variable. Let τ = parents_τs[k] and j = parents_js[k]. Then parents[k] is the raw data for the variable xⱼ(-τ).
CausalityTools.OptimiseTraditionalType
OptimiseTraditional(; η = 1, maxlag = 50, maxdim = 10,
    method = delay_ifnn, dmethod = "mi_min")

Optimize embedding parameters using traditional delay embedding optimization methods with a maximum lag of maxlag and a maximum dimension of maxdim. method can be either delay_ifnn, delay_fnn or delay_f1nn.

CausalityTools.PAType
PA <: CausalityTools.AssociationMeasure
PA(ηT = 1:5, τS = 1, τC = 1)

The modified predictive asymmetry measure (Haaga et al., in revision).

Note

This is an experimental measure. It is part of an ongoing paper submission revision, but is provided here for convenience.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise or conditional directional dependence.
  • Use with asymmetry to compute the raw asymmetry distribution.

Keyword arguments

  • ηT. The prediction lags for the target variable.
  • τS. The embedding delay(s) for the source variable.
  • τC. The embedding delay(s) for the conditional variable(s).

All parameters are given as a single integer or multiple integers greater than zero.

Compatible estimators

PA/asymmetry uses condmutualinfo under the hood. Any estimator that can be used for ConditionalMutualInformation can therefore, in principle, be used with the predictive asymmetry. We recommend using FPVP, or one of the other dedicated conditional mutual information estimators.

| Estimator | Type | Principle |
|---|---|---|
| CountOccurrences | ProbabilitiesEstimator | Frequencies |
| ValueHistogram | ProbabilitiesEstimator | Binning (histogram) |
| Dispersion | ProbabilitiesEstimator | Dispersion patterns |
| Kraskov | DifferentialEntropyEstimator | Nearest neighbors |
| Zhu | DifferentialEntropyEstimator | Nearest neighbors |
| ZhuSingh | DifferentialEntropyEstimator | Nearest neighbors |
| Gao | DifferentialEntropyEstimator | Nearest neighbors |
| Goria | DifferentialEntropyEstimator | Nearest neighbors |
| Lord | DifferentialEntropyEstimator | Nearest neighbors |
| LeonenkoProzantoSavani | DifferentialEntropyEstimator | Nearest neighbors |
| GaussianMI | MutualInformationEstimator | Parametric |
| KSG1 | MutualInformationEstimator | Continuous |
| KSG2 | MutualInformationEstimator | Continuous |
| GaoKannanOhViswanath | MutualInformationEstimator | Mixed |
| GaoOhViswanath | MutualInformationEstimator | Continuous |
| FPVP | ConditionalMutualInformationEstimator | Nearest neighbors |
| MesnerShalizi | ConditionalMutualInformationEstimator | Nearest neighbors |
| Rahimzamani | ConditionalMutualInformationEstimator | Nearest neighbors |

Examples
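
A minimal sketch of a typical call (not from the original docs), using the FPVP estimator recommended above:

```julia
using CausalityTools
# Predictive asymmetry from x to y for prediction lags 1:5.
x = randn(1000)
y = [0.6 * x[t-1] + 0.2 * randn() for t in 2:1000]
ΔA = asymmetry(PA(ηT = 1:5, τS = 1), FPVP(k = 3), x[2:end], y)
```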

CausalityTools.PATestType
PATest <: IndependenceTest
PATest(measure::PA, est)

A test for directional (conditional) dependence based on the modified PA measure, estimated using the given estimator est. Compatible estimators are listed in the docstring for PA.

When used with independence, a PATestResult summary is returned.

Note

This is an experimental test. It is part of an ongoing paper submission revision, but is provided here for convenience.

CausalityTools.PATestResultType
PATestResult(n_vars, ΔA, ttest, pvalue)

Holds the result of a PATest. n_vars is the number of variables used for the test (2 for pairwise, 3 for conditional). ΔA is the distribution of asymmetries, one for each η. ttest is a one-sample t-test, and pvalue is the right-tailed p-value for the test.

CausalityTools.PCType
PC <: GraphAlgorithm
PC(pairwise_test, conditional_test;
    α = 0.05, max_depth = Inf, maxiters_orient = Inf)

The PC algorithm Spirtes2000, named after the first names of its inventors, Peter Spirtes and Clark Glymour, implemented as described in Kalisch2008.

Arguments

Keyword arguments

  • α::Real. The significance level of the test.
  • max_depth. The maximum level of conditional independence tests to be performed. By default, there is no limit (i.e. max_depth = Inf), meaning that the maximum depth is N - 2, where N is the number of input variables.
  • maxiters_orient::Real. The maximum number of times to apply the orientation rules. By default, there is no limit (i.e. maxiters_orient = Inf).
Directional measures will not give meaningful answers

During the skeleton search phase, if a significant association between two nodes is found, then a bidirectional edge is drawn between them. The generic implementation of PC therefore doesn't currently handle directional measures such as TEShannon. The reason is that if a directional relationship X → Y exists between two nodes X and Y, then the algorithm would first draw a bidirectional arrow between X and Y when analysing the direction X → Y, and then remove it again when analysing the direction Y → X (a similar situation would also occur during the conditional stage). This will be fixed in a future release. For now, use nondirectional measures, e.g. MIShannon and CMIShannon!

Description

When used with infer_graph on some input data x, the PC algorithm performs the following steps:

  1. Initialize an empty fully connected graph g with N nodes, where N is the number of variables and x[i] is the data for the i-th node.
  2. Reduce the fully connected g to a skeleton graph by performing pairwise independence tests between all vertices using pairwise_test. Remove any edges where adjacent vertices are found to be independent according to the test (i.e. the null hypothesis of independence cannot be rejected at significance level 1 - α).
  3. Thin the skeleton g by conditional independence testing. If x[i] ⫫ x[j] | x[Z] for some set of variables Z (not including i and j) according to conditional_test (i.e. the null hypothesis of conditional independence cannot be rejected at significance level 1 - α), then the edge between i and j is removed, and we record the separating set S(i, j) = Z. Independence tests are first performed for conditioning sets of size 1, and repeated for conditioning sets of increasing size, which in most cases limits the number of tests needed. The separating sets S(i, j), which records which variables were in the conditioning set that rendered variables i and j independent, are recorded. If max_depth is an integer, then this procedure is performed on conditioning sets of sizes 1:max_depth, and if max_depth == nothing, then all possible conditioning set sizes are potentially used.
  4. Create a directed graph dg from g by replacing every undirected edge X - Y in g by the bidirectional edge X ↔ Y (i.e. construct two directional edges X → Y and Y → X). Orientiation rules 0-3 are then repeatedly applied to dg until no more edges can be oriented:
    • Rule 0 (orients v-structures): X ↔ Y ↔ Z becomes X → Y ← Z if Y is not in the separating set S(X, Z).
    • Rule 1 (prevents new v-structures): X → Y ↔ Z becomes X → Y → Z if X and Z are not adjacent.
    • Rule 2 (avoids cycles): X → Y → Z ↔ X becomes X → Y → Z ← X.
    • Rule 3: To avoid creating cycles or new v-structures, whenever X - Y → Z, X - W → Z, and X - Z but there is no edge between Y and W, turn the undirected X - Z edge into the directed edge X → Z.

The resulting directed graph (a SimpleDiGraph from Graphs.jl) is then returned.

Examples
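
A minimal sketch of a typical call (assumed input conventions as for the other graph algorithms; not from the original docs):

```julia
using CausalityTools
# Infer a directed graph among three chained variables using nondirectional measures,
# as recommended in the note above.
x = randn(500)
y = x .+ randn(500)
z = y .+ randn(500)
pairwise_test = SurrogateTest(MIShannon(), KSG2(k = 5))
conditional_test = LocalPermutationTest(CMIShannon(), FPVP(k = 5))
alg = PC(pairwise_test, conditional_test; α = 0.05)
g = infer_graph(alg, [x, y, z])   # a SimpleDiGraph from Graphs.jl
```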

CausalityTools.PMIType
PMI <: AssociationMeasure
PMI(; base = 2)

The partial mutual information (PMI) measure of association Zhao2016.

Definition

PMI is defined for variables $X$, $Y$ and $Z$ as

\[PMI(X; Y | Z) = D(p(x, y, z) || p^{*}(x|z) p^{*}(y|z) p(z)),\]

where $p(x, y, z)$ is the joint distribution for $X$, $Y$ and $Z$, and $D(\cdot, \cdot)$ is the extended Kullback-Leibler divergence from $p(x, y, z)$ to $p^{*}(x|z) p^{*}(y|z) p(z)$. See Zhao et al. (2016) for details.

Estimation

PMI can be estimated using any ProbabilitiesEstimator that implements marginal_encodings. This allows estimation of 3D contingency matrices, from which relevant probabilities for the PMI formula are extracted. See also pmi.
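
For example (a sketch assuming pmi follows the same (measure, est, x, y, z) calling convention as the other measure functions in this package):

```julia
using CausalityTools
# Partial mutual information of x and y given z, with an ordinal-pattern discretization.
x, y, z = randn(1000), randn(1000), randn(1000)
pmi(PMI(base = 2), SymbolicPermutation(m = 3), x, y, z)
```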

Properties

For the discrete case, the following identities hold in theory (when estimating PMI, they may not).

  • PMI(X, Y, Z) >= CMI(X, Y, Z) (where CMI is the Shannon CMI). Holds in theory, but when estimating PMI, the identity may not hold.
  • PMI(X, Y, Z) >= 0. Holds both in theory and for estimation using ProbabilitiesEstimators.
  • X ⫫ Y | Z => PMI(X, Y, Z) = CMI(X, Y, Z) = 0 (in theory, but not necessarily for estimation).
CausalityTools.PairwiseAsymmetricInferenceType
PairwiseAsymmetricInference <: CrossmapMeasure
PairwiseAsymmetricInference(; d::Int = 2, τ::Int = -1, w::Int = 0,
    f = Statistics.cor, embed_warn = true)

The pairwise asymmetric inference (PAI) cross mapping measure McCracken2014 is a version of ConvergentCrossMapping that searches for neighbors in mixed embeddings (i.e. both source and target variables are included); otherwise, the algorithms are identical.

Specifies embedding dimension d, embedding lag τ to be used, as described below, with predict or crossmap. The Theiler window w controls how many temporal neighbors are excluded during neighbor searches (w = 0 means that only the point itself is excluded). f is a function that computes the agreement between observations and predictions (the default, f = Statistics.cor, gives the Pearson correlation coefficient).

Embedding

There are many possible ways of defining the embedding for PAI. Currently, we only implement the "add one non-lagged source timeseries to an embedding of the target" approach, which is used as an example in McCracken & Weigel's paper. Specifically: Let S(i) be the source time series variable and T(i) be the target time series variable. PairwiseAsymmetricInference produces regular embeddings with fixed dimension d and embedding lag τ as follows:

\[(S(i), T(i+(d-1)\tau), \ldots, T(i+2\tau), T(i+\tau), T(i))_{i=1}^{N-(d-1)\tau}.\]

In this joint embedding, neighbor searches are performed in the subspace spanned by the first D variables, while the last variable is to be predicted.

With this convention, τ < 0 implies "past/present values of source used to predict target", and τ > 0 implies "future/present values of source used to predict target". The latter case may not be meaningful for many applications, so by default, a warning will be given if τ > 0 (embed_warn = false turns off warnings).

CausalityTools.PalType
Pal <: DifferentialEntropyEstimator
Pal(; k = 1, w = 0, p = 1.0, n::Int = 10000)

A Shannon and Rényi differential entropy estimator (Pál et al., 2010).

w is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to 0, meaning that only the point itself is excluded when searching for neighbours).

Description

Pál et al. (2010)'s estimator is based on generalized nearest neighbor graphs. It is similar to several other kNN-based estimators (e.g. LeonenkoProzantoSavani). Given samples $\bf{X}_{1:n} = \{\bf{X}_1, \bf{X}_2, \ldots, \bf{X}_n \}$, where $\bf{X}_i \in \mathbb{R}^d$, from some distribution $\mu$ over $\mathbb{R}^d$ with density function $f$, it approximates the Rényi differential entropy

\[h_q^R(\bf{X}) = \dfrac{1}{1-q} \log \int_{\mathbb{R}^d} f^q(\bf{x}) d\bf{x}\]

using the estimator

\[\hat{H}_q^R(\bf{X_{1:n}}) = \dfrac{1}{1-q}\log \left( \dfrac{L_p(\bf{X}_{1:n})}{\gamma n^{1-p/d}} \right),\]

where $L_p(\bf{X}_{1:n})$ is the sum of the p-th powers of the Euclidean lengths of the edges of the nearest neighbor graph over $\bf{X}_{1:n}$ (see their paper for details).

The constant $\gamma$ is determined by the limit given in equation 4 in Pàl et al. (2010), and is approximated on n randomly generated points from the d-dimensional unit cube, as they describe in the end of section 3 of their paper.

CausalityTools.ParametricCopulaType
ParametricCopula <: MutualInformationEstimator
ParametricCopula(d = Normal())

A parametric copula-based mutual information estimator.

Robin et al. (2016). A statistical framework for neuroimaging data analysis based on mutual information estimated via a Gaussian copula.

CausalityTools.PartialCorrelationType
PartialCorrelation <: AssociationMeasure

The correlation of two variables, with the effect of a set of conditioning variables removed.

Usage

  • Use with independence to perform a formal hypothesis test for conditional dependence.
  • Use with partial_correlation to compute the raw correlation coefficient.

Description

There are several ways of estimating the partial correlation. We follow the matrix inversion method, because for StateSpaceSets, we can very efficiently compute the required joint covariance matrix $\Sigma$ for the random variables.

Formally, let $X_1, X_2, \ldots, X_n$ be a set of $n$ real-valued random variables. Consider the joint precision matrix $P = (p_{ij}) = \Sigma^{-1}$. The partial correlation of any pair of variables $(X_i, X_j)$, given the remaining variables $\bf{Z} = \{X_k\}_{k=1, k \neq i, j}^n$, is defined as

\[\rho_{X_i X_j | \bf{Z}} = -\dfrac{p_{ij}}{\sqrt{ p_{ii} p_{jj} }}.\]

In practice, we compute the estimate

\[\hat{\rho}_{X_i X_j | \bf{Z}} = -\dfrac{\hat{p}_{ij}}{\sqrt{ \hat{p}_{ii} \hat{p}_{jj} }},\]

where $\hat{P} = \hat{\Sigma}^{-1}$ is the sample precision matrix.
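
The matrix-inversion route is easy to sketch outside the package API; the helper below is hypothetical and only illustrates the formula above.

```julia
using LinearAlgebra, Statistics
# Partial correlation of columns i and j of a data matrix X (columns = variables),
# conditioned on all remaining columns, via the sample precision matrix P̂ = Σ̂⁻¹.
function partial_cor_sketch(X::AbstractMatrix, i::Integer, j::Integer)
    P = inv(cov(X))
    return -P[i, j] / sqrt(P[i, i] * P[j, j])
end

X = randn(1000, 4)
partial_cor_sketch(X, 1, 2)
```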

CausalityTools.ParzenType
Parzen <: MultivariateKernel
Parzen(h)

The Parzen kernel. For a given d-dimensional StateSpaceSet x and a query point xᵢ, it computes the number of points within a hypercube of radius h centered on xᵢ.

To use it, do p = Parzen(0.2); p(x, xᵢ).

CausalityTools.PearsonCorrelationType
PearsonCorrelation

The Pearson correlation of two variables.

Usage

Description

The sample Pearson correlation coefficient for real-valued random variables $X$ and $Y$ with associated samples $\{x_i\}_{i=1}^N$ and $\{y_i\}_{i=1}^N$ is defined as

\[\rho_{xy} = \dfrac{\sum_{i=1}^N (x_i - \bar{x})(y_i - \bar{y}) }{\sqrt{\sum_{i=1}^N (x_i - \bar{x})^2}\sqrt{\sum_{i=1}^N (y_i - \bar{y})^2}},\]

where $\bar{x}$ and $\bar{y}$ are the means of the observations $x_i$ and $y_i$, respectively.

CausalityTools.PecuzalType
Pecuzal(; dmax = 3, τmax = 30, w = 0)

Used with optimize_embedding to find the optimal embedding lags using the Pecuzal algorithm. w is the Theiler window. dmax is the maximum dimension and we query lags 1:τmax.

CausalityTools.Peguin2Type
Peguin2 <: DiscreteDefinition
Peguin2(; xi = [0.5, 0.4], σ₁ = 0.1, σ₂ = 0.1,
    p₁ = 0.7, p₂ = 0.1, p₃ = 0.4, p₄ = 2.4, p₅ = 0.9, p₆ = 4)

A 2D discrete autoregressive system with nonlinear, nontrivial coupling. This system is from Péguin-Feissolle & Teräsvirta (1999)[^Péguin-Feissolle1999], and was also studied in Chávez et al. (2003)[Chávez2003].

Description

The system is defined by the equations

\[\begin{align*} x(t+1) &= p_2 + p_3 x(t-2) + c_{yx}\dfrac{p_4 - p_5 y(t-3)}{1 + e^{-p_6 y(t-3)}} + \xi_1(t) \\ y(t+1) &= p_1 y(t) + \xi_2(t). \end{align*}\]

Here, $\xi_{1,2}(t)$ are two independent normally distributed noise processes with zero mean and standard deviations $\sigma_1$ and $\sigma_2$. The $\xi_{1,2}(t)$ terms represent dynamical noise. The parameters of the original system are here tweakable.

[^Péguin-Feissolle1999]: Péguin-Feissolle, A., & Teräsvirta, T. (1999). A General Framework for Testing the Granger Noncausaality Hypothesis. Universites d’Aix-Marseille II et III. https://www.amse-aixmarseille.fr/sites/default/files/_dt/greqam/99a42.pdf

CausalityTools.PhaseType
Phase <: InstantaneousSignalProperty

Indicates that the instantaneous phases of a signal should be used.

CausalityTools.PoczosSchneiderCMIType
PoczosSchneiderCMI <: ConditionalMutualInformationEstimator
PoczosSchneiderCMI(k = 1, w = 0)

The PoczosSchneiderCMI estimator computes various (differential) conditional mutual informations, using a k-th nearest neighbor approach (Póczos & Schneider, 2012)[Póczos2012].

CausalityTools.PredictiveAsymmetryType
PredictiveAsymmetry <: AssociationMeasure
PredictiveAsymmetry(ηs = 1:15; normalize = false, f = 1.0,
    dTf = 1, dT = 1, dS = 1, τT = -1, τS = -1, base = 2)

The predictive asymmetry measure Haaga2020.

Experimental!

This is a method that does not yet appear in a peer-reviewed scientific journal. Feel free to use, but consider it experimental for now.

CausalityTools.RMCDType
RMCD <: AssociationMeasure
RMCD(; r, metric = Euclidean(), base = 2)

The recurrence measure of conditional dependence, or RMCD Ramos2017, is a recurrence-based measure that mimics the conditional mutual information, but uses recurrence probabilities.

r is a mandatory keyword which specifies the recurrence threshold when constructing recurrence matrices. It can be instance of any subtype of AbstractRecurrenceType from RecurrenceAnalysis.jl. To use any r that is not a real number, you have to do using RecurrenceAnalysis first. The metric is any valid metric from Distances.jl.

Both the pairwise and conditional RMCD are non-negative, but due to round-off error, negative values may occur. If that happens, an RMCD value of 0.0 is returned.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise or conditional association.
  • Use with rmcd to compute the raw RMCD for pairwise or conditional association.

Description

The RMCD measure is defined by

\[I_{RMCD}(X; Y | Z) = \dfrac{1}{N} \sum_{i} \left[ \dfrac{1}{N} \sum_{j} R_{ij}^{X, Y, Z} \log \left( \dfrac{\sum_{j} R_{ij}^{X, Y, Z} \sum_{j} R_{ij}^{Z} }{\sum_{j} R_{ij}^{X, Z} \sum_{j} R_{ij}^{Y, Z}} \right) \right],\]

where base controls the base of the logarithm. $I_{RMCD}(X; Y | Z)$ is zero when $Z = X$, $Z = Y$ or when $X$, $Y$ and $Z$ are mutually independent.

Our implementation allows dropping the third/last argument, in which case the following mutual information-like quantity is computed (not discussed in Ramos2017):

\[ I_{RMCD}(X; Y) = \dfrac{1}{N} \sum_{i} \left[ \dfrac{1}{N} \sum_{j} R_{ij}^{X, Y} \log \left( \dfrac{\sum_{j} R_{ij}^{X} R_{ij}^{Y} }{\sum_{j} R_{ij}^{X, Y}} \right) \right]\]
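
A minimal sketch of typical calls (assuming rmcd follows the (measure, x, y, [z]) convention of the other raw-measure functions; the data below are placeholders):

```julia
using CausalityTools
x, y, z = randn(400), randn(400), randn(400)
measure = RMCD(r = 0.5)    # the recurrence threshold r is a mandatory keyword
rmcd(measure, x, y, z)     # conditional version
rmcd(measure, x, y)        # pairwise, mutual information-like version
```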

CausalityTools.RahimzamaniType
Rahimzamani <: ConditionalMutualInformationEstimator
Rahimzamani(k = 1, w = 0)

The Rahimzamani estimator, short for Rahimzamani-Asnani-Viswanath-Kannan, is an estimator for Shannon conditional mutual information for data that can be mixtures of discrete and continuous data Rahimzamani2018.

This is very similar to the GaoKannanOhViswanath mutual information estimator, but has been expanded to the conditional case.

CausalityTools.RandomSegmentType
RandomSegment <: CrossmapEstimator
RandomSegment(; libsizes::Int, rng = Random.default_rng())

Indicates that cross mapping is performed on contiguous time series segments/windows of length L with a randomly selected starting point.

This is method 2 from Luo2015.

CausalityTools.RandomVectorsType
RandomVectors <: CrossmapEstimator
RandomVectors(; libsizes, replace = false, rng = Random.default_rng())

Cross-map over N different libraries, where N = length(libsizes), and the i-th library has cardinality k = libsizes[i]. Points within each library are randomly selected, independently of other libraries, and replace controls whether or not to sample with replacement. A user-specified rng may be specified for reproducibility.

This is method 3 from Luo2015.

See also: CrossmapEstimator.

CausalityTools.Repressilator6Type
Repressilator6 <: ContinuousDefinition
Repressilator6(; xi = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6], α = 10.0, α₀ = 0.0, β = 100.0,
    n = 2) → ContinuousDynamicalSystem

A six-dimensional repressilator (or repression-driven oscillator) from Elowitz2000.

Used in Sun2014 to study the performance of the causation entropy algorithm.

Description

\[\begin{align*} \dfrac{dm_1}{dt} &= -m_1 + \dfrac{\alpha}{1 + p_3^n} + \alpha_0 \\ \dfrac{dm_2}{dt} &= -m_2 + \dfrac{\alpha}{1 + p_1^n} + \alpha_0 \\ \dfrac{dm_3}{dt} &= -m_3 + \dfrac{\alpha}{1 + p_2^n} + \alpha_0 \\ \dfrac{dp_1}{dt} &= -\beta(p_1 - m_1) \\ \dfrac{dp_2}{dt} &= -\beta(p_2 - m_2) \\ \dfrac{dp_3}{dt} &= -\beta(p_3 - m_3) \end{align*}\]

CausalityTools.RosslerBidir6Type
RosslerBidir6 <: ContinuousDefinition
RosslerBidir6(; xi = [0.1, 0.1, 0.2, 0.3, 0.3, 0.4],
    a = 0.1, b = 0.1, c = 14.0, ϵ₁ = 0.0, ϵ₂ = 0.0,
    ω₁ = 1 + 0.015, ω₂ = 1 - 0.015)

A bidirectionally coupled 6D Rossler system from Krakovská et al. (2018)[Krakovská2018].

Description

The system consists of two separate subsystems, each being a 3D Rössler attractor. The subsystems are bidirectionally coupled, influencing each other through variables $x_1$ and $x_2$.

\[\begin{align*} \dfrac{dx_1}{dt} &= \omega_1 (-y_1) - z_1 + c_{21}(x_1 - x_2) \\ \dfrac{dy_1}{dt} &= \omega_1 x_1 + a y_1 \\ \dfrac{dz_1}{dt} &= b + z_1 (x_1 - c) \\ \dfrac{dx_2}{dt} &= \omega_2 (-y_2) - z_2 + c_{12} (x_2 - x_1) \\ \dfrac{dy_2}{dt} &= \omega_2 x_2 + a y_2 \\ \dfrac{dz_2}{dt} &= b + z_2 (x_2 - c) \end{align*}\]

with $c_{12} \geq 0$ and $c_{21} \geq 0$.

CausalityTools.RosslerForced9Type
RosslerForced9 <: ContinuousDefinition
RosslerForced9(; xi = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
    ω₁ = 1.015, ω₂ = 0.985, ω₃ = 0.95,
    c_xy = 0.1, c_yx = 0.1,
    c_zx = 0.05, c_zy = 0.05,
    a₁ = 0.15, a₂ = 0.2, a₃ = 10,
    b₁ = 0.15, b₂ = 0.2, b₃ = 10,
    c₁ = 0.15, c₂ = 0.2, c₃ = 10)

Equations of motion for a system consisting of three coupled 3D Rössler systems ($X$, $Y$, $Z$), giving a 9D system (Amigó & Hirata, 2018)[Amigó2018].

Description

The dynamics is generated by the following vector field

\[\begin{aligned} \dot{x_1} &= -\omega_1 (x_2 + x_3) + c_{yx}(y_1 - x_1) + c_{zx}(z_1 - x_1) \\ \dot{x_2} &= \omega_1 x_1 + a_1 x_2 \\ \dot{x_3} &= a_2 + x_3 (x_1 - a_3) \\ \dot{y_1} &= -\omega_2 (y_2 + y_3) + c_{xy}(x_1 - y_1) + c_{zy}(z_1 - y_1) \\ \dot{y_2} &= \omega_2 y_1 + b_1 y_2 \\ \dot{y_3} &= b_2 + y_3 (y_1 - b_3) \\ \dot{z_1} &= -\omega_3 (z_2 + z_3) \\ \dot{z_2} &= \omega_3 z_1 + c_1 z_2 \\ \dot{z_3} &= c_2 + z_3 (z_1 - c_3). \end{aligned}\]

The external system $Z$ influences both $X$ and $Y$ (controlled by c_zx and c_zy). Simultaneously, the subsystems $X$ and $Y$ bidirectionally influences each other (controlled by c_xy and c_yx). The $X$ and $Y$ subsystems are mostly synchronized for c_xy > 0.1 or c_yx > 0.1.

CausalityTools.RosslerLorenzUnidir6Type
RosslerLorenzUnidir6 <: ContinuousDefinition
RosslerLorenzUnidir6(; xi = [0.1, 0.2, 0.3, 0.05, 0.1, 0.15],
    a₁ = 6, a₂ = 6, a₃ = 2.0, b₁ = 10, b₂ = 28, b₃ = 8/3, c_xy = 1.0)

Initialise a Rössler-Lorenz system consisting of two independent 3D subsystems: one Rössler system and one Lorenz system. They are coupled such that the second component (x₂) of the Rössler system unidirectionally forces the second component (y₂) of the Lorenz system.

The parameter c_xy controls the coupling strength. The implementation here also allows for tuning the parameters of each subsystem by introducing the constants a₁, a₂, a₃, b₁, b₂, b₃. Default values for these parameters are as in [1].

Equations of motion

The dynamics is generated by the following vector field

\[\begin{align*} \dot x_1 &= -a_1(x_2 + x_3) \\ \dot x_2 &= a_2(x_1 + a_2x_2) \\ \dot x_3 &= a_1(a_2 + x_3(x_1 - a_3)) \\ \dot y_1 &= b_1(y_2 - y_1) \\ \dot y_2 &= y_1(b_2 - y_3) - y_2 +c_{xy}(x_2)^2 \\ \dot y_3 &= y_1 y_2 - b_3y_3 \end{align*}\]

with the coupling constant $c_{xy} \geq 0$.

CausalityTools.SMeasureType
SMeasure <: AssociationMeasure
SMeasure(; K::Int = 2, dx = 2, dy = 2, τx = - 1, τy = -1, w = 0)

SMeasure is a bivariate association measure from Arnhold1999 and Quiroga2000 that measures directional dependence between two input (potentially multivariate) time series.

Note that τx and τy are negative; see explanation below.

Usage

  • Use with independence to perform a formal hypothesis test for directional dependence.
  • Use with s_measure to compute the raw s-measure statistic.

Description

The steps of the algorithm are:

  1. From input time series $x(t)$ and $y(t)$, construct the delay embeddings (note the positive sign in the embedding lags; therefore the input parameters τx and τy are by convention negative).

\[\begin{align*} \{\bf{x}_i \} &= \{(x_i, x_{i+\tau_x}, \ldots, x_{i+(d_x - 1)\tau_x}) \} \\ \{\bf{y}_i \} &= \{(y_i, y_{i+\tau_y}, \ldots, y_{i+(d_y - 1)\tau_y}) \} \\ \end{align*}\]

  2. Let $r_{i,j}$ and $s_{i,j}$ be the indices of the K-th nearest neighbors of $\bf{x}_i$ and $\bf{y}_i$, respectively. Neighbors closer than w time indices are excluded during searches (i.e. w is the Theiler window).

  3. Compute the mean squared Euclidean distance to the $K$ nearest neighbors for each $x_i$, using the indices $r_{i, j}$.

\[R_i^{(k)}(x) = \dfrac{1}{k} \sum_{j=1}^{k}(\bf{x}_i - \bf{x}_{r_{i,j}})^2\]

  4. Compute the y-conditioned mean squared Euclidean distance to the $K$ nearest neighbors for each $x_i$, now using the indices $s_{i,j}$.

\[R_i^{(k)}(x|y) = \dfrac{1}{k} \sum_{j=1}^{k}(\bf{x}_i - \bf{x}_{s_{i,j}})^2\]

  5. Define the following measure of independence, where $0 \leq S \leq 1$, and low values indicate independence and values close to one occur for synchronized signals.

\[S^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \dfrac{R_i^{(k)}(x)}{R_i^{(k)}(x|y)}\]

Input data

The algorithm is slightly modified from Grassberger1999 to allow univariate timeseries as input.

  • If x and y are StateSpaceSets then use x and y as is and ignore the parameters dx/τx and dy/τy.
  • If x and y are scalar time series, then create dx and dy dimensional embeddings, respectively, of both x and y, resulting in N different m-dimensional embedding points $X = \{x_1, x_2, \ldots, x_N \}$ and $Y = \{y_1, y_2, \ldots, y_N \}$. τx and τy control the embedding lags for x and y.
  • If x is a scalar-valued vector and y is a StateSpaceSet, or vice versa, then create an embedding of the scalar timeseries using parameters dx/τx or dy/τy.

In all three cases, input StateSpaceSets are length-matched by eliminating points at the end of the longest StateSpaceSet (after the embedding step, if relevant) before analysis.

CausalityTools.SilvermanType
Silverman <: BandwidthRule
Silverman()

A rule of thumb giving the bandwidth $h$ for kernel density estimation of entropy, following Silverman's rule of thumb.

CausalityTools.SurrogateTestType
SurrogateTest <: IndependenceTest
SurrogateTest(measure, [est];
    nshuffles::Int = 100,
    surrogate = RandomShuffle(),
    rng = Random.default_rng(),
    show_progress = false,
)

A generic (conditional) independence test for assessing whether two variables X and Y are independent, potentially conditioned on a third variable Z, based on surrogate data.

When used with independence, a SurrogateTestResult is returned.

Description

This is a generic one-sided hypothesis test that checks whether x and y are independent (given z, if provided) based on resampling from a null distribution assumed to represent independence between the variables. The null distribution is generated by repeatedly shuffling the input data in some way that is intended to break any dependence between the input variables.

There are different ways of shuffling, dictated by surrogate, each representing a distinct null hypothesis. For each shuffle, the provided measure is computed (using est, if relevant). This procedure is repeated nshuffles times, and a test summary is returned. The shuffled variable is always the first variable (X). Exceptions are:

  • If the measure is a TransferEntropy measure, such as TEShannon, then the source variable is always shuffled, and the target and conditional variables are left unshuffled.

Compatible measures

| Measure | Requires est |
|---|---|
| PearsonCorrelation | No |
| DistanceCorrelation | No |
| SMeasure | No |
| HMeasure | No |
| MMeasure | No |
| LMeasure | No |
| PairwiseAsymmetricInference | Yes |
| ConvergentCrossMapping | Yes |
| MIShannon | Yes |
| MIRenyiJizba | Yes |
| MIRenyiSarbu | Yes |
| MITsallisMartin | Yes |
| MITsallisFuruichi | Yes |
| PartialCorrelation | Yes |
| CMIShannon | Yes |
| CMIRenyiJizba | Yes |
| TEShannon | Yes |
| TERenyiJizba | Yes |
| PMI | Yes |
Examples
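
A minimal sketch (not from the original docs): testing pairwise independence with mutual information and random-shuffle surrogates.

```julia
using CausalityTools
x = randn(1000)
y = 0.5 .* x .+ randn(1000)
test = SurrogateTest(MIShannon(), KSG1(k = 5); nshuffles = 200)
result = independence(test, x, y)   # SurrogateTestResult; its summary includes the p-value
```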

CausalityTools.SurrogateTestResultType
SurrogateTestResult(m, m_surr, pvalue)

Holds the result of a SurrogateTest. m is the measure computed on the original data. m_surr is a vector of the measure computed on permuted data, where m_surr[i] is the measure computed on the i-th permutation. pvalue is the one-sided p-value for the test.

CausalityTools.SymbolicTransferEntropyType
SymbolicTransferEntropy <: TransferEntropyEstimator
SymbolicTransferEntropy(; m = 3, τ = 1, lt = ComplexityMeasures.isless_rand)

A convenience estimator for symbolic transfer entropy Staniek2008.

Description

Symbolic transfer entropy consists of two simple steps. First, the input time series are embedded with embedding dimension m and embedding lag τ. The ordinal patterns of the embedding vectors are then encoded using SymbolicPermutation with marginal_encodings. This transforms the input time series into integer time series using OrdinalPatternEncoding.

Transfer entropy is then estimated as usual on the encoded timeseries with transferentropy and the CountOccurrences naive frequency estimator.

CausalityTools.TERenyiJizbaType
TERenyiJizba() <: TransferEntropy

The Rényi transfer entropy from Jizba2012.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise and conditional dependence.
  • Use with transferentropy to compute the raw transfer entropy.

Description

The transfer entropy from source $S$ to target $T$, potentially conditioned on $C$ is defined as

\[\begin{align*} TE(S \to T) &:= I_q^{R_J}(T^+; S^- | T^-) \\ TE(S \to T | C) &:= I_q^{R_J}(T^+; S^- | T^-, C^-), \end{align*}\]

where $I_q^{R_J}(T^+; S^- | T^-)$ is Jizba et al. (2012)'s definition of conditional mutual information (CMIRenyiJizba). The variables $T^+$, $T^-$, $S^-$ and $C^-$ are described in the docstring for transferentropy.

Compatible estimators

Jizba's formulation of Renyi-type transfer entropy can currently be estimated using selected probabilities estimators and differential entropy estimators, which under the hood compute the transfer entropy as Jizba's formulation of Rényi conditional mutual information.

| Estimator | Type | Principle |
|---|---|---|
| CountOccurrences | ProbabilitiesEstimator | Frequencies |
| ValueHistogram | ProbabilitiesEstimator | Binning (histogram) |
| LeonenkoProzantoSavani | DifferentialEntropyEstimator | Nearest neighbors |

CausalityTools.TEShannonType
TEShannon <: TransferEntropy
TEShannon(; base = 2, embedding = EmbeddingTE()) <: TransferEntropy

The Shannon-type transfer entropy measure.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise and conditional dependence.
  • Use with transferentropy to compute the raw transfer entropy.

Description

The transfer entropy from source $S$ to target $T$, potentially conditioned on $C$ is defined as

\[\begin{align*} TE(S \to T) &:= I^S(T^+; S^- | T^-) \\ TE(S \to T | C) &:= I^S(T^+; S^- | T^-, C^-) \end{align*}\]

where $I(T^+; S^- | T^-)$ is the Shannon conditional mutual information (CMIShannon). The variables $T^+$, $T^-$, $S^-$ and $C^-$ are described in the docstring for transferentropy.

Compatible estimators

Shannon-type transfer entropy can be estimated using a range of different estimators, which all boil down to computing conditional mutual information, except for TransferEntropyEstimators, which compute transfer entropy directly. A sketch of a typical call follows the table below.

| Estimator | Type | Principle |
|---|---|---|
| CountOccurrences | ProbabilitiesEstimator | Frequencies |
| ValueHistogram | ProbabilitiesEstimator | Binning (histogram) |
| SymbolicPermutation | ProbabilitiesEstimator | Ordinal patterns |
| Dispersion | ProbabilitiesEstimator | Dispersion patterns |
| Kraskov | DifferentialEntropyEstimator | Nearest neighbors |
| Zhu | DifferentialEntropyEstimator | Nearest neighbors |
| ZhuSingh | DifferentialEntropyEstimator | Nearest neighbors |
| Gao | DifferentialEntropyEstimator | Nearest neighbors |
| Goria | DifferentialEntropyEstimator | Nearest neighbors |
| Lord | DifferentialEntropyEstimator | Nearest neighbors |
| LeonenkoProzantoSavani | DifferentialEntropyEstimator | Nearest neighbors |
| GaussianMI | MutualInformationEstimator | Parametric |
| KSG1 | MutualInformationEstimator | Continuous |
| KSG2 | MutualInformationEstimator | Continuous |
| GaoKannanOhViswanath | MutualInformationEstimator | Mixed |
| GaoOhViswanath | MutualInformationEstimator | Continuous |
| FPVP | ConditionalMutualInformationEstimator | Nearest neighbors |
| MesnerShalizi | ConditionalMutualInformationEstimator | Nearest neighbors |
| Rahimzamani | ConditionalMutualInformationEstimator | Nearest neighbors |
| Zhu1 | TransferEntropyEstimator | Nearest neighbors |
| Lindner | TransferEntropyEstimator | Nearest neighbors |
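
A minimal sketch of a typical call (not from the original docs), assuming the (measure, est, source, target) argument order of transferentropy:

```julia
using CausalityTools
# Shannon transfer entropy x → y with an ordinal-pattern probabilities estimator.
x = randn(2000)
y = [0.5 * x[t-1] + 0.1 * randn() for t in 2:2000]
transferentropy(TEShannon(base = 2), SymbolicPermutation(m = 3), x[2:end], y)
```
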
CausalityTools.TEVarsType
TEVars(Tf::Vector{Int}, T::Vector{Int}, S::Vector{Int})
TEVars(Tf::Vector{Int}, T::Vector{Int}, S::Vector{Int}, C::Vector{Int})
TEVars(;Tf = Int[], T = Int[], S = Int[], C = Int[]) -> TEVars

Which axes of the state space correspond to the future of the target (Tf), the present/past of the target (T), the present/past of the source (S), and the present/past of any conditioned variables (C)? This information is used by the transfer entropy estimators to ensure that marginal distributions are computed correctly.

Indices correspond to variables of the embedding, or, equivalently, columns of a StateSpaceSet.

  • The three-argument constructor assumes there will be no conditional variables.
  • The four-argument constructor assumes there will be conditional variables.
CausalityTools.Thomas3Type
Thomas3 <: ContinuousDefinition
Thomas3(; xi = [0.11, 0.09, 0.10], b = 0.20)

Thomas' cyclically symmetric attractor is a continuous dynamical system with three variables. It has a single free parameter b, for which interesting behaviour occurs when b ∈ (0, 1). In particular, the system is chaotic whenever b < 0.20.

Definition

\[\begin{align*} \dfrac{dx}{dt} &= \sin(y) - bx \\ \dfrac{dy}{dt} &= \sin(z) - by \\ \dfrac{dz}{dt} &= \sin(x) - bz \end{align*}\]

CausalityTools.UlamLatticeType
UlamLattice <: DiscreteDefinition
UlamLattice(; D::Int = 10, ui = sin.(1:10), ε::Real = 0.10)

A lattice of D unidirectionally coupled Ulam maps Schreiber2000 defined as

\[x^{m}_{t+1} = f(\epsilon x^{m-1}_{t} + (1 - \epsilon) x_{t}^{m}),\]

where $m = 1, 2, \ldots, D$ and $f(x) = 2 - x^2$. In this system, information transfer happens only in the direction of increasing $m$.

CausalityTools.Var1Type
Var1 <: DiscreteDefinition
Var1(; xi = [0.5, 0.5, 0.5],
    a = 0.5, θ = Normal(0, 1.0), η = Normal(0, 0.2), ϵ = Normal(0, 0.3),
    rng = Random.default_rng())

A discrete vector autoregressive system where X₁ → X₂ → X₃.

CausalityTools.Verdes3Type
Verdes3 <: DiscreteDefinition
Verdes3(; ui = [0.1, 0.15, 0.2], ωy = 315, ωz = 80, σx = 0.0, σy = 0.0, σz = 0.0)

A 3D system where the response X is a highly nonlinear combination of Y and Z Verdes2005. The forcings Y and Z involve sines and cosines, and have different periods, which are controlled by ωy and ωz.

\[\begin{align*} x(t+1) &= \dfrac{y(t)(18y(t) - 27y(t)^2 + 10)}{2} + z(t)(1-z(t)) + \eta_x \\ y(t+1) &= \dfrac{1 - \cos\left(\dfrac{2\pi}{\omega_y} t\right)}{2} + \eta_y \\ z(t+1) &= \dfrac{1 - \sin\left(\dfrac{2\pi}{\omega_z} t\right)}{2} + \eta_z \end{align*}\]

where $\eta_x$, $\eta_y$ and $\eta_z$ are Gaussian noise terms with mean 0 and standard deviations σx, σy and σz, respectively.

CausalityTools.Zhu1Type
Zhu1 <: TransferEntropyEstimator
Zhu1(k = 1, w = 0, base = MathConstants.e)

The Zhu1 transfer entropy estimator Zhu2015.

Assumes that the input data have been normalized as described in Zhu2015. The estimator can be used both for pairwise and conditional transfer entropy.

Description

This estimator approximates probabilities within hyperrectangles surrounding each point xᵢ ∈ x using k nearest neighbor searches. However, it also considers the number of neighbors falling on the borders of these hyperrectangles. This estimator is an extension of the entropy estimator in Singh2003.

w is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to 0, meaning that only the point itself is excluded when searching for neighbours).

CausalityTools._convert_logunitMethod
_convert_logunit(h_a::Real, a, b) → h_b

Convert a number h_a computed with logarithms to base a to an entropy h_b computed with logarithms to base b. This can be used to convert the "unit" of an entropy.

CausalityTools.anishchenko1Method
anishchenko1(;u₀ = rand(2), α =3.277, s=0.1, ω=0.5*(sqrt(5)-1)) → DiscreteDynamicalSystem

Initialise the system defined by eq. 13 in Anishchenko1998, which can give strange, nonchaotic attractors.

Equations of motion

\[\begin{aligned} dx &= \alpha (1-s \cos (2 \pi \phi )) \cdot x(1-x) \\ dϕ &= (\phi + \omega ) \mod{1} \end{aligned}\]

CausalityTools.ar1_bidirMethod
ar1_bidir(;u₀ = rand(2), a₁ = 0.5, b₁ = 0.7, c_xy = 0.1, c_yx = 0.2,
    σx = 0.3, σy = 0.3) → DiscreteDynamicalSystem

A system consisting of two mutually coupled first order autoregressive processes.

Equations of motion

\[\begin{aligned} x(t+1) &= a_{1}x + c_{yx}y + \epsilon_{x} \\ y(t+1) &= b_{1}y + c_{xy}x + \epsilon_{y} \end{aligned}\]

where at each time step, $\epsilon_{x}$ and $\epsilon_{y}$ are drawn from independent normal distributions with zero mean and standard deviations σx and σy, respectively.

CausalityTools.ar1_unidirMethod
ar1_unidir(u₀, a₁ = 0.90693, b₁ = 0.40693, c_xy = 0.5,
    σ = 0.40662) → DiscreteDynamicalSystem

A bivariate, order one autoregressive model, where $x \to y$ (Paluš et al, 2018)[Paluš2018].

Equations of motion

\[\begin{aligned} x(t+1) &= a_1 x(t) + \xi_{1} \\ y(t+1) &= b_1 y(t) - c_{xy} x + \xi_{2}, \end{aligned}\]

where $\xi_{1}$ and $\xi_{2}$ are drawn from normal distributions with zero mean and standard deviation σ at each iteration.

CausalityTools.asymmetryMethod
asymmetry(measure::PA, est, x, y, [z]) → ΔA

Compute the predictive asymmetry (PA) from x to y, conditioned on z if given. Returns the distribution of asymmetry values.

Compatible estimators

The docstring for PA lists compatible estimators.

CausalityTools.backwards_eliminate!Method
backwards_eliminate!(alg::OCE, parents::OCESelectedParents, x, i; verbose)

Algorithm 2.2 in Sun et al. (2015). Perform backward elimination for the i-th variable in x, given the previously inferred parents, which were deduced using parameters in alg. Modifies parents in-place.

CausalityTools.bandwidthFunction
bandwidth(heuristic::BandwidthRule, x::AbstractStateSpaceSet)

Compute the bandwidth for a kernel density estimator for the input data x using the given heuristic.

Supported heuristics

CausalityTools.condmutualinfoMethod
condmutualinfo([measure::CMI], est::CMIEstimator, x, y, z) → cmi::Real

Estimate a conditional mutual information (CMI) of some kind (specified by measure), between x and y, given z, using the given dedicated ConditionalMutualInformationEstimator, which may be discrete, continuous or mixed.

Estimators

| Estimator | Principle | CMIShannon | CMIRenyiPoczos |
|---|---|---|---|
| FPVP | Nearest neighbors | ✓ | x |
| MesnerShalizi | Nearest neighbors | ✓ | x |
| Rahimzamani | Nearest neighbors | ✓ | x |
| PoczosSchneiderCMI | Nearest neighbors | x | ✓ |

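For example (a sketch, not from the original docs):

```julia
using CausalityTools
# Conditional mutual information of x and y given z with the FPVP estimator.
x = randn(1000)
z = x .+ randn(1000)
y = z .+ randn(1000)                                  # x and y are related only through z
condmutualinfo(CMIShannon(), FPVP(k = 5), x, y, z)    # expected to be close to zero
```
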
CausalityTools.condmutualinfoMethod
condmutualinfo([measure::CMI], est::DifferentialEntropyEstimator, x, y, z) → cmi::Real

Estimate the mutual information between x and y conditioned on z, using the differential version of the given conditional mutual information (CMI) measure. The DifferentialEntropyEstimator est must support multivariate data. No bias correction is performed. If measure is not given, then the default is CMIShannon().

Note

DifferentialEntropyEstimators have their own base field which is not used here. Instead, this method creates a copy of est internally, where est.base is replaced by measure.e.base. Therefore, use measure to control the "unit" of the mutual information.

Estimators

| Estimator | Principle |
|---|---|
| Kraskov | Nearest neighbors |
| Zhu | Nearest neighbors |
| Gao | Nearest neighbors |
| Goria | Nearest neighbors |
| Lord | Nearest neighbors |
| LeonenkoProzantoSavani | Nearest neighbors |

CausalityTools.condmutualinfoMethod
condmutualinfo([measure::CMI], est::MutualInformationEstimator, x, y, z) → cmi::Real

Estimate the mutual information between x and y conditioned on z, using the given conditional mutual information (CMI) measure, computed as a difference of mutual information terms (just the chain rule of mutual information):

\[\hat{I}(X; Y | Z) = \hat{I}(X; Y, Z) - \hat{I}(X; Z).\]

The MutualInformationEstimator est may be continuous/differential, discrete or mixed. No bias correction is performed, except the bias correction that occurs for each individual mutual information term. If measure is not given, then the default is CMIShannon().

Estimators

| Estimator | Type | Principle |
|---|---|---|
| KraskovStögbauerGrassberger1 | Continuous | Nearest neighbors |
| KraskovStögbauerGrassberger2 | Continuous | Nearest neighbors |
| GaoKannanOhViswanath | Mixed | Nearest neighbors |
| GaoOhViswanath | Continuous | Nearest neighbors |

CausalityTools.condmutualinfoMethod
condmutualinfo([measure::CMI], est::ProbabilitiesEstimator, x, y, z) → cmi::Real ∈ [0, a)

Estimate the conditional mutual information (CMI) measure between x and y given z using a sum of entropy terms, without any bias correction, using the provided ProbabilitiesEstimator est. If measure is not given, then the default is CMIShannon().

With a ProbabilitiesEstimator, the returned cmi is guaranteed to be non-negative.

Estimators

| Estimator           | Principle           | CMIShannon | CMIRenyiSarbu |
|---------------------|---------------------|:----------:|:-------------:|
| CountOccurrences    | Frequencies         | ✓          |               |
| ValueHistogram      | Binning (histogram) | ✓          |               |
| SymbolicPermutation | Ordinal patterns    | ✓          |               |
| Dispersion          | Dispersion patterns | ✓          |               |
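
A minimal sketch using a probabilities estimator; the binning resolution is an arbitrary illustrative choice:

using CausalityTools
x, y, z = rand(1000), rand(1000), rand(1000)
# Discrete Shannon CMI from a fixed rectangular binning of the marginals
condmutualinfo(CMIShannon(), ValueHistogram(RectangularBinning(4)), x, y, z)
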
CausalityTools.contingency_matrixFunction
contingency_matrix(x, y, [z, ...]) → c::ContingencyMatrix
contingency_matrix(est::ProbabilitiesEstimator, x, y, [z, ...]) → c::ContingencyMatrix

Estimate a multidimensional contingency matrix c from input data x, y, …, where the input data can be of any and different types, as long as length(x) == length(y) == ….

For already discretized data, use the first method. For continuous data, you want to discretize the data before computing the contingency table. You can do this manually and then use the first method. Alternatively, you can provide a ProbabilitiesEstimator as the first argument to the constructor. Then the input variables x, y, … are discretized separately according to est (enforcing the same outcome space for all variables), by calling marginal_encodings.
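
A minimal sketch of both call styles; the categorical values and the ordinal-pattern order m = 3 are arbitrary illustrative choices:

using CausalityTools
# Already-discrete (categorical) input data can be used directly
x = rand(["a", "b", "c"], 1000)
y = rand(["hello", "world"], 1000)
c_xy = contingency_matrix(x, y)
# Continuous data are discretized according to the given ProbabilitiesEstimator
w, z = rand(1000), rand(1000)
c_wz = contingency_matrix(SymbolicPermutation(m = 3), w, z)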

CausalityTools.cpdagMethod
cpdag(alg::PC, skeleton::SimpleDiGraph, separating_sets::Dict{Edge, Vector{Int}}) → dg::SimpleDiGraph

Orient edges in the skeleton graph using the given separating_sets using algorithm 2 in Kalisch2008 and return the directed graph cpdag.

Description

First, a directed graph dg is constructed from skeleton, such that every undirected edge X - Y in skeleton is replaced by the bidirectional edge X ↔ Y in dg. In practice, for each X ↔ Y, we construct two directional edges X → Y and Y → X.

Orientation rules 0-3 are then applied to dg. We use the rules as stated in Colombo & Maathuis, 2014.

  • Rule 0 (orients v-structures): X ↔ Y ↔ Z becomes X → Y ← Z if Y is not in the separating set S(X, Z).
  • Rule 1 (prevents new v-structures): X → Y ↔ Z becomes X → Y → Z if X and Z are not adjacent.
  • Rule 2 (avoids cycles): X → Y → Z ↔ X becomes X → Y → Z ← X.
  • Rule 3: To avoid creating cycles or new v-structures, whenever X - Y → Z and X - W → Z (with Y and W non-adjacent) and X - Z, the edge X - Z is oriented as X → Z.
CausalityTools.crossmapFunction
crossmap(measure::CrossmapMeasure, t::AbstractVector, s::AbstractVector) → ρ::Real
crossmap(measure::CrossmapMeasure, est, t::AbstractVector, s::AbstractVector) → ρ::Vector
crossmap(measure::CrossmapMeasure, t̄::AbstractVector, S̄::AbstractStateSpaceSet) → ρ

Compute the cross-map estimates between raw time series t and s (and return the real-valued cross-map statistic ρ). If a CrossmapEstimator est is provided, cross mapping is done on random subsamples of the data, where subsampling is dictated by est (a vector of values for ρ is returned).

Alternatively, cross-map between a time-aligned time series t̄ and a source embedding S̄ that have been constructed by jointly embedding (pre-embedding) some input data.

This is just a wrapper around predict that simply returns the correspondence measure between the source and the target.
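
A minimal sketch of the two call styles; the embedding parameters d and τ and the RandomVectors library sizes are assumptions chosen for illustration:

using CausalityTools
s, t = rand(300), rand(300)
# Cross map the full time series (returns a single ρ)
crossmap(ConvergentCrossMapping(d = 3, τ = -1), t, s)
# Cross map over random subsamples of increasing library size (returns a vector of ρ)
crossmap(ConvergentCrossMapping(d = 3, τ = -1), RandomVectors(libsizes = 50:50:250), t, s)
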

CausalityTools.crossmapMethod
crossmap(x, y, d, τ; r = 0, correspondence_measure = Statistics.cor) → Float64
crossmap(x, y, d, τ, bootstrap_method::Symbol; r = 0, correspondence_measure = Statistics.cor,
    method = :segment, L = ceil(Int, (length(x)-d*τ)*0.2), nreps = 100) → Vector{Float64}
This syntax is deprecated

This syntax is deprecated. It will continue to work for CausalityTools v1.X, but will be removed in CausalityTools v2. See here for updated syntax.

Compute the cross mapping Sugihara2012 between x and y, which is the correspondence (computed using correspondence_measure) between the values $y(t)$ and the cross-map estimated values $ỹ(t) | M_x$.

Returns the correspondence between original and cross mapped values (the default is ρ = correspondence_measure(y(t), ỹ(t) | M_x)).

Here, $y(t)$ are the raw values of the time series y, and $ỹ(t)$ are the predicted values computed from the out-of-sample embedding $M_X$ constructed from the time series x with embedding dimension d and embedding lag τ.

The Theiler window r indicates how many temporal neighbors of the predictee are to be excluded during the nearest neighbor search (the default r = 0 excludes only the predictee itself, while r = 2 excludes the point itself plus its two nearest neighbors in time).

If bootstrap_method is specified, then nreps different bootstrapped estimates of correspondence_measure(y(t), ỹ(t) | M_x) are returned. The following bootstrap methods are available:

  • bootstrap_method = :random selects training sets of length L consisting of randomly selected points from the embedding $M_x$ (time ordering does not matter). This is method 3 from Luo2015, which critiqued the methodology of the original Sugihara2012 study.
  • bootstrap_method = :segment selects training sets consisting of time-contiguous segments (each of length L) of embedding vectors in $M_x$ (time ordering matters). This is method 2 from Luo2015.
CausalityTools.densities_at_pointsMethod
densities_at_points(kernel::MultivariateKernel, x::AbstractStateSpaceSet, bandwidth)

Compute the local densities at each point xᵢ ∈ x using the given multivariate kernel and bandwidth.

CausalityTools.distance_covarianceMethod
distance_covariance(x, y) → dcov::Real
distance_covariance(x, y, z) → pdcov::Real

Compute the empirical/sample distance covariance (Székely et al., 2007)[Székely2007] between StateSpaceSets x and y. Alternatively, compute the partial distance covariance pdcov.
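
A minimal sketch; the random data are for illustration only:

using CausalityTools
x = StateSpaceSet(rand(1000, 2))
y = StateSpaceSet(rand(1000, 3))
distance_covariance(x, y)
# Partial distance covariance, conditioning on a third dataset
z = StateSpaceSet(rand(1000, 2))
distance_covariance(x, y, z)
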

CausalityTools.eliminate_loop!Method
eliminate_loop!(alg::OCE, parents::OCESelectedParents, xᵢ; verbose = false)

Inner portion of algorithm 2.2 in Sun et al. (2015). This method is called in an external while-loop that handles the variable elimination step in their line 3.

CausalityTools.empcdfMethod
empirical_cdf(x::AbstractVector{<:Real}) → x̄::Vector
empirical_cdf(x::AbstractStateSpaceSet) → x̄::StateSpaceSet

Rank each sample xᵢ ∈ x, rescale the rank to the interval [0, 1], and return the rescaled ranks x̄.

CausalityTools.empirical_copula_transformationMethod
empirical_copula_transformation(x::AbstractVector) → empirical_copula::Vector{<:Real}
empirical_copula_transformation(x::AbstractStateSpaceSet{D, T}) → empirical_copula::StateSpaceSet{D, T}

Apply the empirical copula transformation (as described in Pal2010; see a summary below) to each point xᵢ ∈ x, where the xᵢ can be either univariate (x::AbstractVector) or multivariate (x::AbstractStateSpaceSet), to compute the empirical copula (here called empirical_copula).

Description

Empirical copula transformation

Assume we have a length-n sample of data points $\bf{X}_{1:n} = \{\bf{X}_1, \bf{X}_2, \ldots, \bf{X}_n \}$ where $\bf{X}_i \in \mathbb{R}^d$, which is assumed sampled from some distribution $\mu$ with density function $f$. Let $X_i^j \in \mathbb{R}$ denote the j-th coordinate of $\bf{X}_i$. Assume these points are represented as the d-dimensional StateSpaceSet which we call S (indexed like a matrix where rows are samples).

The empirical cumulative distribution function (CDF) for the j-th column of S, based on the sample $\bf{X}_{1:n}$, is defined as

\[\hat{F}_j(y) = \dfrac{\left| \{ 1 \leq i \leq n : X^j_i \leq y \} \right|}{n},\]

for any input value $y \in \mathbb{R}$ (which is in general completely unrelated to the j-th column of our sample points). Given the samples $\bf{X}_{1:n}$, we can also define a "multivariate empirical CDF" for S, $\bf{\hat{F}} : \mathbb{R}^d \to [0, 1]^d$, as

\[\hat{\bf{F}}(\bf{y}) = (\hat{F}_1(y^1), \hat{F}_2(y^2), \ldots, \hat{F}_d(y^d)),\]

for any point $\bf{y} \in \mathbb{R}^d$ (which is in general completely unrelated to our sample points, except sharing the property of being d-dimensional). Think of this as checking, for each coordinate $y^j \in \bf{y}$, how this coordinate ranks among values in S[:, j]. The map $\hat{\bf{F}}$ is called the empirical copula transformation.

Sidenote: given only our sample, we don't actually know what the underlying distribution $\mu$ is, nor what its cumulative distribution function $F$ is. But if we did, the analogous map (the copula transformation) $\bf{F} : \mathbb{R}^d \to [0, 1]^d$ would be

\[\bf{F}(\bf{y}) = (F_1(y^1), F_2(y^2), \ldots, F_d(y^d)).\]

In summary, we've defined the empirical copula transformation $\hat{\bf{F}}$ as a map from some d-dimensional space to the d-dimensional unit square. The j-th axis of $\hat{\bf{F}}$'s domain and the j-th axis of $\hat{\bf{F}}$'s codomain (i.e. the hypersquare) are linked through the ranks of S[:, j].

Empirical copula

The copula of $\mu$ is the joint distribution $\bf{F}(\bf{X}) = (F_1(X^1), F_2(X^2), \ldots, F_d(X^d))$. The empirical copula (note the lack of "transformation" here) is the set of d-dimensional empirical-copula-transformed points $\hat{\bf{Z}} = \{\bf{Z}_1, \bf{Z}_2, \ldots, \bf{Z}_n \} = \{ \hat{\bf{F}}(\bf{X_1}), \hat{\bf{F}}(\bf{X_2}), \ldots, \hat{\bf{F}}(\bf{X_n}) \}$. Note that $\hat{\bf{Z}}$ is an approximation of a sample $\{\bf{Z}_1,\bf{Z}_2, \ldots, \bf{Z}_n\} = \{\bf{F}(\bf{X}_1), \bf{F}(\bf{X}_2), \ldots, \bf{F}(\bf{X}_n)\}$ from the true copula of $\mu$ (which we in general don't know, given only some sample points).
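
A minimal sketch; the random data are for illustration only, and if the function is not exported, qualify the calls with CausalityTools.:

using CausalityTools
# Univariate case: returns a vector with values in [0, 1]
x = randn(1000)
cx = empirical_copula_transformation(x)
# Multivariate case: returns a StateSpaceSet with values in the unit hypercube
X = StateSpaceSet(randn(1000, 3))
cX = empirical_copula_transformation(X)
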

CausalityTools.entropy_conditionalMethod
entropy_conditional(measure::ConditionalEntropy, c::ContingencyMatrix{T, 2}) where T

Estimate the discrete version of the given ConditionalEntropy measure from its direct (sum) definition, using the probabilities from a pre-computed ContingencyMatrix, constructed from two input variables x and y. This estimation method works for both numerical and categorical data. If measure is not given, then the default is CEShannon().

The convention is to compute the entropy of the variable in the first column of c conditioned on the variable in the second column of c. To do the opposite, call this function with a new contingency matrix where the order of the variables is reversed.

Compatible measures

| Measure           | ContingencyMatrix |
|-------------------|:-----------------:|
| CEShannon         | ✓                 |
| CETsallisFuruichi | ✓                 |
| CETsallisAbe      | ✓                 |
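
A minimal sketch; the categorical values are arbitrary illustrative choices:

using CausalityTools
x = rand(["yes", "no"], 1000)
y = rand(1:3, 1000)
# H(x | y) from the direct sum definition over a pre-computed contingency matrix
c = contingency_matrix(x, y)
entropy_conditional(CEShannon(), c)
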
CausalityTools.entropy_conditionalMethod
entropy_conditional([measure::ConditionalEntropy], est::DifferentialEntropyEstimator, x, y)

Estimate the entropy of x conditioned on y, using the differential/continuous version of the given conditional entropy (CE) measure. The CE is computed as the difference between the joint entropy and the marginal entropy of y, using the DifferentialEntropyEstimator est, which must be compatible with multivariate data. No bias correction is applied. If measure is not given, then the default is CEShannon().

Estimators

| Estimator              | Principle         | CEShannon | CETsallisAbe | CETsallisFuruichi |
|------------------------|-------------------|:---------:|:------------:|:-----------------:|
| Kraskov                | Nearest neighbors | ✓         | x            | x                 |
| Zhu                    | Nearest neighbors | ✓         | x            | x                 |
| ZhuSingh               | Nearest neighbors | ✓         | x            | x                 |
| Gao                    | Nearest neighbors | ✓         | x            | x                 |
| Goria                  | Nearest neighbors | ✓         | x            | x                 |
| Lord                   | Nearest neighbors | ✓         | x            | x                 |
| LeonenkoProzantoSavani | Nearest neighbors | ✓         | x            | x                 |
CausalityTools.entropy_conditionalMethod
entropy_conditional([measure::ConditionalEntropy], est::ProbabilitiesEstimator, x, y)

Estimate the entropy of x conditioned on y, using the discrete version of the given conditional entropy (CE) measure. The CE is computed as the difference between the joint entropy and the marginal entropy of y, using the ProbabilitiesEstimator est, which must be compatible with multivariate data (that is, have an implementation for marginal_encodings). No bias correction is applied. If measure is not given, then the default is CEShannon().

Estimators

Joint and marginal probabilities are computed by jointly discretizing x and y using the approach given by est, and obtaining the marginal distribution for y from the joint distribution.

| Estimator           | Principle           | CEShannon | CETsallisAbe | CETsallisFuruichi |
|---------------------|---------------------|:---------:|:------------:|:-----------------:|
| CountOccurrences    | Frequencies         | ✓         | ✓            | x                 |
| ValueHistogram      | Binning (histogram) | ✓         | ✓            | x                 |
| SymbolicPermutation | Ordinal patterns    | ✓         | ✓            | x                 |
| Dispersion          | Dispersion patterns | ✓         | ✓            | x                 |
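
A minimal sketch; the binning resolution is an arbitrary illustrative choice:

using CausalityTools
x, y = rand(1000), rand(1000)
# Discrete H(x | y) from a joint value-histogram discretization
entropy_conditional(CEShannon(), ValueHistogram(RectangularBinning(5)), x, y)
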
CausalityTools.entropy_jointMethod
entropy_joint(e::EntropyDefinition, x, y)
entropy_joint(e::EntropyDefinition, c::ContingencyMatrix)

Compute the joint entropy of type e (e.g. Shannon) of the input variables x and y, or from the pre-computed contingency matrix c (see ContingencyMatrix).

Discrete definitions

Given two discrete random variables $X$ and $Y$ with ranges $\mathcal{X}$ and $\mathcal{Y}$, we define the following discrete joint entropies:

The expressions for Shannon joint entropy is from CoverThomas2006, for Rényi joint entropy from Golshani2009, and for Tsallis joint entropy from Furuichi2006.
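
A minimal sketch using the contingency-matrix method; the categorical data are arbitrary illustrative choices:

using CausalityTools
x = rand(1:4, 1000)
y = rand(["a", "b"], 1000)
# Joint Shannon entropy from a pre-computed contingency matrix
c = contingency_matrix(x, y)
entropy_joint(Shannon(), c)
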

CausalityTools.eom_logistic2_bidirMethod
logistic2_bidir(u₀, c_xy, c_yx, r₁, r₂, σ_xy, σ_yx)

Equations of motion for a bidirectional logistic model for the chaotic population dynamics of two interacting species. This system is from [1], and is given by

\[\begin{align} x(t+1) &= r_1 f_{yx}^{t}(1 - f_{yx}^{t}) \\ y(t+1) &= r_2 f_{xy}^{t}(1 - f_{xy}^{t}) \\ f_{xy}^t &= \dfrac{y(t) + c_{xy}(x(t) + \sigma_{xy} \xi_{xy}^t )}{1 + c_{xy} (1 + \sigma_{xy} )} \\ f_{yx}^t &= \dfrac{x(t) + c_{yx}(y(t) + \sigma_{yx} \xi_{yx}^t )}{1 + c_{yx} (1 + \sigma_{yx} )}, \end{align}\]

where the coupling strength $c_{xy}$ controls how strongly species $x$ influences species $y$, and vice versa for $c_{yx}$. To simulate time-varying influence of unobserved processes, we use the dynamical noise terms $\xi_{xy}^t$ and $\xi_{yx}^t$, drawn from a uniform distribution with support on $[0, 1]$. If $\sigma_{xy} > 0$, then the influence of $x$ on $y$ is masked by dynamical noise equivalent to $\sigma_{xy} \xi_{xy}^{t}$ at the $t$-th iteration of the map, and vice versa for $\sigma_{yx}$.

References

  1. Diego, David, Kristian Agasøster Haaga, and Bjarte Hannisdal. "Transfer entropy computation using the Perron-Frobenius operator." Physical Review E 99.4 (2019): 042212.
CausalityTools.eom_logistic3Method
eom_logistic3(u, p, t)

Equations of motion for a system consisting of three coupled logistic maps representing the response of two independent dynamical variables to the forcing from a common driver. The dynamical influence goes in the directions Z → X and Z → Y.

Equations of motion

The equations of motion are

\[\begin{aligned} x(t+1) &= (x(t)(r - r_1 x(t) - z(t) + \sigma_x \eta_x)) \mod 1 \\ y(t+1) &= (y(t)(r - r_2 y(t) - z(t) + \sigma_y \eta_y)) \mod 1 \\ z(t+1) &= (z(t)(r - r_3 z(t) + \sigma_z \eta_z)) \mod 1 \end{aligned}\]

Dynamical noise may be added to each of the dynamical variables by tuning the parameters σx, σy and σz. Default values for the parameters r₁, r₂ and r₃ are set such that the system exhibits chaotic behaviour, with r₁ = r₂ = r₃ = 4.

References

  1. Runge, Jakob. Causal network reconstruction from time series: From theoretical assumptions to practical estimation, Chaos 28, 075310 (2018); doi: 10.1063/1.5025050
CausalityTools.eom_nonlinear3dMethod
eom_nonlinear3d(u₀, a₁, a₂, a₃,  b₁, b₂, b₃,
    c₁₂, c₂₃, c₁₃, σ₁, σ₂, σ₃) → DiscreteDynamicalSystem

Equations of motion for a 3d nonlinear system with nonlinear couplings $x_1 \to x_2$, $x_2 \to x_3$ and $x_1 \to x_3$. Modified from [1].

Equations of motion

The equations of motion are

\[\begin{aligned} x_1(t+1) &= a_1 x_1 (1-x_1(t))^2 e^{-x_2(t)^2} + 0.4 \xi_{1}(t) \\ x_2(t+1) &= a_1 x_2 (1-x_2(t))^2 e^{-x_2(t)^2} + 0.4 \xi_{2}(t) + b x_1 x_2 \\ x_3(t+1) &= a_3 x_3 (1-x_3(t))^2 e^{-x_3(t)^2} + 0.4 \xi_{3}(t) + c x_{2}(t) + d x_{1}(t)^2. \end{aligned}\]

References

  1. Gourévitch, B., Le Bouquin-Jeannès, R., & Faucon, G. (2006). Linear and nonlinear causality between signals: methods, examples and neurophysiological applications. Biological Cybernetics, 95(4), 349–369.
CausalityTools.estimateMethod
estimate(e::EntropyDefinition, est::InformationMeasureEstimator, input::VectorOrStateSpaceSet...)

Given some input data, estimate some information measure using the given InformationMeasureEstimator, with respect to the generalized entropy e.

CausalityTools.fishers_zMethod
fishers_z(p̂)

Compute Fisher's z-transform on the sample partial correlation coefficient (computed as the correlation between variables i and j, given the remaining variables):

\[Z(V_i, V_j | \bf{S}) = \dfrac{1}{2} \log{\left( \dfrac{1 + \hat{p}(V_i, V_j | \bf{S})}{1 - \hat{p}(V_i, V_j | \bf{S})} \right) }\]

CausalityTools.h_measureMethod
h_measure(measure::HMeasure, x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet)

Compute the HMeasure from source x to target y.

CausalityTools.henon2Method
henon2(;u₀ = [0.1, 0.2, 0.2, 0.3], c_xy = 2.0) → DiscreteDynamicalSystem

A bivariate system consisting of two identical 1D Henon maps with unidirectional forcing $X \to Y$ (Krakovská et al., 2018)[Krakovská2018].

Equations of motion

The equations of motion are

\[\begin{aligned} x_1(t+1) &= 1.4 - x_1^2(t) + 0.3x_2(t) \\ x_2(t+1) &= x_1(t) \\ y_1(t+1) &= 1.4 - [c_{xy} x_1(t) y_1(t) + (1-c_{xy}) y_1^2(t)] + 0.3 y_2(t) \\ y_2(t+1) &= y_1(t) \end{aligned}\]

CausalityTools.henon_tripleMethod
henon_triple(x, p, n) → Function

Iterate a 3D discrete system consisting of coupled Henon maps where the coupling is x1 → x2 → x3 [1]. This version allows for tweaking the parameters of the equations.

The difference equations are:

\[\begin{aligned} x_1(t+1) &= a - x_1(t)^2 + b x_1(t-2) \\ x_2(t+1) &= a - c x_1(t) x_2(t) - (1 - c) x_2(t)^2 + b x_2(t-1) \\ x_3(t+1) &= c x_2(t) x_3(t) - (1 - c) x_3(t)^2 + b x_3(t-1) \end{aligned}\]

Here $c$ is the coupling constant. The system becomes completely synchronized for $c >= 0.7$.

References

  1. Papana, A., Kyrtsou, C., Kugiumtzis, D., & Diks, C. (2013). Simulation study of direct causality measures in multivariate time series. Entropy, 15(7), 2635–2661.

CausalityTools.ikedaMethod
ikeda(; u₀ = rand(2), c_xy = 1.0, c_yx = 1.0, a = 0.8, b = 12, c = 0.9,
    r₁ = rand(Uniform(0.01, 0.3)), r₂ = rand(Uniform(0.01, 0.3)), σ = 0.05)

Initialise a discrete two-dimensional Ikeda map system, adapted from [1] by adding a noise term and allowing the influences from $x \to y$ ($c_{xy}$) and from $y \to x$ ($c_{yx}$) to be adjusted.

As a rule of thumb, the parameters a, b, and c can be drawn from uniform distributions on [0.8, 1.5], [10, 14] and [0.1, 0.9], respectively.

The difference equations are

\[\begin{aligned} x(t+1) &= 1 + \mu(x \cos{(\theta)} - c_{yx} y \sin{(\theta)}) - \min\left( \dfrac{\sigma \xi_{t}^{(1)}}{1-x}, \xi_{t}^{(2)} \right) \\ y(t+1) &= \mu(y \cos{(\theta)} - c_{xy} x \sin{(\theta)}) - \min\left( \dfrac{\sigma \zeta_{t}^{(1)}}{1-y}, \zeta_{t}^{(2)} \right) \end{aligned}\]

References

  1. Cao, Liangyue, Alistair Mees, and Kevin Judd. "Modeling and predicting non-stationary time series." International Journal of Bifurcation and Chaos 7.08 (1997): 1823-1831.
CausalityTools.infer_graphFunction
infer_graph(algorithm::GraphAlgorithm, x) → g

Infer graph from input data x using the given algorithm.

Returns g, whose type depends on algorithm.
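
A minimal sketch, assuming the OCE algorithm can be constructed with its default keyword arguments; the random data are for illustration only:

using CausalityTools
# x must be indexable so that x[i] is the i-th variable, e.g. a vector of time series
x = [rand(500) for _ in 1:3]
g = infer_graph(OCE(), x)
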

CausalityTools.inv_cdf_transformMethod
inv_cdf_transform(x::AbstractVector, d::Distribution) → tx::Vector

Map each value xᵢ ∈ x to the transformed value t(xᵢ) ∈ [0, 1] using the inverse cumulative distribution function (CDF) (i.e. the quantile function) of the distribution d.

This function is meant to be used with marginal empirical copula transforms (which are uniformly distributed). Since the inverse CDF is a strictly increasing function, the marginal copulas are preserved by the transformation.

CausalityTools.l_measureMethod
l_measure(measure::LMeasure, x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet)

Compute the LMeasure from source x to target y.

CausalityTools.library_indicesFunction
library_indices(measure::CCMLike, est::CrossmapEstimator, i::Int,  target, source)

Produce (randomly, if relevant) the i-th subset of indices for a CrossmapEstimator that is being applied to target and source.

CausalityTools.logistic2_bidirMethod
logistic2_bidir(;u₀ = rand(2), c_xy = 0.1, c_yx = 0.1,
    r₁ = 3.78, r₂ = 3.66, σ_xy = 0.05, σ_yx = 0.05)

A bidirectional logistic model for the chaotic population dynamics of two interacting species [1].

Equations of motion

The equations of motion are

\[\begin{align} x(t+1) &= r_1 f_{yx}^{t}(1 - f_{yx}^{t}) \\ y(t+1) &= r_2 f_{xy}^{t}(1 - f_{xy}^{t}) \\ f_{xy}^t &= \dfrac{y(t) + c_{xy}(x(t) + \sigma_{xy} \xi_{xy}^t )}{1 + c_{xy} (1 + \sigma_{xy} )} \\ f_{yx}^t &= \dfrac{x(t) + c_{yx}(y(t) + \sigma_{yx} \xi_{yx}^t )}{1 + c_{yx} (1 + \sigma_{yx} )}, \end{align}\]

where the coupling strength $c_{xy}$ controls how strongly species $x$ influences species $y$, and vice versa for $c_{yx}$. To simulate time-varying influence of unobserved processes, we use the dynamical noise terms $\xi_{xy}^t$ and $\xi_{yx}^t$, drawn from a uniform distribution with support on $[0, 1]$. If $\sigma_{xy} > 0$, then the influence of $x$ on $y$ is masked by dynamical noise equivalent to $\sigma_{xy} \xi_{xy}^{t}$ at the $t$-th iteration of the map, and vice versa for $\sigma_{yx}$.

CausalityTools.logistic2_unidirMethod
logistic2_unidir(;u₀ = rand(2), c_xy = 0.1, σ = 0.05,
    r₁ = 3.78, r₂ = 3.66) → DiscreteDynamicalSystem

Initialise a system consisting of two coupled logistic maps where X unidirectionally influences Y. By default, the parameters r₁ and r₂ are set to values yielding chaotic behaviour.

Equations of motion

The equations of motion are

\[\begin{aligned} x(t+1) &= r_1 x(t)(1 - x(t)) \\ y(t+1) &= r_2 f(x,y)(1 - f(x,y)), \end{aligned}\]

with

\[\begin{aligned} f(x,y) = \dfrac{y + \frac{c_{xy}(x \xi )}{2}}{1 + \frac{c_{xy}}{2}(1+ \sigma )} \end{aligned}\]

The parameter c_xy controls how strong the dynamical forcing is. If σ > 0, dynamical noise masking the influence of x on y, equivalent to $\sigma \cdot \xi$, is added at each iteration. Here, $\xi$ is a draw from a flat distribution on $[0, 1]$. Thus, setting σ = 0.05 is equivalent to adding dynamical noise corresponding to a maximum of $5 \%$ of the possible range of values of the logistic map.

References

  1. Diego, David, Kristian Agasøster Haaga, and Bjarte Hannisdal. "Transfer entropy computation using the Perron-Frobenius operator." Physical Review E 99.4 (2019): 042212.
CausalityTools.logistic3Method
logistic3(;u₀ = rand(3), r = 4,
    σx = 0.05, σy = 0.05, σz = 0.05) → DiscreteDynamicalSystem

Initialise a dynamical system consisting of three coupled logistic maps representing the response of two independent dynamical variables to the forcing from a common driver. The dynamical influence goes in the directions $Z \to X$ and $Z \to Y$.

Equations of motion

The equations of motion are

\[\begin{aligned} x(t+1) &= (x(t)(r - r_1 x(t) - z(t) + \sigma_x \eta_x)) \mod 1 \\ y(t+1) &= (y(t)(r - r_2 y(t) - z(t) + \sigma_y \eta_y)) \mod 1 \\ z(t+1) &= (z(t)(r - r_3 z(t) + \sigma_z \eta_z)) \mod 1 \end{aligned}\]

Dynamical noise may be added to each of the dynamical variables by tuning the parameters σx, σy and σz. Default values for the parameters r₁, r₂ and r₃ are set such that the system exhibits chaotic behaviour, with r₁ = r₂ = r₃ = 4.

References

  1. Runge, Jakob. Causal network reconstruction from time series: From theoretical assumptions to practical estimation, Chaos 28, 075310 (2018); doi: 10.1063/1.5025050
CausalityTools.logistic4Method
logistic4(;u₀ = rand(4), r₁ = 3.9, r₂ = 3.6, r₃ = 3.6, r₄ = 3.8,
    c₁₂ = 0.4, c₂₃ = 0.4, c₃₄ = 0.35) → DiscreteDynamicalSystem

Initialise a system of a transitive chain of four unidirectionally coupled logistic maps, where $y_1 \to y_2 \to y_3 \to y_4$ [1]. Default parameters are as in [1].

Note: With the default parameters which are as in [1], for some initial conditions, this system wanders off to $\pm \infty$ for some of the variables. Make sure that you have a good realisation before using the orbit for anything.

Equations of motion

\[\begin{aligned} y_1(t+1) &= y_1(t)(r_1 - r_1 y_1) \\ y_2(t+1) &= y_2(t)(r_2 - c_{12} y_1 - r_2 y_2) \\ y_3(t+1) &= y_3(t)(r_3 - c_{23} y_2 - r_3 y_3) \\ y_4(t+1) &= y_4(t)(r_4 - c_{34} y_3 - r_4 y_4) \end{aligned}\]

References

  1. Ye, Hao, et al. "Distinguishing time-delayed causal interactions using convergent cross mapping." Scientific reports 5 (2015): 14750
CausalityTools.m_measureMethod
m_measure(measure::MMeasure, x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet)

Compute the MMeasure from source x to target y.

CausalityTools.marginal_encodingsFunction
marginal_encodings(est::ProbabilitiesEstimator, x::VectorOrStateSpaceSet...)

Encode/discretize each input vector xᵢ ∈ x according to a procedure determined by est. Any xᵢ ∈ x that are multidimensional (StateSpaceSets) are encoded column-wise, i.e. each column of xᵢ is treated as a timeseries and is encoded separately.

This is useful for computing any discrete information theoretic quantity, and is used internally by contingency_matrix.

Supported estimators

Many more implementations are possible. Each new implementation gives one new way of estimating the ContingencyMatrix.

CausalityTools.marginal_indicesMethod
marginal_indices(x)

Returns a column vector v with the same number of elements as there are unique elements in x. v[i] contains the indices of the elements in x matching the i-th unique element of x.

For example, if the third unique element in x, u₃ = unique(x)[3], appears four times in x, then v[3] is a vector of four integers indicating the positions of the elements matching u₃.

CausalityTools.marginal_probs_from_μMethod
marginal_probs_from_μ(selected_axes, visited_bins, iv::InvariantMeasure, inds_non0measure)

Estimate marginal probabilities from a pre-computed invariant measure, given a set of visited bins, an invariant measure and the indices of the positive-measure bins. The indices in selected_axes determine which marginals are selected.

CausalityTools.max_inputs_varsMethod
max_inputs_vars(m::AssociationMeasure) → nmax::Int

Return the maximum number of variables that the measure can be computed for.

For example, MIShannon cannot be computed for more than 2 variables.

CausalityTools.max_segmentlengthFunction
max_segmentlength(x::AbstractVector, measure::CrossmapMeasure)

Given an input vector x, which is representative of the size of the other input vectors too, compute the maximum segment/library length that can be used for predictions.

CausalityTools.mcrMethod
mcr(m::MCR, x, y)

Compute the association between x and y based on conditional probabilities of recurrence using the given MCR measure, where x and y can be either univariate timeseries or multivariate StateSpaceSets.

CausalityTools.min_inputs_varsMethod
min_inputs_vars(m::AssociationMeasure) → nmin::Int

Return the minimum number of variables that the measure can be computed for.

For example, CMIShannon requires 3 input variables.

CausalityTools.mutualinfoMethod
mutualinfo([measure::MutualInformation], m::ContingencyMatrix) → mi::Real

Estimate the mutual information between x and y, the variables corresponding to the columns and rows of the 2-dimensional contingency matrix m, respectively.

Estimates the discrete version of the given MutualInformation measure from its direct definition (double-sum), using the probabilities from a pre-computed ContingencyMatrix. If measure is not given, then the default is MIShannon().

CausalityTools.mutualinfoMethod
mutualinfo([measure::MutualInformation], est::DifferentialEntropyEstimator, x, y)

Estimate the mutual information measure between x and y by a sum of three entropy terms, without any bias correction, using any DifferentialEntropyEstimator compatible with multivariate data. See examples here. If measure is not given, then the default is MIShannon().

Note

DifferentialEntropyEstimators have their own base field which is not used here. Instead, this method creates a copy of est internally, where est.base is replaced by measure.e.base. Therefore, use measure to control the "unit" of the mutual information.

Estimators

Some MutualInformation measures can be computed using a DifferentialEntropyEstimator, provided it supports multivariate input data. These estimators compute mutual information as a sum of entropy terms (with different dimensions), without any bias correction.

| Estimator              | Principle         | MIShannon | MITsallisFuruichi | MITsallisMartin | MIRenyiJizba | MIRenyiSarbu |
|------------------------|-------------------|:---------:|:-----------------:|:---------------:|:------------:|:------------:|
| Kraskov                | Nearest neighbors | ✓         | x                 | x               | x            | x            |
| Zhu                    | Nearest neighbors | ✓         | x                 | x               | x            | x            |
| ZhuSingh               | Nearest neighbors | ✓         | x                 | x               | x            | x            |
| Gao                    | Nearest neighbors | ✓         | x                 | x               | x            | x            |
| Goria                  | Nearest neighbors | ✓         | x                 | x               | x            | x            |
| Lord                   | Nearest neighbors | ✓         | x                 | x               | x            | x            |
| LeonenkoProzantoSavani | Nearest neighbors | ✓         | x                 | x               | x            | x            |
CausalityTools.mutualinfoMethod
mutualinfo([measure::MutualInformation], est::MutualInformationEstimator, x, y)

Estimate the mutual information measure between x and y using the dedicated MutualInformationEstimator est. See examples here. If measure is not given, then the default is MIShannon().

Estimators

Dedicated MutualInformationEstimators are either discrete, continuous, or a mixture of both. Typically, these estimators apply bias correction.

| Estimator            | Type       | MIShannon |
|----------------------|------------|:---------:|
| GaussianMI           | Parametric | ✓         |
| KSG1                 | Continuous | ✓         |
| KSG2                 | Continuous | ✓         |
| GaoKannanOhViswanath | Mixed      | ✓         |
| GaoOhViswanath       | Continuous | ✓         |
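
A minimal sketch using a dedicated estimator; the keyword k = 10 for KSG2 is an arbitrary illustrative choice:

using CausalityTools
x, y = rand(1000), rand(1000)
# Shannon MI with the second Kraskov-Stögbauer-Grassberger estimator
mutualinfo(MIShannon(base = 2), KSG2(k = 10), x, y)
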
CausalityTools.mutualinfoMethod
mutualinfo([measure::MutualInformation], est::ProbabilitiesEstimator, x, y) → mi::Real ∈ [0, a]

Estimate the mutual information between x and y using the discrete version of the given measure, using the given ProbabilitiesEstimator est (which must accept multivariate data and have an implementation for marginal_encodings). See examples here. If measure is not given, then the default is MIShannon().

Estimators

The mutual information is computed as sum of three entropy terms, without any bias correction. The exception is when using Contingency; then the mutual information is computed using a ContingencyMatrix.

Joint and marginal probabilities are computed by jointly discretizing x and y using the approach given by est (using marginal_encodings), and obtaining marginal distributions from the joint distribution.

| Estimator           | Principle           | MIShannon | MITsallisFuruichi | MITsallisMartin | MIRenyiJizba | MIRenyiSarbu |
|---------------------|---------------------|:---------:|:-----------------:|:---------------:|:------------:|:------------:|
| Contingency         | Contingency table   | ✓         |                   |                 |              |              |
| CountOccurrences    | Frequencies         | ✓         |                   |                 |              |              |
| ValueHistogram      | Binning (histogram) | ✓         |                   |                 |              |              |
| SymbolicPermutation | Ordinal patterns    | ✓         |                   |                 |              |              |
| Dispersion          | Dispersion patterns | ✓         |                   |                 |              |              |
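
A minimal sketch using a probabilities estimator; the ordinal-pattern order m = 3 is an arbitrary illustrative choice:

using CausalityTools
x, y = rand(1000), rand(1000)
# Discrete Shannon MI from ordinal patterns of each marginal
mutualinfo(MIShannon(), SymbolicPermutation(m = 3), x, y)
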
CausalityTools.noise_brownianMethod
noise_brownian(n::Int; lo = - 1, hi = 1)
noise_brownian(d::Distribution, n::Int)

Generate a signal consisting of n steps of Brownian noise, generated as the cumulative sum of noise drawn from a uniform distribution on [lo, hi], normalised to zero mean and unit standard deviation. Optionally, a distribution d from which to sample can be provided.

Examples

# Based on uncorrelated uniform noise
noise_brownian(100)
noise_brownian(100, lo = -2, hi = 2)
noise_brownian(Uniform(-3, 3), 100)

# Based on uncorrelated Gaussian noise
μ, σ = 0, 2
noise_brownian(Normal(μ, σ), 100)
CausalityTools.noise_ugMethod
noise_ug(n::Int; μ = 0, σ = 1)

Generate a signal consisting of n steps of uncorrelated Gaussian noise from a normal distribution with mean μ and standard deviation σ.

CausalityTools.noise_uuMethod
noise_uu(n::Int, lo = - 1, hi = 1)

Generate a signal consisting of n steps of uncorrelated uniform noise from a uniform distribution on [lo, hi].

CausalityTools.nonlinear3dMethod
nonlinear3d(;u₀ = rand(3),
    σ₁ = 1.0, σ₂ = 1.0, σ₃ = 1.0,
    a₁ = 3.4, a₂ = 3.4, a₃ = 3.4,
    b₁ = 0.4, b₂ = 0.4, b₃ = 0.4,
    c₁₂ = 0.5, c₂₃ = 0.3, c₁₃ = 0.5) → DiscreteDynamicalSystem

A 3d nonlinear system with nonlinear couplings $x_1 \to x_2$, $x_2 \to x_3$ and $x_1 \to x_3$. Modified from [1].

Equations of motion

The equations of motion are

\[\begin{aligned} x_1(t+1) &= a_1 x_1 (1-x_1(t))^2 e^{-x_2(t)^2} + 0.4 \xi_{1}(t) \\ x_2(t+1) &= a_1 x_2 (1-x_2(t))^2 e^{-x_2(t)^2} + 0.4 \xi_{2}(t) + b x_1 x_2 \\ x_3(t+1) &= a_3 x_3 (1-x_3(t))^2 e^{-x_3(t)^2} + 0.4 \xi_{3}(t) + c x_{2}(t) + d x_{1}(t)^2. \end{aligned}\]

References

  1. Gourévitch, B., Le Bouquin-Jeannès, R., & Faucon, G. (2006). Linear and nonlinear causality between signals: methods, examples and neurophysiological applications. Biological Cybernetics, 95(4), 349–369.
CausalityTools.nontrivial_pegiunMethod
nontrivial_pegiun(;u₀ = rand(2), σ₁ = 0.1, σ₂ = 0.1,
    p₁ = 0.7, p₂ = 0.1, p₃ = 0.4, p₄ = 2.4, p₅ = 0.9, p₆ = 4, n = 100) → StateSpaceSet

A 2D discrete autoregressive system with nonlinear, nontrivial coupling from [1], which was also studied in [2]. The version implemented here allows for tweaking the parameters of the equations. The difference equations are

\[\begin{aligned} x(t+1) &= p_2 + p_3 x(t-2) + c_{yx}\dfrac{p_4 - p_5 y(t-3)}{1 + e^{-p_6 y(t-3)}} + \xi_1(t) \\ y(t+1) &= p_1 y(t) + \xi_2(t). \end{aligned}\]

Here, $\xi_{1,2}(t)$ are two independent normally distributed noise processes with zero mean and standard deviations $\sigma_1$ and $\sigma_2$. The $\xi_{1,2}(t)$ terms represent dynamical noise.

References

[1] Péguin-Feissolle, A., & Teräsvirta, T. (1999). A General Framework for Testing the Granger Noncausality Hypothesis. Universites d’Aix-Marseille II et III. https://www.amse-aixmarseille.fr/sites/default/files/_dt/greqam/99a42.pdf

[2] Chávez, M., Martinerie, J., & Le Van Quyen, M. (2003). Statistical assessment of nonlinear causality: application to epileptic EEG signals. Journal of Neuroscience Methods, 124(2), 113–128. doi:10.1016/s0165-0270(02)00367-9 https://www.sciencedirect.com/science/article/pii/S0165027002003679

CausalityTools.optimize_marginals_teFunction
optimize_marginals_te([scheme = OptimiseTraditional()], s, t, [c]) → EmbeddingTE

Optimize marginal embeddings for transfer entropy computation from source time series s to target time series t, conditioned on c if c is given, using the provided optimization scheme.

CausalityTools.optimize_marginals_teMethod
optimize_marginals_te(opt::OptimiseTraditional, s, t, [c]; exclude_source = false) → EmbeddingTE

Optimise the marginals for a transfer entropy analysis from source time series s to target time series t, potentially given a conditional time series c.

If exclude_source == true, then no optimisation is done for the source. This is useful for SurrogateTest, because most surrogate methods accept univariate time series, and if we embed the source and it becomes multidimensional, then we can't create surrogates. A future optimization is to do column-wise surrogate generation.

CausalityTools.paiMethod
pai(x, y, d, τ; w = 0, correspondence_measure = Statistics.cor) → Float64
pai(x, y, d, τ, bootstrap_method::Symbol; w = 0, correspondence_measure = Statistics.cor,
    method = :segment, L = ceil(Int, (length(x)-d*τ)*0.2), nreps = 100) → Vector{Float64}
This syntax is deprecated

This syntax is deprecated. It will continue to work for CausalityTools v1.X, but will be removed in CausalityTools v2. See here for updated syntax.

Compute the pairwise asymmetric inference (PAI; McCracken2014) between x and y. Returns the correspondence between original and cross mapped values (the default is ρ = correspondence_measure(y(t), ỹ(t) | M_xy)).

PAI is a modification of Sugihara2012's CCM algorithm: instead of using a completely out-of-sample embedding when trying to predict $y(t)$, values of both variables are included in the embedding used to make predictions. Specifically, PAI computes the correspondence between the values $y(t)$ and the cross-map estimated values $ỹ(t) | M_{xy}$, where the $\tilde{y}(t)$ are estimated using the embedding $M_{xy} = \{ ( y(t), x(t), x(t-\tau), x(t-2\tau), \ldots, x(t-(d - 1)\tau) ) \}$. Note: a (d+1)-dimensional embedding is used, rather than the d-dimensional embedding used for CCM. As for the CCM algorithm, the Theiler window w indicates how many temporal neighbors of the predictee are to be excluded during the nearest neighbor search (the default w = 0 excludes only the predictee itself, while w = 2 excludes the point itself plus its two nearest neighbors in time).

If bootstrap_method is specified, then nreps different bootstrapped estimates of correspondence_measure(y(t), ỹ(t) | M_x) are returned. The following bootstrap methods are available:

  • bootstrap_method = :random selects training sets of length L consisting of randomly selected points from the embedding $M_x$ (time ordering does not matter). This is method 3 from Luo2015, which critiqued the original Sugihara2012 methodology.
  • bootstrap_method = :segment selects training sets consisting of time-contiguous segments (each of length L) of embedding vectors in $M_x$ (time ordering matters). This is method 2 from Luo2015.
CausalityTools.pairwise_testMethod
pairwise_test(p::OCESelectedParents) → pairwise::Bool

If the parent set is empty, return true (a pairwise test should be performed). If the parent set is nonempty, return false (a conditional test should be performed).

CausalityTools.point_density!Method
point_density!(S₁, S₂, est::LocalLikelihood, xᵢ, bwᵢ,
    neighborsᵢ::AbstractStateSpaceSet{D}) where D

Estimate the density around the point xᵢ using a local likelihood estimator, which is a generalization of kernel density estimation. This is done by fitting a local Gaussian distribution around xᵢ from its local neighborhood (represented by the points neighborsᵢ). The bandwidth bwᵢ is given by the distance from xᵢ to its k-th nearest neighbor.

S₁ is a pre-allocated length-D vector which holds the means, and S₂ is a pre-allocated D-by-D matrix which holds the covariances. Both S₁ and S₂ are zeroed every time point_density! is called.

CausalityTools.predictMethod
predict(measure::CrossmapMeasure, t::AbstractVector, s::AbstractVector) → t̂ₛ, t̄, ρ
predict(measure::CrossmapMeasure, t̄::AbstractVector, S̄::AbstractStateSpaceSet) → t̂ₛ

Perform point-wise cross mappings between source embeddings and target time series according to the algorithm specified by the given cross-map measure (e.g. ConvergentCrossMapping or PairwiseAsymmetricInference).

  • First method: Jointly embeds the target t and source s time series (according to measure) to obtain a time-index aligned target timeseries and source embedding (which is now a StateSpaceSet). Then calls predict(measure, t̄, S̄) (the second method), and returns the predictions t̂ₛ, the observations t̄, and their correspondence ρ according to measure.
  • Second method: Returns a vector of predictions t̂ₛ (t̂ₛ := "predictions of t̄ based on the source embedding S̄"), where t̂ₛ[i] is the prediction for t̄[i]. It assumes pre-embedded data which have been correctly time-aligned using a joint embedding (see embed), i.e. such that t̄[i] and S̄[i] correspond to the same time index.

Description

For each i ∈ {1, 2, …, N} where N = length(t) == length(s), we make the prediction t̂[i] (an estimate of t[i]) based on a linear combination of D + 1 other points in t, where the selection of points and weights for the linear combination are determined by the D+1 nearest neighbors of the point S̄[i]. The details of point selection and weights depend on measure.

Note: Some CrossmapMeasures may define more general mapping procedures. If so, the algorithm is described in their docstring.
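
A minimal sketch of the first method; the embedding parameters d and τ for ConvergentCrossMapping are assumptions chosen for illustration:

using CausalityTools
t, s = rand(300), rand(300)
# Jointly embed, cross map, and return predictions, aligned observations and ρ
t̂ₛ, t̄, ρ = predict(ConvergentCrossMapping(d = 2, τ = -1), t, s)
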

CausalityTools.predictive_asymmetryFunction
predictive_asymmetry(estimator::TransferEntropyEstimator, ηs; s, t, [c],
    dTf = 1, dT = 1, dS = 1, τT = -1, τS = -1, [dC = 1, τC = -1],
    normalize::Bool = false, f::Real = 1.0, base = 2) → Vector{Float64}

Compute the predictive asymmetry Haaga2020 𝔸(s → t) for source time series s and target time series t over prediction lags ηs, using the given estimator and embedding parameters dTf, dT, dS, τT, τS (see also EmbeddingTE).

If a conditional time series c is provided, compute 𝔸(s → t | c). Then, dC and τC control the embedding dimension and embedding lag for the conditional variable.

Returns

Returns a vector containing the predictive asymmetry for each value of ηs.

Normalization (hypothesis test)

If normalize == true, then compute the normalized predictive asymmetry 𝒜. In this case, for each $\eta$ in ηs, compute 𝒜(η) by normalizing 𝔸(η) to some fraction f of the mean transfer entropy over prediction lags $-\eta, ..., \eta$ (excluding lag 0). Haaga2020 uses a normalization with f=1.0 as a built-in hypothesis test, avoiding more computationally costly surrogate testing.

Estimators

Any estimator that works for transferentropy will also work with predictive_asymmetry. Check the online documentation for compatible estimators.

Examples

using CausalityTools
# Some example time series
x, y = rand(100), rand(100)
# 𝔸(x → y) over prediction lags 1:5
𝔸reg = predictive_asymmetry(VisitationFrequency(RectangularBinning(5)), 1:5; s = x, t = y)
Experimental!

This is a method that does not yet appear in a peer-reviewed scientific journal. Feel free to use, but consider it experimental for now. It will reappear in a 2.X release in new form once published in a peer-reviewed journal.

CausalityTools.probabilityMethod
probability(k::MultivariateKernel, data::AbstractStateSpaceSet, xᵢ; h) → p̂(xᵢ)

Compute p̂(xᵢ), the kernel density estimate of the probability p(xᵢ), given some (multivariate) data and the query point xᵢ.

This is fast if xᵢ is an SVector.

CausalityTools.rank_transformationMethod
rank_transformation(x::AbstractVector)
rank_transformation(x::AbstractStateSpaceSet) → ranks::NTuple{D, Vector}

Rank-transform each variable/column of the length-n D-dimensional StateSpaceSet x and return the rank-transformed variables as a D-tuple of length-n vectors.

Returns the unscaled ranks. Divide by n to get an approximation to the empirical cumulative distribution function (ECDF) of x.

Description

Modulo division by n, rank_transformation does roughly the same as naively computing the ECDF as

[count(x .<= xᵢ) for xᵢ in x] / length(x)

but an order of magnitude faster and with roughly three orders of magnitude fewer allocations. The increased efficiency of this function relative to naively computing the ECDF is because it uses sorting of the input data to determine ranks, arbitrarily breaking ties according to the sorting algorithm. Rank ties can therefore never occur, and equal values are assigned different but close ranks. To preserve ties, which you might want to do for example when dealing with categorical or integer-valued data, use (the much slower) empcdf.

CausalityTools.rmcdMethod
rmcd(measure::RMCD, x, y)
rmcd(measure::RMCD, x, y, [z, ...])

Estimate the recurrence-based measure of dependence between x and y, conditional on z if given.

Parameters for recurrence matrix estimation are given as a RMCD instance. Inputs x, y, z can be either univariate timeseries or multivariate StateSpaceSets.

CausalityTools.s_measureMethod
s_measure(measure::SMeasure, x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet)

Compute the SMeasure from source x to target y.

CausalityTools.s_measureMethod
s_measure(measure::SMeasure, x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet) → s ∈ [0, 1]

Compute the given measure to quantify the directional dependence between univariate/multivariate time series x and y.

Returns a scalar s where s = 0 indicates independence between x and y, and higher values indicate synchronization between x and y, with complete synchronization for s = 1.0.

Example

using CausalityTools

# A two-dimensional Ulam lattice map
sys = ulam(2)

# Sample 1000 points after discarding 5000 transients
orbit = trajectory(sys, 1000, Ttr = 5000)
x, y = orbit[:, 1], orbit[:, 2]

# 4-dimensional embedding for `x`, 5-dimensional embedding for `y`
m = SMeasure(dx = 4, τx = 3, dy = 5, τy = 1)
s_measure(m, x, y)
CausalityTools.select_parentsMethod
select_parents(alg::OCE, x)

The parent selection step of the OCE algorithm, which identifies the parents of each xᵢ ∈ x, assuming that x is integer-indexable, i.e. x[i] yields the i-th variable.

CausalityTools.skeletonMethod
skeleton(algorithm::PC, x) → (g, s)

Infer the skeleton graph for the variables x using the provided algorithm. x must be some iterable where the i-th variable can be accessed as x[i].

Return a tuple of the undirected skeleton graph g::SimpleGraph and the separating sets s::Dict{SimpleEdge, Vector{Int}}.

CausalityTools.skeleton_conditional!Method
skeleton_conditional!(alg::PC, graph, x, conditional_test::IndependenceTest)

Thin the skeleton graph, where each vertex is represented by the data x[i], by using conditional_test. Whenever x[i] ⫫ x[j] | x[S] for some set of variables not including i and j, the edge between i and j is removed, and S is stored in the separating set for i and j. This is essentially algorithm 3.2 in Colombo2014, for the cases 𝓁 >= 1.

Modifies graph in-place.

CausalityTools.skeleton_unconditional!Method
skeleton_unconditional!(alg::PC, g::SimpleGraph, x)

Perform pairwise independence tests between all vertices in the graph, where each vertex is represented by the data x[i], and remove any edges in graph where adjacent vertices are found to be independent according to the given independence test. The null hypothesis of independence is rejected whenever the p-value is below α.

This is essentially algorithm 3.2 in Colombo2014, but only considering the case 𝓁 = 0.

Modifies graph in-place.

If alg.pairwise_test is a directed test, then edges are considered one-by-one. If alg.pairwise_test is not a directed test, then edges (X → Y, Y → X) are considered simultaneously.

CausalityTools.systemMethod
system(definition::DiscreteDefinition) → s::DiscreteDynamicalSystem
system(definition::ContinuousDefinition) → s::ContinuousDynamicalSystem

Initialize a dynamical system from definition.

CausalityTools.te_embedMethod
te_embed(source::AbstractVector{T}, target::AbstractVector{T}, p::EmbeddingTE) → (points, vars, τs)
te_embed(source::AbstractVector{T}, target::AbstractVector{T}, cond::AbstractVector{T}, p::EmbeddingTE) → (points, vars, τs)

Generalised delay reconstruction of source and target (and cond if provided) for transfer entropy computation using embedding parameters provided by the EmbeddingTE instance p.

Returns a tuple of the embedded points, vars (a TEVars instance that keeps track of which variables of the embedding belong to which marginals of the reconstruction; indices are: source = 1, target = 2, cond = 3), and a tuple τs, which stores the lags for each variable of the reconstruction.

CausalityTools.tolevels!Method
tolevels!(levels, x) → levels, dict
tolevels(x) → levels, dict

Apply the bijective map $f : \mathcal{Q} \to \mathbb{N}^+$ to each x[i] and store the result in levels[i], where levels is a pre-allocated integer vector such that length(x) == length(levels).

$\mathcal{Q}$ can be any space, and each $q \in \mathcal{Q}$ is mapped to a unique integer in the range 1, 2, …, length(unique(x)). This is useful for integer-encoding categorical data such as strings, or other complex data structures.

The single-argument method allocates a levels vector internally.

dict gives the inverse mapping.

CausalityTools.transferentropyFunction
transferentropy([measure::TEShannon], est, s, t, [c])
transferentropy(measure::TERenyiJizba, est, s, t, [c])

Estimate the transfer entropy $TE^*(S \to T)$ or $TE^*(S \to T | C)$ if c is given, using the provided estimator est, where $*$ indicates the given measure. If measure is not given, then TEShannon(; base = 2) is the default.

Arguments

  • measure: The transfer entropy measure, e.g. TEShannon or TERenyi, which dictates which formula is computed. Embedding parameters are stored in measure.embedding, and are represented by an EmbeddingTE instance. If calling transferentropy without giving measure, then the embedding is optimized by finding suitable delay embedding parameters using the "traditional" approach from DynamicalSystems.jl.
  • s: The source timeseries.
  • t: The target timeseries.
  • c: Optional. A conditional timeseries.

Description

The Shannon transfer entropy is defined as $TE^S(S \to T | C) := I^S(T^+; S^- | T^-, C^-)$, where $I^S(T^+; S^- | T^-, C^-)$ is CMIShannon, and marginals for the CMI are constructed as described in EmbeddingTE. The definition is analogous for TERenyiJizba.

If s, t, and c are univariate timeseries, then the marginal embedding variables $T^+$ (target future), $T^-$ (target present/past), $S^-$ (source present/past) and $C^-$ (present/past of conditioning variables) are constructed by first jointly embedding s, t and c with relevant delay embedding parameters, then subsetting relevant columns of the embedding.

Since estimates of $TE^*(S \to T)$ and $TE^*(S \to T | C)$ are just special cases of conditional mutual information, where the input data are marginals of a particular form of delay embedding, any combination of variables, e.g. $S = (A, B)$, $T = (C, D)$, $C = (D, E, F)$, are valid inputs (given as StateSpaceSets). In practice, however, s, t and c are most often timeseries, and if s, t and c are StateSpaceSets, it is assumed that the data are pre-embedded and the embedding step is skipped.

Compatible estimators

transferentropy is just a simple wrapper around condmutualinfo that constructs an appropriate delay embedding from the input data before CMI is estimated. Consequently, any estimator that can be used for ConditionalMutualInformation is, in principle, also a valid transfer entropy estimator. TransferEntropyEstimators are the exception - they compute transfer entropy directly.

| Estimator              | Type                                   | Principle           | TEShannon | TERenyiJizba |
|------------------------|----------------------------------------|---------------------|:---------:|:------------:|
| CountOccurrences       | ProbabilitiesEstimator                 | Frequencies         | ✓         |              |
| ValueHistogram         | ProbabilitiesEstimator                 | Binning (histogram) | ✓         |              |
| Dispersion             | ProbabilitiesEstimator                 | Dispersion patterns | ✓         |              |
| Kraskov                | DifferentialEntropyEstimator           | Nearest neighbors   | ✓         |              |
| Zhu                    | DifferentialEntropyEstimator           | Nearest neighbors   | ✓         |              |
| ZhuSingh               | DifferentialEntropyEstimator           | Nearest neighbors   | ✓         |              |
| Gao                    | DifferentialEntropyEstimator           | Nearest neighbors   | ✓         |              |
| Goria                  | DifferentialEntropyEstimator           | Nearest neighbors   | ✓         |              |
| Lord                   | DifferentialEntropyEstimator           | Nearest neighbors   | ✓         |              |
| LeonenkoProzantoSavani | DifferentialEntropyEstimator           | Nearest neighbors   | ✓         |              |
| GaussianMI             | MutualInformationEstimator             | Parametric          | ✓         |              |
| KSG1                   | MutualInformationEstimator             | Continuous          | ✓         |              |
| KSG2                   | MutualInformationEstimator             | Continuous          | ✓         |              |
| GaoKannanOhViswanath   | MutualInformationEstimator             | Mixed               | ✓         |              |
| GaoOhViswanath         | MutualInformationEstimator             | Continuous          | ✓         |              |
| FPVP                   | ConditionalMutualInformationEstimator  | Nearest neighbors   | ✓         |              |
| MesnerShalizi          | ConditionalMutualInformationEstimator  | Nearest neighbors   | ✓         |              |
| Rahimzamani            | ConditionalMutualInformationEstimator  | Nearest neighbors   | ✓         |              |
| Zhu1                   | TransferEntropyEstimator               | Nearest neighbors   | ✓         |              |
| Lindner                | TransferEntropyEstimator               | Nearest neighbors   | ✓         |              |
| Hilbert                | TransferEntropyEstimator               | Hilbert transform   | ✓         |              |
| SymbolicTransferEntropy| TransferEntropyEstimator               | Ordinal patterns    | ✓         |              |
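
A minimal sketch; the binning resolution is an arbitrary illustrative choice:

using CausalityTools
s, t, c = rand(2000), rand(2000), rand(2000)
# Shannon transfer entropy s → t with a value-histogram discretization
transferentropy(TEShannon(base = 2), ValueHistogram(RectangularBinning(4)), s, t)
# Conditional transfer entropy s → t | c
transferentropy(TEShannon(base = 2), ValueHistogram(RectangularBinning(4)), s, t, c)
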
CausalityTools.ulamFunction
ulam(D::Int = 10; u₀ = rand(D), ε::Real = 0.10) → DiscreteDynamicalSystem

A lattice of D unidirectionally coupled ulam maps Schreiber2000 defined as

\[x^{m}_{t+1} = f(\epsilon x^{m-1}_{t} + (1 - \epsilon) x_{t}^{m}),\]

where $m = 1, 2, \ldots, D$ and $f(x) = 2 - x^2$. In this system, information transfer happens only in the direction of increasing $m$.

CausalityTools.update_states!Method
update_states!(s::LaggedDiscreteDefinition, xnew::SVector{D})

Given xnew (the new current state of a system), update the past states of s.

CausalityTools.var1Method
var1(x, p, n) → DiscreteDynamicalSystem

Initialise a discrete vector autoregressive system where X₁ → X₂ → X₃.

CausalityTools.verdesMethod
verdes(;u₀ = rand(3), ωy = 315, ωz = 80,
    σx = 0.0, σy = 0.0, σz = 0.0) → DiscreteDynamicalSystem

Initialise a 3D system where the response X is a highly nonlinear combination of Y and Z Verdes2005. The forcings Y and Z involve sines and cosines and have different periods, which are controlled by ωy and ωz.

The equations of motion are

\[\begin{aligned} x(t+1) &= \dfrac{y(t)(18y(t) - 27y(t)^2 + 10)}{2} + z(t)(1-z(t)) + \eta_x \\ y(t+1) &= \dfrac{1 - \cos\left(\dfrac{2\pi t}{\omega_y}\right)}{2} + \eta_y \\ z(t+1) &= \dfrac{1 - \sin\left(\dfrac{2\pi t}{\omega_z}\right)}{2} + \eta_z \end{aligned}\]

where ηx, ηy and ηz are Gaussian noise terms with mean 0 and standard deviations σx, σy and σz, respectively.

DelayEmbeddings.embedMethod
embed(measure::CrossmapMeasure, target::AbstractVector,
    sources::AbstractVector...) → emb::StateSpaceSet{D}

Jointly embed the input vector target, together with one or more vectors s ∈ sources, according to the given CrossmapMeasure.

This produces emb, a D-dimensional StateSpaceSet where

  • The last column is always the non-lagged target variable. Typically, this is the variable we're trying to predict.
  • The D-1 first columns are the (non)lagged versions of each source time series s ∈ sources. Typically, emb[:, 1:D-1] is the subspace in which neighborhood searches are done, which forms the basis of cross-map predictions.
StatsAPI.pvalueMethod
pvalue(test::CorrTest, z, c::Int, n::Int)

Compute the two-sided p-value for the test of the partial correlation coefficient being zero, where c is the cardinality of the conditioning set and n is the number of samples.

  • Paluš2018: Paluš, M., Krakovská, A., Jakubík, J., & Chvosteková, M. (2018). Causality, dynamical systems and the arrow of time. Chaos: An Interdisciplinary Journal of Nonlinear Science, 28(7), 075307. http://doi.org/10.1063/1.5019944
  • Póczos2012: Póczos, B., & Schneider, J. (2012, March). Nonparametric estimation of conditional information and divergences. In Artificial Intelligence and Statistics (pp. 914-923). PMLR.
  • Székely2007: Székely, G. J., Rizzo, M. L., & Bakirov, N. K. (2007). Measuring and testing dependence by correlation of distances. The Annals of Statistics, 35(6), 2769-2794.
  • Székely2014: Székely, G. J., & Rizzo, M. L. (2014). Partial distance correlation with methods for dissimilarities.
  • Krakovská2018: Krakovská, A., Jakubík, J., Chvosteková, M., Coufal, D., Jajcay, N., & Paluš, M. (2018). Comparison of six methods for the detection of causality in a bivariate time series. Physical Review E, 97(4), 042207.
  • Amigó2018: Amigó, José M., & Hirata, Y. (2018). Detecting directional couplings from multivariate flows by the joint distance distribution. Chaos: An Interdisciplinary Journal of Nonlinear Science, 28(7), 075302.
  • Pál2010: Pál, D., Póczos, B., & Szepesvári, C. (2010). Estimation of Rényi entropy and mutual information based on generalized nearest-neighbor graphs. Advances in Neural Information Processing Systems, 23.
  • Chávez2003: Chávez, M., Martinerie, J., & Le Van Quyen, M. (2003). Statistical assessment of nonlinear causality: application to epileptic EEG signals. Journal of Neuroscience Methods, 124(2), 113–128. doi:10.1016/s0165-0270(02)00367-9