`CausalityTools.CausalityTools`

— Module

**CausalityTools**

CausalityTools.jl is a package for quantifying associations and dynamical coupling between datasets, independence testing, and causal inference.

All further information is provided in the documentation, which you can either find online or build locally by running the `docs/make.jl` file.

**Key features**

- Association measures from conventional statistics, information theory, and dynamical systems theory: for example distance correlation, mutual information, transfer entropy, convergent cross mapping, and many more!
- A dedicated API for independence testing, which comes with automatic compatibility with every measure-estimator combination you can think of. For example, we offer the generic `SurrogateTest`, which is fully compatible with TimeseriesSurrogates.jl, and the `LocalPermutationTest` for conditional independence testing.
- A dedicated API for causal network inference based on these measures and independence tests.

**Installation**

To install the package, run `import Pkg; Pkg.add("CausalityTools")`.
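As a quick, hedged sketch of the intended workflow (using the `distance_correlation`, `SurrogateTest`, `MIShannon` and `KSG1` APIs documented below; exact keyword defaults may differ), a minimal session might look like this:

```
using CausalityTools

x, y = rand(1000), rand(1000)

# Raw association measure: the distance correlation coefficient.
distance_correlation(x, y)

# Formal independence test: compare the observed mutual information
# against a null distribution generated from surrogate data.
independence(SurrogateTest(MIShannon(), KSG1()), x, y)
```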

`CausalityTools.AR1Bidir`

— Type

```
AR1Bidir <: DiscreteDefinition
AR1Bidir(;xi = [0.2, 0.3], a₁ = 0.5, b₁ = 0.7, c_xy = 0.1, c_yx = 0.2,
nx = Normal(0, 0.3), ny = Normal(0, 0.3),
rng::R = Random.default_rng())
```

A system consisting of two mutually coupled first order autoregressive processes.

**Equations of motion**

\[\begin{aligned} x(t+1) &= a_{1}x(t) + c_{yx}y(t) + \epsilon_{x} \\ y(t+1) &= b_{1}y(t) + c_{xy}x(t) + \epsilon_{y} \end{aligned}\]

where, at each time step, $\epsilon_{x}$ and $\epsilon_{y}$ are drawn from independent normal distributions `nx` and `ny`, respectively.
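For intuition, here is a minimal hand-rolled sketch of iterating these equations directly, with the `Normal(0, 0.3)` defaults for `nx` and `ny` approximated by `0.3 * randn(rng)` (the packaged `system`/`trajectory` machinery is the supported route):

```
using Random

function ar1_bidir(n; a₁ = 0.5, b₁ = 0.7, c_xy = 0.1, c_yx = 0.2,
        xi = [0.2, 0.3], rng = Random.default_rng())
    x, y = zeros(n), zeros(n)
    x[1], y[1] = xi
    for t in 1:n-1
        x[t+1] = a₁ * x[t] + c_yx * y[t] + 0.3 * randn(rng)  # ϵₓ ~ Normal(0, 0.3)
        y[t+1] = b₁ * y[t] + c_xy * x[t] + 0.3 * randn(rng)  # ϵ_y ~ Normal(0, 0.3)
    end
    return x, y
end

x, y = ar1_bidir(1000)
```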

`CausalityTools.AR1Unidir`

— Type

```
AR1Unidir <: DiscreteDefinition
AR1Unidir(; ui = [0.2, 0.3], a₁ = 0.90693, b₁ = 0.40693, c_xy = 0.5,
nx = Normal(0, 0.40662), ny = Normal(0, 0.40662),
rng::R = Random.default_rng())
```

A bivariate, order-one autoregressive model, where $x \to y$ (Paluš et al., 2018)[Paluš2018].

**Equations of motion**

\[\begin{aligned} x(t+1) &= a_1 x(t) + \xi_{1} \\ y(t+1) &= b_1 y(t) - c_{xy} x(t) + \xi_{2}, \end{aligned}\]

where $\xi_{1}$ and $\xi_{2}$ are drawn from normal distributions `nx` and `ny` at each iteration.

`CausalityTools.Amplitude`

— Type

`Amplitude <: InstantaneousSignalProperty`

Indicates that the instantaneous amplitudes of a signal should be used.

`CausalityTools.Anishchenko`

— Type

```
Anishchenko <: DiscreteDefinition
Anishchenko(; u₀ = rand(2), α = 3.277, s = 0.1, ω = 0.5*(sqrt(5) - 1)) → DiscreteDynamicalSystem
```

Initialise the system defined by eq. 13 in [Anishchenko1998], which can give strange, nonchaotic attractors.

**Equations of motion**

\[\begin{aligned} x(t+1) &= \alpha (1-s \cos (2 \pi \phi(t) )) \cdot x(t)(1-x(t)) \\ \phi(t+1) &= (\phi(t) + \omega ) \mod{1} \end{aligned}\]

`CausalityTools.AssociationMeasure`

— Type

`AssociationMeasure`

The supertype of all association measures.

`CausalityTools.BandwidthRule`

— Type

The supertype for all kernel density bandwidth rules.

`CausalityTools.CEShannon`

— Type

```
CEShannon <: ConditionalEntropy
CEShannon(; base = 2,)
```

The `Shannon` conditional entropy measure.

**Discrete definition**

**Sum formulation**

The conditional entropy between discrete random variables $X$ and $Y$ with finite ranges $\mathcal{X}$ and $\mathcal{Y}$ is defined as

\[H^{S}(X | Y) = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} p(x, y) \log(p(x | y)).\]

This is the definition used when calling `entropy_conditional` with a `ContingencyMatrix`.

**Two-entropies formulation**

Equivalently, the following difference of entropies holds:

\[H^S(X | Y) = H^S(X, Y) - H^S(Y),\]

where $H^S(\cdot)$ and $H^S(\cdot, \cdot)$ are the `Shannon` entropy and Shannon joint entropy, respectively. This is the definition used when calling `entropy_conditional` with a `ProbabilitiesEstimator`.

**Differential definition**

The differential conditional Shannon entropy is analogously defined as

\[h^S(X | Y) = h^S(X, Y) - h^S(Y),\]

where $h^S(\cdot)$ and $h^S(\cdot, \cdot)$ are the `Shannon` differential entropy and Shannon joint differential entropy, respectively. This is the definition used when calling `entropy_conditional` with a `DifferentialEntropyEstimator`.
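Both formulations are easy to check numerically. Below is a small self-contained sketch (a hypothetical toy pmf, not the package's `entropy_conditional` implementation) that computes the discrete $H^S(X|Y)$ both ways, in bits:

```
# Joint pmf p[x, y] over a 2×2 outcome space (hypothetical numbers).
p = [0.2 0.1; 0.3 0.4]

# Shannon entropy (base 2) of any pmf-like collection.
H(q) = -sum(qᵢ * log2(qᵢ) for qᵢ in q if qᵢ > 0)

# Two-entropies formulation: H(X | Y) = H(X, Y) - H(Y).
py = sum(p, dims = 1)            # marginal pmf of Y (sum out X)
H_cond = H(p) - H(py)

# Sum formulation: -∑ p(x, y) log2 p(x | y). Both agree.
H_direct = -sum(p[i, j] * log2(p[i, j] / py[j])
    for i in axes(p, 1), j in axes(p, 2) if p[i, j] > 0)
```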

`CausalityTools.CETsallisAbe`

— Type

```
CETsallisAbe <: ConditionalEntropy
CETsallisAbe(; base = 2, q = 1.5)
```

Abe & Rajagopal's discrete Tsallis conditional entropy measure [Abe2001].

**Definition**

Abe & Rajagopal's Tsallis conditional entropy between discrete random variables $X$ and $Y$ with finite ranges $\mathcal{X}$ and $\mathcal{Y}$ is defined as

\[H_q^{T_A}(X | Y) = \dfrac{H_q^T(X, Y) - H_q^T(Y)}{1 + (1-q)H_q^T(Y)},\]

where $H_q^T(\cdot)$ and $H_q^T(\cdot, \cdot)$ are the `Tsallis` entropy and joint Tsallis entropy, respectively.

`CausalityTools.CETsallisFuruichi`

— Type

```
CETsallisFuruichi <: ConditionalEntropy
CETsallisFuruichi(; base = 2, q = 1.5)
```

Furuichi (2006)'s discrete Tsallis conditional entropy measure.

**Definition**

Furuichi's Tsallis conditional entropy between discrete random variables $X$ and $Y$ with finite ranges $\mathcal{X}$ and $\mathcal{Y}$ is defined as

\[H_q^T(X | Y) = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} p(x, y)^q \log_q(p(x | y)),\]

when $q \neq 1$. For $q = 1$, $H_q^T(X | Y)$ reduces to the Shannon conditional entropy:

\[H_{q=1}^T(X | Y) = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} p(x, y) \log(p(x | y)).\]

`CausalityTools.CMIRenyiJizba`

— Type

`CMIRenyiJizba <: ConditionalMutualInformation`

The Rényi conditional mutual information $I_q^{R_{J}}(X; Y | Z)$ defined in [Jizba2012].

**Usage**

- Use with `independence` to perform a formal hypothesis test for conditional dependence.
- Use with `condmutualinfo` to compute the raw conditional mutual information.

**Definition**

\[I_q^{R_{J}}(X; Y | Z) = I_q^{R_{J}}(X; Y, Z) - I_q^{R_{J}}(X; Z),\]

where $I_q^{R_{J}}(X; Z)$ is the `MIRenyiJizba` mutual information.

`CausalityTools.CMIRenyiPoczos`

— Type

`CMIRenyiPoczos <: ConditionalMutualInformation`

The differential Rényi conditional mutual information $I_q^{R_{P}}(X; Y | Z)$ defined in (Póczos & Schneider, 2012)[Póczos2012].

**Usage**

- Use with `independence` to perform a formal hypothesis test for conditional dependence.
- Use with `condmutualinfo` to compute the raw conditional mutual information.

**Definition**

\[\begin{align*} I_q^{R_{P}}(X; Y | Z) &= \dfrac{1}{q-1} \log \int \int \int \dfrac{p_Z(z) p_{X, Y | Z}^q(x, y | z)}{\left( p_{X|Z}(x|z) p_{Y|Z}(y|z) \right)^{q-1}} \, dx \, dy \, dz \\ &= \dfrac{1}{q-1} \log \mathbb{E}_{(X, Y, Z) \sim p_{X, Y, Z}} \left[ \dfrac{p_{X, Z}^{1-q}(X, Z) p_{Y, Z}^{1-q}(Y, Z) }{p_{X, Y, Z}^{1-q}(X, Y, Z) p_Z^{1-q}(Z)} \right] \end{align*}\]

`CausalityTools.CMIRenyiSarbu`

— Type

```
CMIRenyiSarbu <: ConditionalMutualInformation
CMIRenyiSarbu(; base = 2, definition = CMIRenyiSarbuSarbu())
```

The Rényi conditional mutual information from [Sarbu2014].

**Usage**

- Use with `independence` to perform a formal hypothesis test for conditional dependence.
- Use with `condmutualinfo` to compute the raw conditional mutual information.

**Discrete description**

Assume we observe three discrete random variables $X$, $Y$ and $Z$. Sarbu (2014) defines discrete conditional Rényi mutual information as the conditional Rényi $\alpha$-divergence between the conditional joint probability mass function $p(x, y | z)$ and the product of the conditional marginals, $p(x |z) \cdot p(y|z)$:

\[I_q^R(X; Y | Z) = \dfrac{1}{q-1} \sum_{z \in Z} p(Z = z) \log \left( \sum_{x \in X}\sum_{y \in Y} \dfrac{p(x, y|z)^q}{\left( p(x|z)\cdot p(y|z) \right)^{q-1}} \right)\]

See also: `condmutualinfo`.

`CausalityTools.CMIShannon`

— Type

```
CMIShannon <: ConditionalMutualInformation
CMIShannon(; base = 2)
```

The Shannon conditional mutual information (CMI) $I^S(X; Y | Z)$.

**Usage**

- Use with `independence` to perform a formal hypothesis test for conditional dependence.
- Use with `condmutualinfo` to compute the raw conditional mutual information.

**Supported definitions**

Consider random variables $X \in \mathbb{R}^{d_X}$ and $Y \in \mathbb{R}^{d_Y}$, given $Z \in \mathbb{R}^{d_Z}$. The Shannon conditional mutual information is defined as

\[\begin{align*} I(X; Y | Z) &= H^S(X, Z) + H^S(Y, Z) - H^S(X, Y, Z) - H^S(Z) \\ &= I^S(X; Y, Z) - I^S(X; Z) \end{align*},\]

where $I^S(\cdot; \cdot)$ is the Shannon mutual information `MIShannon`, and $H^S(\cdot)$ is the `Shannon` entropy.

Differential Shannon CMI is obtained by replacing the entropies by differential entropies.

See also: `condmutualinfo`.
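Given a joint pmf for $(X, Y, Z)$, the four-entropies form above can be sketched directly (hypothetical toy pmf; the package's discrete estimators compute this from counts):

```
# Joint pmf p[x, y, z] (hypothetical, normalized random numbers).
p = rand(2, 3, 2); p ./= sum(p)

H(q) = -sum(qᵢ * log2(qᵢ) for qᵢ in q if qᵢ > 0)   # Shannon entropy, bits

H_XZ  = H(sum(p, dims = 2))       # H(X, Z): sum out Y
H_YZ  = H(sum(p, dims = 1))       # H(Y, Z): sum out X
H_XYZ = H(p)                      # H(X, Y, Z)
H_Z   = H(sum(p, dims = (1, 2)))  # H(Z)

I_XY_given_Z = H_XZ + H_YZ - H_XYZ - H_Z   # ≥ 0 up to round-off
```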

`CausalityTools.ChaoticMaps3`

— Type

```
ChaoticMaps3() <: DiscreteDefinition
ChaoticMaps3(; ui = [0.2, 0.1, 0.3], r = 3.4, c_xy = 0.5, c_xz = 0.5, c_yz = 0.3)
```

A model consisting of three coupled 1D maps, where $x \to y$ and $x \to z$ [Chen2004].

**Equations of motion**

\[\begin{aligned} x(t) &= r x(t-1)( 1 - x(t-1)^2 ) e^{-x(t-1)^2} \\ y(t) &= r y(t-1)( 1 - y(t-1)^2 ) e^{-y(t-1)^2} + c_{xy} x(t-1) \\ z(t) &= r z(t-1)( 1 - z(t-1)^2 ) e^{-z(t-1)^2} + c_{xz} x(t-1) + c_{yz} y(t-1) \end{aligned}\]

The parameters `r`, `c_xy` and `c_yz` do not appear in the original paper, but are added here for explorative purposes.

`CausalityTools.ChaoticNoisyLinear2`

— Type

```
ChaoticNoisyLinear2 <: DiscreteDefinition
ChaoticNoisyLinear2(; xi = [0.1, 0.2], c = 0.5,
nx = Normal(0, 0.05), ny = Normal(0, 0.05),
rng = Random.default_rng())
```

A bivariate system of two chaotic maps that are linearly coupled from `x → y` with coupling strength `c`.

**Definition**

\[\begin{align*} x(t+1) &= 3.4 x(t) (1 - x(t)^2) e^{-x(t)^2} + 0.8x(t-1) + \xi_x \\ y(t+1) &= 3.4 y(t) (1 - y(t)^2) e^{-y(t)^2} + 0.8y(t-1) + \xi_y + c x(t-2) \end{align*}\]

Process noise $\xi_x$ and $\xi_y$ is drawn at each iteration from `nx` and `ny`.

`CausalityTools.ChuaCircuitsBidir6`

— Type

```
ChuaCircuitsBidir6 <: ContinuousDefinition
ChuaCircuitsBidir6(;u₀ = [0.1, 0.1, 0.2, 0.15, 0.15, 0.22],
α₁ = 7.0, α₂ = 7.0, β₁ = 14.286, β₂ = 14.286,
F₁ = 1.5, F₂ = 1.5, ω₁ = 3.0, ω₂ = 3.0,
n1 = Normal(0, 0.1),
n2 = Normal(0, 0.1),
c12 = 0.1, c21 = 0.1, m₀ = -1/7, m₁ = 2/7)
```

Initialize a bidirectionally coupled system consisting of two driven Chua circuits, X₁ and X₂ [Murali1993].

**Description**

The subsystems are mutually coupled by a linear resistor, where `c12` controls the influence of X₁ on X₂, and `c21` controls the influence of X₂ on X₁. The parameters for the subsystems are set equal to each other, as in the original paper, but can here be tuned individually for each subsystem.

\[\begin{align*} \dfrac{dx_1}{dt} &= \alpha_1(y_1 - h(x_1)) - \alpha_1 c_{21}(x_1 - x_2) \\ \dfrac{dy_1}{dt} &= x_1 - y_1 + z_1 \\ \dfrac{dz_1}{dt} &= -\beta_1 y_1 + F_1 \sin(\omega_1 t) + \epsilon_1 \\ \dfrac{dx_2}{dt} &= \alpha_2 (y_2 - h(x_2)) - \alpha_2 c_{12}(x_2 - x_1) \\ \dfrac{dy_2}{dt} &= x_2 - y_2 + z_2 \\ \dfrac{dz_2}{dt} &= -\beta_2 y_2 + F_2 \sin(\omega_2 t) + \epsilon_2, \end{align*}\]

where $h(x) = m_1 x + 0.5(m_0 - m_1)(|x+1| - |x - 1|)$ and $\epsilon_1, \epsilon_2$ are noise terms that, at each integration step, are drawn independently from the normal distributions `n1` and `n2`, respectively.

`CausalityTools.ChuaScrollSine3`

— Type

```
ChuaScrollSine3 <: ContinuousDefinition
ChuaScrollSine3(; xi = [0.1, 0.2, 0.3],
α = 10.814, β = 14, γ = 0, a = 1.3, b = 0.11, c = 2,
nx = Normal(0.0, 0.01),
ny = Normal(0.0, 0.01),
nz = Normal(0.0, 0.01))
```

An adjusted Chua system giving rise to n-scroll attractors [Tang2001].

**Description**

The dynamics is generated by the following vector field

\[\begin{align*} \dot{x} &= \alpha (y - f(x)) + \eta_x \\ \dot{y} &= x - y + z + \eta_y \\ \dot{z} &= -\beta y - \gamma z + \eta_z \end{align*}\]

where $\eta_x$, $\eta_y$, and $\eta_z$ are drawn independently from the normal distributions `nx`, `ny` and `nz` at each iteration.

$f(x)$ is given by the following conditions:

```
# Piecewise definition of f(x), with the divisions parenthesized as b*pi/(2*a),
# matching the n-scroll sine function bπ/(2a) from the cited paper.
function f(x; a, b, c)
    n::Int = c + 1
    if x >= 2*a*c
        fx = (b*pi/(2*a))*(x - 2*a*c)
    elseif -2*a*c < x < 2*a*c
        d = ifelse(isodd(n), pi, 0)
        fx = -b*sin((pi*x/(2*a)) + d)
    else # x <= -2*a*c
        fx = (b*pi/(2*a))*(x + 2*a*c)
    end
    return fx
end
```

`CausalityTools.ConditionalEntropy`

— Type

The supertype for all conditional entropies.

`CausalityTools.ConditionalEntropyDefinition`

— Type

The supertype for all conditional entropy definitions.

`CausalityTools.ConditionalMutualInformation`

— Type

```
ConditionalMutualInformation <: AssociationMeasure
CMI # alias
```

The supertype of all conditional mutual information measures. Concrete subtypes are

`CausalityTools.ConditionalMutualInformationEstimator`

— Type

```
ConditionalMutualInformationEstimator <: InformationEstimator
CMIEstimator # alias
```

The supertype of all conditional mutual information estimators.

**Subtypes**

`CausalityTools.Contingency`

— Type

```
Contingency <: ProbabilitiesEstimator
Contingency(est::Union{ProbabilitiesEstimator, Nothing} = nothing)
```

`Contingency` is a probabilities estimator that transforms input data to a multidimensional probability mass function (internally represented as a `ContingencyMatrix`).

It works directly on raw discrete/categorical data. Alternatively, if a `ProbabilitiesEstimator` `est` for which `marginal_encodings` is implemented is given, then input data are first discretized before creating the contingency matrix.

The `Contingency` estimator differs from other `ProbabilitiesEstimator`s in that it's not compatible with `probabilities` and other methods. Instead, it is used to construct a `ContingencyMatrix`, from which probabilities can be computed.

`CausalityTools.ContingencyMatrix`

— Type

```
ContingencyMatrix{T, N} <: Probabilities{T, N}
ContingencyMatrix(frequencies::AbstractArray{Int, N})
```

A contingency matrix is essentially a multivariate analogue of `Probabilities` that also keeps track of raw frequencies.

The contingency matrix can be constructed directly from an `N`-dimensional `frequencies` array. Alternatively, the `contingency_matrix` function performs the counting for you; this works both on raw categorical data, and by first discretizing the data using a `ProbabilitiesEstimator`.

**Description**

A `ContingencyMatrix` `c` is just a simple wrapper around an `AbstractArray{T, N}`. Indexing `c` with multiple indices `i, j, …` returns the `(i, j, …)`th element of the empirical probability mass function (pmf). The following convenience methods are defined:

- `frequencies(c; dims)` returns the multivariate raw counts along the given `dims` (defaults to all available dimensions).
- `probabilities(c; dims)` returns a multidimensional empirical probability mass function (pmf) along the given `dims` (defaults to all available dimensions), i.e. the normalized counts.
- `probabilities(c, i::Int)` returns the marginal probabilities for the `i`-th dimension.
- `outcomes(c, i::Int)` returns the marginal outcomes for the `i`-th dimension.

**Ordering**

The ordering of outcomes is internally consistent, but we make no promise about the ordering of outcomes relative to the input data. This means that if your input data are `x = rand(["yes", "no"], 100); y = rand(["small", "medium", "large"], 100)`, you'll get a 2-by-3 contingency matrix, but there is currently no easy way to determine which outcome the `(i, j)`-th row/column of this matrix corresponds to.

Since `ContingencyMatrix` is intended for use in information-theoretic methods that don't care about ordering, as long as the ordering is internally consistent, this is not an issue for practical applications in this package. This may change in future releases.
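A minimal sketch of the counting that underlies the 2-by-3 example above (hand-rolled; the `contingency_matrix` function does this for you, with its own internal outcome ordering):

```
x = rand(["yes", "no"], 100)
y = rand(["small", "medium", "large"], 100)

ux, uy = unique(x), unique(y)   # outcome orderings (arbitrary relative to data!)
freqs = [count(i -> x[i] == cx && y[i] == cy, eachindex(x))
         for cx in ux, cy in uy]            # 2×3 raw count matrix

pmf = freqs ./ sum(freqs)                   # empirical joint pmf
```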

**Usage**

Contingency matrices are used in the computation of discrete versions of the following quantities:

`CausalityTools.ContinuousDefinition`

— Type

`ContinuousDefinition <: SystemDefinition`

The supertype of all continuous system definitions.

`CausalityTools.ConvergentCrossMapping`

— Type

```
ConvergentCrossMapping <: CrossmapMeasure
ConvergentCrossMapping(; d::Int = 2, τ::Int = -1, w::Int = 0,
f = Statistics.cor, embed_warn = true)
```

The convergent cross mapping (CCM) measure [Sugihara2012].

Specifies the embedding dimension `d` and embedding lag `τ` to be used, as described below, with `predict` or `crossmap`. The Theiler window `w` controls how many temporal neighbors are excluded during neighbor searches (`w = 0` means that only the point itself is excluded). `f` is a function that computes the agreement between observations and predictions (the default, `f = Statistics.cor`, gives the Pearson correlation coefficient).

**Embedding**

Let `S(i)` be the source time series variable and `T(i)` be the target time series variable. This version produces regular embeddings with fixed dimension `d` and embedding lag `τ` as follows:

\[( S(i), S(i+\tau), S(i+2\tau), \ldots, S(i+(d-1)\tau), T(i))_{i=1}^{N-(d-1)\tau}.\]

In this joint embedding, neighbor searches are performed in the subspace spanned by the first `d` variables, while the last variable is the one to be predicted.

With this convention, `τ < 0` implies "past/present values of source used to predict target", and `τ > 0` implies "future/present values of source used to predict target". The latter case may not be meaningful for many applications, so by default, a warning will be given if `τ > 0` (`embed_warn = false` turns off warnings).
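As a sketch of this embedding convention (not the internal implementation), the joint vectors for `d = 2, τ = -1` can be constructed by hand:

```
S, T = rand(100), rand(100)
d, τ = 2, -1

# With τ < 0, S(i + τ) is a past value, so valid indices start at 1 - (d-1)τ.
i0 = 1 - (d - 1) * τ                      # = 2 here
E = [(S[i], S[i + τ], T[i]) for i in i0:length(S)]
# Neighbor searches use the first d coordinates (S(i), S(i+τ));
# the last coordinate T(i) is the one being predicted.
```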

`CausalityTools.Copula`

— Type

```
Copula <: MutualInformationEstimator
Copula(; est = Kraskov(k = 5), exact = false)
```

A non-parametric copula-based mutual information estimator.

It is typically many times faster to compute mutual information using `Copula` than with other `MutualInformationEstimator`s, `DifferentialEntropyEstimator`s, or `ProbabilitiesEstimator`s, because `Copula` only needs to compute the entropy of a single (multivariate) variable, whereas the other methods explicitly compute the entropies of several variables.

If `exact == true`, then the exact empirical cumulative distribution function (ECDF) is used to compute the empirical copula. If `exact == false`, then a fast sorting-based approximation to the exact ECDF is computed instead (this breaks ties arbitrarily, so be careful when applying it to categorical/integer-valued data).

**Description**

Assume we have a `Dx`-dimensional input `StateSpaceSet` `x` and a `Dy`-dimensional input `StateSpaceSet` `y`, both containing `N` observations. We can define the `Dx + Dy`-dimensional joint StateSpaceSet `D = [x y]`. `Copula` returns the negative *copula entropy* of `D`, which is equal to the mutual information between `x` and `y` [Ma2011; Pál2010].
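A sketch of the sorting-based marginal transform behind the `exact = false` path (rank-based approximate ECDF; ties are broken arbitrarily):

```
# Map a sample to approximately Uniform(0, 1) values via its ranks.
function empirical_copula_transform(x::AbstractVector)
    N = length(x)
    ranks = invperm(sortperm(x))   # rank of each observation; ties arbitrary
    return ranks ./ N              # approximate ECDF evaluated at each xᵢ
end

u = empirical_copula_transform(randn(1000))
```

Applying this transform to each marginal of the joint data yields the empirical copula sample whose (negative) entropy gives the mutual information estimate.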

`CausalityTools.CorrTest`

— Type

```
CorrTest <: IndependenceTest
CorrTest()
```

An independence test based on correlation (for two variables) and partial correlation (for three variables) [Levy1978], as described in [Schmidt2018].

Uses `PearsonCorrelation` and `PartialCorrelation` internally.

Assumes that the input data are (multivariate) normally distributed. Then `ρ(X, Y) = 0` implies `X ⫫ Y`, and `ρ(X, Y | 𝐙) = 0` implies `X ⫫ Y | 𝐙`.

**Description**

The null hypothesis is `H₀ := ρ(X, Y | 𝐙) = 0`. We use the approach of Levy & Narula (1978)[Levy1978] and compute the Z-transformation of the observed (partial) correlation coefficient $\hat{\rho}_{XY|\bf{Z}}$:

\[Z(\hat{\rho}_{XY|\bf{Z}}) = \log\dfrac{1 + \hat{\rho}_{XY|\bf{Z}}}{1 - \hat{\rho}_{XY|\bf{Z}}}.\]

To test the null hypothesis against the alternative hypothesis `H₁ := ρ(X, Y | 𝐙) > 0`, calculate

\[\hat{Z} = \dfrac{1}{2}\dfrac{Z(\hat{\rho}_{XY|\bf{Z}}) - Z(0)}{\sqrt{1/(n - d - 3)}},\]

and compute the two-sided p-value (Schmidt et al., 2018)

\[p(X, Y | \bf{Z}) = 2\left(1 - \Phi\left(\sqrt{n - d - 3}\,\dfrac{Z(\hat{\rho}_{XY|\bf{Z}})}{2}\right)\right),\]

where $d$ is the dimension of $\bf{Z}$ and $n$ is the number of samples. For the pairwise case, the procedure is identical, but set $\bf{Z} = \emptyset$.

**Examples**
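A hedged sketch of the computation above (the helper names are hypothetical, and the standard normal CDF is built from `SpecialFunctions.erfc`; this is not the package's internal code):

```
using SpecialFunctions: erfc

Φ(z) = erfc(-z / sqrt(2)) / 2              # standard normal CDF

# Two-sided p-value for an observed (partial) correlation ρ̂, with n samples
# and conditioning dimension d (set d = 0 for the pairwise case).
function corrtest_pvalue(ρ̂, n, d)
    Z = log((1 + ρ̂) / (1 - ρ̂))                 # Z(ρ̂), as defined above
    Ẑ = 0.5 * (Z - 0) / sqrt(1 / (n - d - 3))  # Ẑ, using Z(0) = 0
    return 2 * (1 - Φ(abs(Ẑ)))
end

corrtest_pvalue(0.2, 500, 1)
```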

`CausalityTools.CorrTestResult`

— Type

`CorrTestResult(pvalue, ρ, z)`

A simple struct that holds the results of a `CorrTest` test: the (partial) correlation coefficient `ρ`, Fisher's `z`, and `pvalue`, the two-sided p-value for the test.

`CausalityTools.CrossmapEstimator`

— Type

`CrossmapEstimator{LIBSIZES, RNG}`

A parametric supertype for all cross-map estimators, which are used with `predict` and `crossmap`.

Because the type of the library may differ between estimators, and because RNGs from different packages may be used, subtypes must implement the `LIBSIZES` and `RNG` type parameters.

For efficiency purposes, subtypes may contain mutable containers that can be re-used for ensemble analysis (see `Ensemble`).

**Libraries**

A cross-map estimator uses the concept of "libraries". A library is essentially just a reference to a set of points, and usually, a library refers to *indices* of points, not the actual points themselves.

For example, for timeseries, `RandomVectors(libsizes = 50:25:100)` produces three separate libraries, where the first contains 50 randomly selected time indices, the second contains 75 randomly selected time indices, and the third contains 100 randomly selected time indices. This of course assumes that all quantities involved can be indexed using the same time indices, meaning that the concept of "library" only makes sense *after* relevant quantities have been *jointly* embedded, so that they can be jointly indexed. For non-instantaneous prediction, the maximum possible library size shrinks with the magnitude of the index/time-offset for the prediction.

For spatial analyses (not yet implemented), indices could be more complex and involve multi-indices.

`CausalityTools.CrossmapMeasure`

— Type

`CrossmapMeasure <: AssociationMeasure`

The supertype for all cross-map measures. Concrete subtypes are:

- `ConvergentCrossMapping`, or `CCM` for short.
- `PairwiseAsymmetricInference`, or `PAI` for short.

`CausalityTools.Definition`

— Type

A generic supertype for definitions of information measures (one measure may have multiple definitions).

`CausalityTools.DiksFang`

— Type

```
DiksFang <: BandwidthRule
DiksFang(c = 4.8)
```

A rule of thumb giving the bandwidth $h$ for kernel density estimation of entropy as $h = cn^{-\dfrac{2}{7}}$, where $n$ is the number of data points (Diks & Fang, 2017)[DiksFang2017].

[DiksFang2017]: Diks, C., & Fang, H. (2017). Detecting Granger Causality with a Nonparametric Information-Based Statistic (No. 17-03). CeNDEF Working Paper.

`CausalityTools.DiscreteDefinition`

— Type

`DiscreteDefinition <: SystemDefinition`

The supertype of all discrete system definitions.

`CausalityTools.DistanceCorrelation`

— Type

`DistanceCorrelation`

The distance correlation (Székely et al., 2007)[Székely2007] measure quantifies potentially nonlinear associations between pairs of variables. If applied to three variables, the partial distance correlation (Székely and Rizzo, 2014)[Székely2014] is computed.

**Usage**

- Use with `independence` to perform a formal hypothesis test for pairwise dependence.
- Use with `distance_correlation` to compute the raw distance correlation coefficient.

A partial distance correlation `distance_correlation(X, Y, Z) = 0` doesn't always guarantee conditional independence `X ⫫ Y | Z`. See Székely and Rizzo (2014) for an in-depth discussion.

`CausalityTools.EmbeddingTE`

— Type

```
EmbeddingTE(; dS = 1, dT = 1, dTf = 1, dC = 1, τS = -1, τT = -1, ηTf = 1, τC = -1)
EmbeddingTE(opt::OptimiseTraditional, s, t, [c])
```

`EmbeddingTE` provides embedding parameters for transfer entropy analysis using either `TEShannon`, `TERenyi`, or in general any subtype of `TransferEntropy`, which in turn dictates the embedding used with `transferentropy`.

The second method finds parameters using the "traditional" optimised embedding techniques from DynamicalSystems.jl.

**Convention for generalized delay reconstruction**

We use the following convention. Let $s(i)$ be the time series for the source variable, $t(i)$ the time series for the target variable, and $c(i)$ the time series for the conditional variable. To compute transfer entropy, we need the following marginals:

\[\begin{aligned} T^{+} &= \{ t(i+\eta^1), t(i+\eta^2), \ldots, t(i+\eta^{d_{T^{+}}}) \} \\ T^{-} &= \{ t(i+\tau^0_{T}), t(i+\tau^1_{T}), t(i+\tau^2_{T}), \ldots, t(i + \tau^{d_{T} - 1}_{T}) \} \\ S^{-} &= \{ s(i+\tau^0_{S}), s(i+\tau^1_{S}), s(i+\tau^2_{S}), \ldots, s(i + \tau^{d_{S} - 1}_{S}) \} \\ C^{-} &= \{ c(i+\tau^0_{C}), c(i+\tau^1_{C}), c(i+\tau^2_{C}), \ldots, c(i + \tau^{d_{C} - 1}_{C}) \} \end{aligned}\]

Depending on the application, the delay reconstruction lags $\tau^k_{T} \leq 0$, $\tau^k_{S} \leq 0$, and $\tau^k_{C} \leq 0$ may be equally spaced or non-equally spaced. The same applies to the prediction lag(s), but typically only a single prediction lag $\eta^k$ is used (so that $d_{T^{+}} = 1$).

For transfer entropy, traditionally at least one $\tau^k_{T}$, one $\tau^k_{S}$ and one $\tau^k_{C}$ equals zero. This way, the $T^{-}$, $S^{-}$ and $C^{-}$ marginals always contain present/past states, while the $T^{+}$ marginal contains future states relative to the other marginals. However, this is not a strict requirement, and modern approaches that search for optimal embeddings can return embeddings without the instantaneous lag.

Combined, we get the generalized delay reconstruction $\mathbb{E} = (T^{+}_{(d_{T^{+}})}, T^{-}_{(d_{T})}, S^{-}_{(d_{S})}, C^{-}_{(d_{C})})$. Transfer entropy is then computed as

\[\begin{aligned} TE_{S \rightarrow T | C} = \int_{\mathbb{E}} P(T^{+}, T^-, S^-, C^-) \log_{b}{\left(\frac{P(T^{+} | T^-, S^-, C^-)}{P(T^{+} | T^-, C^-)}\right)}, \end{aligned}\]

or, if conditionals are not relevant,

\[\begin{aligned} TE_{S \rightarrow T} = \int_{\mathbb{E}} P(T^{+}, T^-, S^-) \log_{b}{\left(\frac{P(T^{+} | T^-, S^-)}{P(T^{+} | T^-)}\right)}, \end{aligned}\]

Here,

- $T^{+}$ denotes the $d_{T^{+}}$-dimensional set of vectors furnishing the future states of $T$ (almost always equal to 1 in practical applications),
- $T^{-}$ denotes the $d_{T}$-dimensional set of vectors furnishing the past and present states of $T$,
- $S^{-}$ denotes the $d_{S}$-dimensional set of vectors furnishing the past and present of $S$, and
- $C^{-}$ denotes the $d_{C}$-dimensional set of vectors furnishing the past and present of $C$.

**Keyword arguments**

- `dS`, `dT`, `dC`, `dTf` (`f` for *future*) are the dimensions of the $S^{-}$, $T^{-}$, $C^{-}$ and $T^{+}$ marginals. Each of `dS`, `dT`, `dC` and `dTf` must be a *positive* integer.
- `τS`, `τT`, `τC` are the embedding lags for $S^{-}$, $T^{-}$, $C^{-}$. Each parameter is an integer `∈ 𝒩⁰⁻`, or a vector of integers `∈ 𝒩⁰⁻`, so that $S^{-}$, $T^{-}$, $C^{-}$ always represent present/past values. If e.g. `τT` is an integer, then the $T^-$ marginal is constructed using lags $\tau_{T} = \{0, \tau, 2\tau, \ldots, (d_{T}- 1)\tau_T \}$. If it is a vector, e.g. `τT = [-1, -5, -7]`, then the dimension `dT` must match the number of lags, and precisely those lags are used: $\tau_{T} = \{-1, -5, -7 \}$.
- The prediction lag(s) `ηTf` is a positive integer. Combined with the requirement that the other delay parameters are zero or negative, this ensures that we're always predicting from past/present to future. In typical applications, `ηTf = 1` is used for transfer entropy.

**Examples**

Say we wanted to compute the Shannon transfer entropy $TE^S(S \to T) = I^S(T^+; S^- | T^-)$. Using some modern procedure for determining optimal embedding parameters with methods from DynamicalSystems.jl, we find that the optimal embedding of $T^{-}$ is three-dimensional and is given by the lags `[0, -5, -8]`. Using the same procedure, we find that the optimal embedding of $S^{-}$ is two-dimensional with lags $[-1, -8]$. We want to predict a univariate version of the target variable one time step into the future (`ηTf = 1`). The total embedding is then the set of embedding vectors $E_{TE} = \{ (T(i+1), S(i-1), S(i-8), T(i), T(i-5), T(i-8)) \}$. Translating this to code, we get:

```
using CausalityTools
julia> EmbeddingTE(dT=3, τT=[0, -5, -8], dS=2, τS=[-1, -8], ηTf=1)
# output
EmbeddingTE(dS=2, dT=3, dC=1, dTf=1, τS=[-1, -8], τT=[0, -5, -8], τC=-1, ηTf=1)
```

`CausalityTools.Ensemble`

— Type

`Ensemble(; measure::CrossmapMeasure, est::CrossmapEstimator, nreps::Int = 100)`

A directive to compute an ensemble analysis, where `measure` (e.g. `ConvergentCrossMapping`) is computed using the given estimator `est` (e.g. `RandomVectors`).

`CausalityTools.ExpandingSegment`

— Type

```
ExpandingSegment <: CrossmapEstimator
ExpandingSegment(; libsizes::Int, rng = Random.default_rng())
```

Indicates that cross mapping is performed on a contiguous time series segment/window, starting from the first available data point up to the `L`th data point.

If used in an ensemble setting, the estimator is applied to time indices `Lmin:step:Lmax` of the joint embedding.

`CausalityTools.FPVP`

— Type

```
FPVP <: ConditionalMutualInformationEstimator
FPVP(k = 1, w = 0)
```

The Frenzel-Pompe-Vejmelka-Paluš (or `FPVP` for short) estimator is used to estimate the differential conditional mutual information using a `k`-th nearest neighbor approach that is analogous to that of the `KraskovStögbauerGrassberger1` mutual information estimator [Frenzel2007; Vejmelka2008].

`w` is the Theiler window, which controls the number of temporal neighbors that are excluded during neighbor searches.

`CausalityTools.GaoKannanOhViswanath`

— Type

```
GaoKannanOhViswanath <: MutualInformationEstimator
GaoKannanOhViswanath(; k = 1, w = 0)
```

The `GaoKannanOhViswanath` (Shannon) estimator is designed for estimating mutual information between variables that may be either discrete, continuous, or a mixture of both [GaoKannanOhViswanath2017].

Even though the `GaoKannanOhViswanath` estimator is designed to handle discrete data, our implementation demands that all input data are `StateSpaceSet`s whose data points are floats. If you have discrete data, such as strings or symbols, encode them using integers and convert those integers to floats before passing them to `mutualinfo`.

**Description**

The estimator starts by expressing mutual information in terms of the Radon-Nikodym derivative, and then estimates these derivatives using `k`-nearest neighbor distances from empirical samples.

The estimator avoids the common issue of having to add noise to data before analysis due to tied points, which may bias other estimators. Citing their paper, the estimator *"strongly outperforms natural baselines of discretizing the mixed random variables (by quantization) or making it continuous by adding a small Gaussian noise."*

In [GaoKannanOhViswanath2017], the authors claim (roughly speaking) that the estimator reduces to the `KraskovStögbauerGrassberger1` estimator for continuous-valued data. However, `KraskovStögbauerGrassberger1` uses the digamma function, while `GaoKannanOhViswanath` uses the logarithm instead, so the estimators are not exactly equivalent for continuous data.

Moreover, in their algorithm 1, it is clearly not the case that the method falls back on the `KSG1` approach. The `KSG1` estimator uses `k`-th neighbor distances in the *joint* space, while the `GaoKannanOhViswanath` algorithm selects the maximum `k`-th nearest distances among the two marginal spaces, which are in general not the same as the `k`-th neighbor distance in the joint space (unless both marginals are univariate). Therefore, our implementation here differs slightly from algorithm 1 in [GaoKannanOhViswanath2017]. We have modified it in a way that mimics `KraskovStögbauerGrassberger1` for continuous data. Note that because of using the `log` function instead of `digamma`, there will be slight differences between the methods. See the source code for more details.

See also: `mutualinfo`.

`CausalityTools.GaoOhViswanath`

— Type

`GaoOhViswanath <: MutualInformationEstimator`

The `GaoOhViswanath` mutual information estimator, also called the bias-improved-KSG estimator (BI-KSG) [Gao2018], is given by

\[\begin{align*} \hat{I}_{GAO}(X, Y) &= \hat{H}_{KSG}(X) + \hat{H}_{KSG}(Y) - \hat{H}_{KZL}(X, Y) \\ &= \psi{(k)} + \log{(N)} + \log{ \left( \dfrac{c_{d_{x}, 2} c_{d_{y}, 2}}{c_{d_{x} + d_{y}, 2}} \right) } - \dfrac{1}{N} \sum_{i=1}^N \left( \log{(n_{x, i, 2})} + \log{(n_{y, i, 2})} \right), \end{align*}\]

where $c_{d, 2} = \dfrac{\pi^{\frac{d}{2}}}{\Gamma{(\dfrac{d}{2} + 1)}}$ is the volume of a $d$-dimensional unit $\ell_2$-ball.

`CausalityTools.GaussianCMI`

— Type

```
GaussianCMI <: MutualInformationEstimator
GaussianCMI(; normalize::Bool = false)
```

`GaussianCMI` is a parametric estimator for Shannon conditional mutual information (CMI) [Vejmelka2008].

**Description**

`GaussianCMI` estimates Shannon CMI through a sum of two mutual information terms that are each estimated using `GaussianMI` (the `normalize` keyword is the same as for `GaussianMI`):

\[\hat{I}_{Gaussian}(X; Y | Z) = \hat{I}_{Gaussian}(X; Y, Z) - \hat{I}_{Gaussian}(X; Z)\]

`CausalityTools.GaussianMI`

— Type

```
GaussianMI <: MutualInformationEstimator
GaussianMI(; normalize::Bool = false)
```

`GaussianMI` is a parametric estimator for Shannon mutual information.

**Description**

Given $d_x$-dimensional and $d_y$-dimensional input data `X` and `Y`, `GaussianMI` first constructs the $d_x + d_y$-dimensional joint `StateSpaceSet` `XY`. If `normalize == true`, then we follow the approach in Vejmelka & Paluš (2008)[Vejmelka2008] and transform each column in `XY` to have zero mean and unit standard deviation. If `normalize == false`, then the algorithm proceeds without normalization.

Next, $\Sigma$, the correlation matrix for the (normalized) joint data `XY`, is computed. `GaussianMI` assumes that the input variables are distributed according to normal distributions with zero means and unit standard deviations.

The mutual information (for `normalize == false`) is then estimated as

\[\hat{I}^S_{Gaussian}(X; Y) = \dfrac{1}{2} \log \left( \dfrac{ \det(\Sigma_X) \det(\Sigma_Y) }{\det(\Sigma)} \right),\]

where $\Sigma_X$ and $\Sigma_Y$ appear in $\Sigma$ as

\[\Sigma = \begin{bmatrix} \Sigma_{X} & \Sigma^{'}\\ \Sigma^{'} & \Sigma_{Y} \end{bmatrix}.\]

If `normalize == true`, then the mutual information is estimated as

\[\hat{I}^S_{Gaussian}(X; Y) = -\dfrac{1}{2} \sum_{i = 1}^{d_x + d_y} \log(\sigma_i),\]

where $\sigma_i$ are the eigenvalues for $\Sigma$.
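For univariate `X` and `Y`, the determinant form is easy to sketch by hand (assuming already-normalized data; nats here, so divide by `log(2)` for bits):

```
using LinearAlgebra, Statistics

x = randn(2000)
y = 0.8 .* x .+ 0.6 .* randn(2000)   # introduce linear dependence

Σ = cor(hcat(x, y))                  # 2×2 joint correlation matrix
Σx, Σy = Σ[1:1, 1:1], Σ[2:2, 2:2]    # marginal blocks (scalars here)

I_gauss = 0.5 * log(det(Σx) * det(Σy) / det(Σ))
```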

`CausalityTools.GenericKernel`

— Type

```
GenericKernel <: DifferentialEntropyEstimator
GenericKernel(bandwidth = Silverman(), kernel::MultivariateKernel = NormalIsotropic())
```

A generic, multivariate plug-in estimator for entropies based on kernel density estimation (KDE) that can in principle be used to compute any differential entropy.

Data should be standardized to zero mean and unit variance before applying `GenericKernel`.

The `bandwidth` may be set manually, or to one of the rules of thumb listed below.

**Description**

Assume we have samples $\{x_1, x_2, \ldots, x_N \}$ from a continuous random variable $X \in \mathbb{R}^d$ with support $\mathcal{X}$ and density function $f : \mathbb{R}^d \to \mathbb{R}$.

`GenericKernel` estimates, for each $x_i$ in the sample, the point-wise densities $\hat{f}(x_i)$ using the given `kernel` and `bandwidth`, then computes a resubstitution estimate for the entropy. We support the following resubstitution estimates.

**Shannon differential entropy**

\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))]\]

is estimated by replacing the expectation with the sample average [Diks2017]:

\[\hat{H}(X) = -\dfrac{1}{N}\sum_{i = 1}^N \log \hat{f}(x_i).\]
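A one-dimensional sketch of this resubstitution idea with an isotropic Gaussian kernel (the bandwidth rule here is a hypothetical placeholder, not one of the package's `BandwidthRule`s):

```
x = randn(500)                      # standardized sample
N = length(x)
h = 1.06 * N^(-1/5)                 # hypothetical rule-of-thumb bandwidth

# Pointwise Gaussian-KDE density estimates f̂(xᵢ).
f̂(xi) = sum(exp(-((xi - xj) / h)^2 / 2) for xj in x) / (N * h * sqrt(2π))

Ĥ = -sum(log(f̂(xi)) for xi in x) / N    # resubstitution entropy (nats)
```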

**Compatible kernels**

**Bandwidth rules of thumb**

Diks2017. Diks, C., & Fang, H. (2017). Transfer entropy for nonparametric granger causality detection: an evaluation of different resampling methods. Entropy, 19(7), 372.

`CausalityTools.GraphAlgorithm`

— Type

`GraphAlgorithm`

The supertype of all causal graph inference algorithms.

**Concrete implementations**

- `OCE`: The optimal causation entropy algorithm for time series graphs.

`CausalityTools.HMeasure`

— Type

```
HMeasure <: AssociationMeasure
HMeasure(; K::Int = 2, dx = 2, dy = 2, τx = - 1, τy = -1, w = 0)
```

The `HMeasure` [Arnhold1999] is a pairwise association measure. It quantifies the probability with which close states of a target timeseries/embedding are mapped to close states of a source timeseries/embedding.

Note that `τx` and `τy` are negative by convention. See the docstring for `SMeasure` for an explanation.

**Usage**

- Use with `independence` to perform a formal hypothesis test for directional dependence.
- Use with `h_measure` to compute the raw h-measure statistic.

**Description**

The `HMeasure` [Arnhold1999] is similar to the `SMeasure`, but the numerator of the formula is replaced by $R_i(x)$, the mean squared Euclidean distance to *all other points*, and there is a $\log$-term inside the sum:

\[H^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \log \left( \dfrac{R_i(x)}{R_i^{(k)}(x|y)} \right).\]

Parameters are the same, and $R_i^{(k)}(x|y)$ is computed as for `SMeasure`.

`CausalityTools.Henon2`

— Type

```
Henon2() <: DiscreteDefinition
Henon2(;u₀ = [0.1, 0.2, 0.2, 0.3], c_xy = 2.0)
```

A bivariate system consisting of two identical 1D Henon maps with unidirectional forcing $X \to Y$ (Krakovská et al., 2018)[Krakovská2018].

**Equations of motion**

The equations of motion are

\[\begin{align*} x_1(t+1) &= 1.4 - x_1^2(t) + 0.3x_2(t) \\ x_2(t+1) &= x_1(t) \\ y_1(t+1) &= 1.4 - [c_{xy} x_1(t) y_1(t) + (1-c_{xy}) y_1^2(t)] + 0.3 y_2(t) \\ y_2(t+1) &= y_1(t) \end{align*}\]

`CausalityTools.Henon3`

— Type

```
Henon3() <: DiscreteDefinition
Henon3(; a = 0.1, b = 0.3, c = 0.1, xi = [0.1, 0.2, 0.3])
```

`Henon3` is a `DiscreteDefinition` for a lagged discrete dynamical system consisting of three coupled 1D Henon maps [Papana2013].

**Equations of motion**

\[\begin{align*} x_1(t+1) &= a - x_1(t)^2 + b x_1(t-2) \\ x_2(t+1) &= a - c x_1(t) x_2(t)- (1 - c) x_2(t)^2 + b x_2(t-1) \\ x_3(t+1) &= c x_2(t) x_3(t) - (1 - c) x_3(t)^2 + b x_3(t-1) \end{align*}\]

Here $c$ is the coupling constant. The system becomes completely synchronized for $c \geq 0.7$. The initial condition `xi` is repeated over the first two time steps before iteration starts.

`CausalityTools.Hilbert`

— Type

```
Hilbert(est;
    source::InstantaneousSignalProperty = Phase(),
    target::InstantaneousSignalProperty = Phase(),
    cond::InstantaneousSignalProperty = Phase()) <: TransferDifferentialEntropyEstimator
```

Compute transfer entropy on instantaneous phases/amplitudes of relevant signals, which are obtained by first applying the Hilbert transform to each signal, then extracting the phases/amplitudes of the resulting complex numbers [Palus2014]. Original time series are thus transformed to instantaneous phase/amplitude time series. Transfer entropy is then estimated using the provided `est` on those phases/amplitudes (use e.g. `VisitationFrequency` or `SymbolicPermutation`).

Details on estimating the transfer entropy (conditional mutual information) following the phase/amplitude extraction step are not given in Palus (2014). Here, after the instantaneous phases/amplitudes have been obtained, they are treated as regular time series, from which transfer entropy is computed as usual.

`CausalityTools.HindmarshRose3`

— Type

```
HindmarshRose3 <: ContinuousDefinition
HindmarshRose3(; xi = [0.1, 0.2, 0.3], p)
```

Initialise a Hindmarsh-Rose system, which is a model of neuronal spiking.

**Description**

\[\begin{align*} \dfrac{dx}{dt} &= y + \phi(x) - z + I \\ \dfrac{dy}{dt} &= \psi(x) - y \\ \dfrac{dz}{dt} &= r[s(x - x_R) - z], \end{align*}\]

where

\[\begin{aligned} \phi(x) &= -ax^3+bx^2 \\ \psi(x) &= c - dx^2 \end{aligned}\]

If parameters other than the defaults are to be used, they must be provided as a vector `[a, b, c, d, r, s, xᵣ, I]`.

`CausalityTools.Ikeda2`

— Type

```
Ikeda2 <: DiscreteDefinition
Ikeda2(; xi = [0.19, 0.21], c_xy = 1.0, c_yx = 1.0, a = 0.8, b = 12, c = 0.9,
r₁ = 0.2, r₂ = 0.15, σ = 0.05, rng = Random.default_rng())
```

Initialise a discrete two-dimensional Ikeda map system, adapted from [Cao1997] by adding a noise term and allowing the influences from $x \to y$ ($c_{xy}$) and from $y \to x$ ($c_{yx}$) to be adjusted.

The difference equations are

\[\begin{align*} x(t+1) &= 1 + \mu(x \cos{(\theta)} - c_{yx} y \sin{(\theta)}) - \min\left( \dfrac{\sigma \xi_{t}^{(1)}}{(1-x)}, \xi_{t}^{(2)}\right) \\ y(t+1) &= \mu(y \cos{(\theta)} - c_{xy} x \sin{(\theta)}) - \min\left(\dfrac{\sigma \zeta_{t}^{(1)}}{(1-y)}, \zeta_{t}^{(2)}\right) \end{align*}\]

`CausalityTools.IndependenceTest`

— Type

`IndependenceTest`

The supertype for all independence tests.

`CausalityTools.InformationMeasure`

— Type

`InformationMeasure <: AssociationMeasure`

The supertype for all definitions of information-based measures.

**Why use definitions?**

Several information measures, such as mutual information, come in several forms depending on what type of generalized entropy they are defined with respect to. For example, there are at least three forms of Rényi mutual informations.

In CausalityTools.jl, each unique variant of a measure is a subtype of `InformationMeasure`. For example, `MITsallisFuruichi` gives the formula for Furuichi (2006)'s Tsallis-based mutual information.

**Implemented measures**

**Mutual informations**

- `MIShannon`: Discrete Shannon mutual information.
- `MITsallisFuruichi`: Discrete Tsallis mutual information, as defined by Furuichi (2006).

**Conditional mutual information (CMI)**

- `CMIRenyiSarbu`: Discrete Rényi CMI.

`CausalityTools.InformationMeasureEstimator`

— Type

The supertype for all estimators of information-based measures.

`CausalityTools.JDDTestResult`

— Type

`JDDTestResult(Δjdd, hypothetical_μ, pvalue)`

Holds the results of `JointDistanceDistributionTest`. `Δjdd` is the `Δ`-distribution, `hypothetical_μ` is the hypothetical mean of the `Δ`-distribution under the null, and `pvalue` is the p-value for the one-sided t-test.

`CausalityTools.JointDistanceDistribution`

— Type

```
JointDistanceDistribution <: AssociationMeasure
JointDistanceDistribution(; metric = Euclidean(), B = 10, D = 2, τ = -1, μ = 0.0)
```

The joint distance distribution (JDD) measure [Amigo2018].

**Usage**

- Use with `independence` to perform a formal hypothesis test for directional dependence.
- Use with `jdd` to compute the joint distance distribution `Δ` from [Amigo2018].

**Keyword arguments**

- `distance_metric::Metric`: An instance of a valid distance metric from `Distances.jl`. Defaults to `Euclidean()`.
- `B::Int`: The number of equidistant subintervals to divide the interval `[0, 1]` into when comparing the normalised distances.
- `D::Int`: Embedding dimension.
- `τ::Int`: Embedding delay. By convention, `τ` is negative.
- `μ`: The hypothetical mean value of the joint distance distribution if there is no coupling between `x` and `y` (default is `μ = 0.0`).

**Description**

From input time series $x(t)$ and $y(t)$, we first construct the delay embeddings (note the positive sign in the embedding lags; the input parameter `τ` is therefore by convention negative).

\[\begin{align*} \{\bf{x}_i \} &= \{(x_i, x_{i+\tau}, \ldots, x_{i+(d_x - 1)\tau}) \} \\ \{\bf{y}_i \} &= \{(y_i, y_{i+\tau}, \ldots, y_{i+(d_y - 1)\tau}) \} \\ \end{align*}\]

The algorithm then proceeds to analyze the distribution of distances between points of these embeddings, as described in [Amigo2018].

**Examples**

`CausalityTools.JointDistanceDistributionTest`

— Type

```
JointDistanceDistributionTest <: IndependenceTest
JointDistanceDistributionTest(measure::JointDistanceDistribution; rng = Random.default_rng())
```

An independence test for two variables based on the `JointDistanceDistribution` [Amigo2018].

When used with `independence`, a `JDDTestResult` is returned.

**Description**

The joint distance distribution (labelled `Δ` in their paper) is used by Amigó & Hirata (2018) to detect directional couplings of the form $X \to Y$ or $Y \to X$. `JointDistanceDistributionTest` formulates their method as an independence test.

Formally, we test the hypothesis $H_0$ (the variables are independent) against $H_1$ (there is directional coupling between the variables). To do so, we use a right-sided/upper-tailed t-test to check whether the mean of `Δ` is skewed towards positive values, i.e.

- $H_0 := \mu(\Delta) = 0$
- $H_1 := \mu(\Delta) > 0$.

When used with `independence`, a `JDDTestResult` is returned, which contains the joint distance distribution and a p-value. If you only need `Δ`, use `jdd` directly.

**Examples**

This example shows how the `JointDistanceDistributionTest` can be used in practice.
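A minimal sketch of such a session, assuming the constructor signature shown above:

```
using CausalityTools, Random

rng = Xoshiro(1234)
x = randn(rng, 1000)
y = 0.5 .* circshift(x, 1) .+ 0.1 .* randn(rng, 1000)  # y driven by past of x

measure = JointDistanceDistribution(D = 3, B = 5)
test = JointDistanceDistributionTest(measure)
independence(test, x, y)    # returns a JDDTestResult with a p-value
```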

`CausalityTools.JointEntropy`

— Type

The supertype of all joint entropy measures.

`CausalityTools.JointEntropyRenyi`

— Type

```
JointEntropyRenyi <: JointEntropy
JointEntropyRenyi(; base = 2, q = 1.5)
```

The Rényi joint entropy measure. See the docstring of `entropy_joint` for the definition.

`CausalityTools.JointEntropyShannon`

— Type

```
JointEntropyShannon <: JointEntropy
JointEntropyShannon(; base = 2)
```

The Shannon joint entropy measure. See the docstring of `entropy_joint` for the definition.

`CausalityTools.JointEntropyTsallis`

— Type

```
JointEntropyTsallis <: JointEntropy
JointEntropyTsallis(; base = 2, q = 1.5)
```

The Tsallis joint entropy measure. See the docstring of `entropy_joint` for the definition.

`CausalityTools.KraskovStögbauerGrassberger1`

— Type

```
KSG1 <: MutualInformationEstimator
KraskovStögbauerGrassberger1 <: MutualInformationEstimator
KraskovStögbauerGrassberger1(; k::Int = 1, w = 0, metric_marginals = Chebyshev())
```

The `KraskovStögbauerGrassberger1` mutual information estimator (you can use `KSG1` for short) is the $I^{(1)}$ `k`-th nearest neighbor estimator from [Kraskov2004].

**Keyword arguments**

- `k::Int`: The number of nearest neighbors to consider. Only information about the `k`-th nearest neighbor is actually used.
- `metric_marginals`: The distance metric for the marginals; this can be any metric from `Distances.jl`. Defaults to `metric_marginals = Chebyshev()`, which is the same as in Kraskov et al. (2004).
- `w::Int`: The Theiler window, which determines if temporal neighbors are excluded during neighbor searches in the joint space. Defaults to `0`, meaning that only the point itself is excluded.

**Description**

Let the joint StateSpaceSet $X := \{\bf{X}_1, \bf{X_2}, \ldots, \bf{X}_m \}$ be defined by the concatenation of the marginal StateSpaceSets $\{ \bf{X}_k \}_{k=1}^m$, where each $\bf{X}_k$ is potentially multivariate. Let $\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N$ be the points in the joint space $X$.

`CausalityTools.KraskovStögbauerGrassberger2`

— Type

```
KSG2 <: MutualInformationEstimator
KraskovStögbauerGrassberger2 <: MutualInformationEstimator
KraskovStögbauerGrassberger2(; k::Int = 1, w = 0, metric_marginals = Chebyshev())
```

The `KraskovStögbauerGrassberger2` mutual information estimator (you can use `KSG2` for short) is the $I^{(2)}$ `k`-th nearest neighbor estimator from [Kraskov2004].

**Keyword arguments**

- `k::Int`: The number of nearest neighbors to consider. Only information about the `k`-th nearest neighbor is actually used.
- `metric_marginals`: The distance metric for the marginals; this can be any metric from `Distances.jl`. Defaults to `metric_marginals = Chebyshev()`, which is the same as in Kraskov et al. (2004).
- `w::Int`: The Theiler window, which determines if temporal neighbors are excluded during neighbor searches in the joint space. Defaults to `0`, meaning that only the point itself is excluded.

**Description**

Let the joint StateSpaceSet $X := \{\bf{X}_1, \bf{X_2}, \ldots, \bf{X}_m \}$ be defined by the concatenation of the marginal StateSpaceSets $\{ \bf{X}_k \}_{k=1}^m$, where each $\bf{X}_k$ is potentially multivariate. Let $\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N$ be the points in the joint space $X$.

The `KraskovStögbauerGrassberger2` estimator first locates, for each $\bf{x}_i \in X$, the point $\bf{n}_i \in X$, the `k`-th nearest neighbor to $\bf{x}_i$, according to the maximum norm (`Chebyshev` metric). Let $\epsilon_i$ be the distance $d(\bf{x}_i, \bf{n}_i)$.

Consider $x_i^m \in \bf{X}_m$, the $i$-th point in the marginal space $\bf{X}_m$. For each $\bf{x}_i^m$, we determine $\theta_i^m$ := the number of points $\bf{x}_k^m \in \bf{X}_m$ that are a distance less than $\epsilon_i$ away from $\bf{x}_i^m$. That is, we use the distance from a query point $\bf{x}_i \in X$ (in the *joint* space) to count neighbors of $x_i^m \in \bf{X}_m$ (in the marginal space).

Mutual information between the variables $\bf{X}_1, \bf{X_2}, \ldots, \bf{X}_m$ is then estimated as

\[\hat{I}_{KSG2}(\bf{X}) = \psi{(k)} - \dfrac{m - 1}{k} + (m - 1)\psi{(N)} - \dfrac{1}{N} \sum_{i = 1}^N \sum_{j = 1}^m \psi{(\theta_i^j + 1)}\]

`CausalityTools.LMeasure`

— Type

```
LMeasure <: AssociationMeasure
LMeasure(; K::Int = 2, dx = 2, dy = 2, τx = - 1, τy = -1, w = 0)
```

The `LMeasure` [Chicharro2009] is a pairwise association measure. It quantifies the probability with which close states of a target timeseries/embedding are mapped to close states of a source timeseries/embedding.

Note that `τx` and `τy` are negative by convention. See the docstring for `SMeasure` for an explanation.

**Usage**

- Use with `independence` to perform a formal hypothesis test for directional dependence.
- Use with `l_measure` to compute the raw L-measure statistic.

**Description**

`LMeasure` is similar to `MMeasure`, but uses distance ranks instead of the raw distances.

Let $\bf{x_i}$ be an embedding vector, and let $g_{i,j}$ denote the rank of the distance between $\bf{x_i}$ and some other vector $\bf{x_j}$ in an ascending-ordered list of distances between $\bf{x_i}$ and all $\bf{x_{j \neq i}}$. In other words, $g_{i,j}$ is just the rank among the $N-1$ sorted nearest neighbor distances.

`LMeasure` is then defined as

\[L^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \log \left( \dfrac{G_i(x) - G_i^{(k)}(x|y)}{G_i(x) - G_i^k(x)} \right),\]

where $G_i(x) = \frac{N}{2}$ and $G_i^k(x) = \frac{k+1}{2}$ are the mean and minimal rank, respectively.

The $y$-conditioned mean rank is defined as

\[G_i^{(k)}(x|y) = \dfrac{1}{K}\sum_{j=1}^{K} g_{i,w_{i, j}},\]

where $w_{i,j}$ is the index of the $j$-th nearest neighbor of $\bf{y_i}$.

`CausalityTools.LaggedDiscreteDefinition`

— Type

`LaggedDiscreteDefinition <: SystemDefinition`

The supertype of definitions for discrete systems with lag larger than 1.

Why is this type needed? Ideally, an additional definition shouldn't be needed, because we should in principle be able to use `DiscreteDynamicalSystem` directly for all systems. However, `DiscreteDynamicalSystem` doesn't work for systems with memory beyond a single time lag. For example, autoregressive systems of order larger than one are not representable using `DiscreteDynamicalSystem`.

Concrete subtypes of `DiscreteDefinition` are *parameter containers* that are passed on to `DiscreteDynamicalSystem`. They allocate mutable containers that keep track of past states of state variables that require it. Use `system` to generate a `DiscreteDynamicalSystem` that can be used with `trajectory`.

**Implementation details**

Concrete implementations must fulfill the below criteria.

- Subtypes must implement a `past_states` field, which is a `SVector{N, MVector{L, Int}}`, where `N` is the number of variables. For type stability, `L` states are tracked for *all* variables, even though the maximum lag may only occur for one of the variables.
- The first type parameter of subtypes must be `P`, which keeps track of the type of `past_states`.

For an example, see the source code for `Peguin2`.

`CausalityTools.LeonenkoProzantoSavani`

— Type

```
LeonenkoProzantoSavani <: DifferentialEntropyEstimator
LeonenkoProzantoSavani(k = 1, w = 0)
```

The `LeonenkoProzantoSavani` estimator computes the `Shannon`, `Renyi`, or `Tsallis` `entropy` using the `k`-th nearest-neighbor approach from [LeonenkoProsantoSavani2008].

`w` is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to `0`, meaning that only the point itself is excluded when searching for neighbours).

`CausalityTools.Lindner`

— Type

```
Lindner <: TransferEntropyEstimator
Lindner(k = 1, w = 0, base = 2)
```

The `Lindner` transfer entropy estimator [Lindner2011], which is also used in the Trentool MATLAB toolbox, is based on nearest neighbor searches.

`w` is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to `0`, meaning that only the point itself is excluded when searching for neighbours).

The estimator can be used both for pairwise and conditional transfer entropy estimation.

**Description**

For a given point `jᵢ` in the joint embedding space, this estimator first computes the distance `dᵢ` from `jᵢ` to its `k`-th nearest neighbor. Then, for each point `mₖ[i]` in the `k`-th marginal space, it counts the number of points within radius `dᵢ`.

The transfer entropy is then computed as

\[TE(X \to Y) = \psi(k) + \dfrac{1}{N} \sum_{i=1}^{N} \left[ \psi(m_{T}[i] + 1) - \psi(m_{TTf}[i] + 1) - \psi(m_{ST}[i] + 1) \right],\]

where the subscripts reference the three marginal subspaces `T`, `TTf` and `ST` for which neighbor searches are performed. Here this estimator has been modified to allow for conditioning too (a simple modification to [Lindner2011]'s equations 5 and 6).

`CausalityTools.LocalLikelihood`

— Type

```
LocalLikelihood <: ProbabilitiesEstimator
LocalLikelihood(k = 5, w = 0, metric = Euclidean())
```

The `LocalLikelihood` estimator estimates the density around a given query point by a Gaussian kernel informed by the local mean and covariance.

To form probabilities from the pointwise density estimates, the densities are simply sum-normalized to 1.

**Outcome space**

The `outcome_space` for `LocalLikelihood` is the set of input points.

`CausalityTools.LocalPermutationClosenessSearch`

— Type

The supertype of all types indicating a way of determining "closeness" for the local permutation algorithm.

`CausalityTools.LocalPermutationTest`

— Type

```
LocalPermutationTest <: IndependenceTest
LocalPermutationTest(measure, [est];
kperm::Int = 5,
nshuffles::Int = 100,
rng = Random.default_rng(),
replace = true,
w::Int = 0,
show_progress = false)
```

`LocalPermutationTest` is a generic conditional independence test [Runge2018LocalPerm] for assessing whether two variables `X` and `Y` are conditionally independent given a third variable `Z` (all of which may be multivariate).

When used with `independence`, a `LocalPermutationTestResult` is returned.

**Description**

This is a generic one-sided hypothesis test that checks whether `X` and `Y` are independent (given `Z`, if provided) based on resampling from a null distribution assumed to represent independence between the variables. The null distribution is generated by repeatedly shuffling the input data in some way that is intended to break any dependence between `x` and `y`, but preserve dependencies between `x` and `z`.

The algorithm is as follows:

- Compute the original conditional independence statistic `I(X; Y | Z)`.
- Allocate a scalar valued vector `Î` with space for `nshuffles` elements.
- For `k ∈ [1, 2, …, nshuffles]`, repeat:
    - For each `zᵢ ∈ Z`, let `nᵢ` be the time indices of the `kperm` nearest neighbors of `zᵢ`, excluding the `w` nearest neighbors of `zᵢ` from the neighbor query (i.e. `w` is the Theiler window).
    - Let `xᵢ⋆ = X[j]`, where `j` is randomly sampled from `nᵢ` with replacement. This way, `xᵢ` is replaced with `xⱼ` only if `zᵢ ≈ zⱼ` (`zᵢ` and `zⱼ` are close). Repeat for `i = 1, 2, …, n` and obtain the shuffled `X̂ = [x̂₁, x̂₂, …, x̂ₙ]`.
    - Compute the conditional independence statistic `Iₖ(X̂; Y | Z)` and let `Î[k] = Iₖ(X̂; Y | Z)`.
- Compute the p-value as `count(Î .>= I) / nshuffles`.

In addition to the conditional variant from Runge (2018), we also provide a pairwise version, where the shuffling procedure is identical, except that neighbors in `Y` are used instead of `Z`, and we compute `I(X; Y)` and `Iₖ(X̂; Y)` instead of `I(X; Y | Z)` and `Iₖ(X̂; Y | Z)`.

**Compatible measures**

| Measure | Pairwise | Conditional | Requires `est` | Note |
| --- | --- | --- | --- | --- |
| `PartialCorrelation` | ✖ | ✓ | No | |
| `DistanceCorrelation` | ✖ | ✓ | No | |
| `CMIShannon` | ✖ | ✓ | Yes | |
| `TEShannon` | ✓ | ✓ | Yes | Pairwise tests are not possible with `TransferEntropyEstimator`s, only with lower-level estimators, e.g. `FPVP`, `GaussianMI` or `Kraskov` |
| `PMI` | ✖ | ✓ | Yes | |

The `LocalPermutationTest`

is only defined for conditional independence testing. Exceptions are for measures like `TEShannon`

, which use conditional measures under the hood even for their pairwise variants, and are therefore compatible with `LocalPermutationTest`

.

The nearest-neighbor approach in Runge (2018) can be reproduced by using the `CMIShannon`

measure with the `FPVP`

estimator.

**Examples**

`CausalityTools.LocalPermutationTestResult`

— Type`LocalPermutationTestResult(m, m_surr, pvalue)`

Holds the result of a `LocalPermutationTest`

. `m`

is the measure computed on the original data. `m_surr`

is a vector of the measure computed on permuted data, where `m_surr[i]`

is the measure computed on the `i`

-th permutation. `pvalue`

is the one-sided `p`

-value for the test.

`CausalityTools.LoftsGaarden`

— Type`LoftsGaarden <: ProbabilitiesEstimator`

The `LoftsGaarden`

probabilities estimator is based on the `k`

-th nearest neighbor density estimation from Loftsgaarden & Quesenberry (1965).

It estimates probabilities by first estimating the density locally at each sample point `xᵢ`

using the distance from `xᵢ`

to its `k`

-th nearest neighbor. The density distribution over the sample points is then normalized to form probabilities.

**Outcome space**

The outcome space `Ω`

for `LoftsGaarden`

is the indices of the input data, `1:length(x)`

. The reason for not returning the data points themselves is that duplicate data points may not have the same probabilities (due to having different neighbors).

`CausalityTools.Logistic2Bidir`

— Type```
Logistic2Bidir() <: DiscreteDefinition
Logistic2Bidir(; ui = [0.5, 0.5], c_xy = 0.1, c_yx = 0.1, r₁ = 3.78, r₂ = 3.66,
σ_xy = 0.05, σ_yx = 0.05,
rng = Random.default_rng())
```

A bivariate system consisting of two 1D noisy logistic maps which are bidirectionally interacting Diego2019.

**Equations of motion**

\[\begin{align*} x(t+1) &= r_1 f_{yx}^{t}(1 - f_{yx}^{t}) \\ y(t+1) &= r_2 f_{xy}^{t}(1 - f_{xy}^{t}) \\ f_{xy}^t &= \dfrac{y(t) + c_{xy}(x(t) + \sigma_{xy} \xi_{xy}^t )}{1 + c_{xy} (1 + \sigma_{xy} )} \\ f_{yx}^t &= \dfrac{x(t) + c_{yx}(y(t) + \sigma_{yx} \xi_{yx}^t )}{1 + c_{yx} (1 + \sigma_{yx} )}, \end{align*}\]

Here, the coupling strength $c_{xy}$ controls how strongly species $x$ influences species $y$, and vice versa for $c_{yx}$. To simulate time-varying influence of unobserved processes, we use the dynamical noise terms $\xi_{xy}^t$ and $\xi_{yx}^t$, drawn from a uniform distribution with support on $[0, 1]$. If $\sigma_{xy} > 0$, then the influence of $x$ on $y$ is masked by dynamical noise equivalent to $\sigma_{xy} \xi_{xy}^{t}$ at the $t$-th iteration of the map, and vice versa for $\sigma_{yx}$.

`CausalityTools.Logistic2Unidir`

— Type```
Logistic2Unidir <: DiscreteDefinition
Logistic2Unidir(; xi = [0.5, 0.5], c_xy = 0.1, σ_xy = 0.05, r₁ = 3.78, r₂ = 3.66,
rng = Random.default_rng())
```

A bivariate system consisting of two 1D noisy logistic maps which are unidirectionally coupled `x → y`

Diego2019.

**Equations of motion**

The equations of motion are

\[\begin{align*} x(t+1) &= r_1 x(t)(1 - x(t)) \\ y(t+1) &= r_2 f(x,y)(1 - f(x,y)), \end{align*}\]

with

\[\begin{aligned} f(x,y) = \dfrac{y + \frac{c_{xy}(x \xi )}{2}}{1 + \frac{c_{xy}}{2}(1+ \sigma_{xy} )} \end{aligned}\]

The parameter `c_xy`

controls how strong the dynamical forcing is. If `σ > 0`

, dynamical noise masking the influence of `x`

on `y`

equivalent to $\sigma_{xy} \cdot \xi$ is added at each iteration. Here, $\xi$ is a draw from a flat distribution on $[0, 1]$. Thus, setting `σ_xy = 0.05`

is equivalent to adding dynamical noise corresponding to a maximum of $5 \%$ of the possible range of values of the logistic map.

`CausalityTools.Logistic3CommonDriver`

— Type```
Logistic3CommonDriver() <: DiscreteDefinition
Logistic3CommonDriver(; u₀ = [0.1, 0.2, 0.3],
r = 4.0, σx = 0.05, σy = 0.05, σz = 0.05,
rng = Random.default_rng())
```

A discrete dynamical system consisting of three coupled 1D logistic maps representing the response of two independent dynamical variables to the forcing from a common driver Runge2018. The dynamical influence goes in the directions $Z \to X$ and $Z \to Y$.

**Equations of motion**

The equations of motion are

\[\begin{align*} x(t+1) &= (x(t)(r - r x(t) - z(t) + \sigma_x \eta_x)) \mod 1 \\ y(t+1) &= (y(t)(r - r y(t) - z(t) + \sigma_y \eta_y)) \mod 1 \\ z(t+1) &= (z(t)(r - r z(t) + \sigma_z \eta_z)) \mod 1 \end{align*}\]

Dynamical noise may be added to each of the dynamical variables by tuning the parameters `σx`, `σy` and `σz`. The default value `r = 4.0` is set such that the system exhibits chaotic behaviour.

`CausalityTools.Logistic4Chain`

— Type```
Logistic4Chain <: DiscreteDefinition
Logistic4Chain(; xi = rand(4),
rx = 3.9, ry = 3.6, rz = 3.6, rw = 3.8,
cxy = 0.4, cyz = 0.4, cyw = 0.35,
rng = Random.default_rng())
```

A variant of `Logistic2Bidir`

where four variables `X`

, `Y`

, `Z`

, `W`

are coupled in a chain `X → Y → Z → W`

with dynamical noise.

**Description**

The equations of motion are

\[\begin{align*} x(t+1) &= r_x x(t)(1 - x(t)) \\ y(t+1) &= r_y f_{xy}^{t}(1 - f_{xy}^{t}) \\ z(t+1) &= r_z f_{yz}^{t}(1 - f_{yz}^{t}) \\ w(t+1) &= r_w f_{zw}^{t}(1 - f_{zw}^{t}) \\ f_{xy}^t &= \dfrac{y(t) + c_{xy}(x(t) + \sigma_{xy} \xi_{xy}^t )}{1 + c_{xy} (1 + \sigma_{xy} )} \\ f_{yz}^t &= \dfrac{z(t) + c_{yz}(y(t) + \sigma_{yz} \xi_{yz}^t )}{1 + c_{yz} (1 + \sigma_{yz} )}, \\ f_{zw}^t &= \dfrac{w(t) + c_{zw}(z(t) + \sigma_{zw} \xi_{zw}^t )}{1 + c_{zw} (1 + \sigma_{zw} )}, \end{align*}\]

where `c_{xy}`

, `c_{yz}`

, `c_{zw}`

control the coupling strength from `x`

to `y`

, `y`

to `z`

, and `z`

to `w`

, respectively. The magnitude of dynamical noise in these couplings is controlled by $\sigma_{xy}$, $\sigma_{yz}$, and $\sigma_{zw}$, respectively. $\xi_{xy}$, $\xi_{yz}$, and $\xi_{zw}$ are noise terms drawn at each iteration from independent uniform distributions over the unit interval.

With default parameters and all dynamical noise terms set to zero, this is the system from Ye2015 (but note that for some initial conditions, this system wanders off to $\pm \infty$ for some of the variables).

`CausalityTools.LorenzBidir6`

— Type```
LorenzBidir6 <: ContinuousDefinition
LorenzBidir6(; xi = [0.1, 0.05, 0.2, 0.2, 0.25, 0.3],
c_xy = 0.2, c_yx = 0.2,
a₁ = 10, a₂ = 28, a₃ = 8/3,
b₁ = 10, b₂ = 28, b₃ = 9/3)
```

A bidirectionally coupled Lorenz-Lorenz system, where each subsystem is a 3D Lorenz system (Amigo & Hirata, 2018)^{[Amigó2018]}.

**Description**

The dynamics is generated by the following vector field

\[\begin{align*} \dot{x_1} &= -a_1 (x_1 - x_2) + c_{yx}(y_1 - x_1) \\ \dot{x_2} &= -x_1 x_3 + a_2 x_1 - x_2 \\ \dot{x_3} &= x_1 x_2 - a_3 x_3 \\ \dot{y_1} &= -b_1 (y_1 - y_2) + c_{xy} (x_1 - y_1) \\ \dot{y_2} &= -y_1 y_3 + b_2 y_1 - y_2 \\ \dot{y_3} &= y_1 y_2 - b_3 y_3 \end{align*}\]

Default values for the parameters `a₁`

, `a₂`

, `a₃`

, `b₁`

, `b₂`

, `b₃`

are as in ^{[Amigó2018]}.

`CausalityTools.LorenzForced9`

— Type```
LorenzForced9{V} <: ContinuousDefinition
LorenzForced9(; xi = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
c_xy = 0.1, c_yx = 0.1,
c_zx = 0.05, c_zy = 0.05,
a₁ = 10, a₂ = 28, a₃ = 8/3,
b₁ = 10, b₂ = 28, b₃ = 8/3,
c₁ = 10, c₂ = 28, c₃ = 8/3)
```

A system consisting of two bidirectionally coupled 3D Lorenz systems forced by an external 3D Lorenz system (Amigó & Hirata, 2018).

**Description**

The dynamics is generated by the following vector field

\[\begin{align*} \dot{x_1} &= - a_1 (x_1 - x_2) + c_{yx}(y_1 - x_1) + c_{zx}(z_1 - x_1) \\ \dot{x_2} &= - x_1 x_3 + a_2 x_1 - x_2 \\ \dot{x_3} &= x_1 x_2 - a_3 x_3 \\ \dot{y_1} &= -b_1 (y_1 - y_2) + c_{xy} (x_1 - y_1) + c_{zy}(z_1 - y_1) \\ \dot{y_2} &= - y_1 y_3 + b_2 y_1 - y_2 \\ \dot{y_3} &= y_1 y_2 - b_3 y_3 \\ \dot{z_1} &= - c_1 (z_1 - z_2) \\ \dot{z_2} &= - z_1 z_3 + c_2 z_1 - z_2 \\ \dot{z_3} &= z_1 z_2 - c_3 z_3 \end{align*}\]

`CausalityTools.LorenzTransitive9`

— Type```
LorenzTransitive9 <: ContinuousDefinition
LorenzTransitive9(; xi = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9, 0.9],
σ₁ = 10.0, σ₂ = 10.0, σ₃ = 10.0,
ρ₁ = 28.0, ρ₂ = 28.0, ρ₃ = 28.0,
β₁ = 8/3, β₂ = 8/3, β₃ = 8.3,
c₁₂ = 1.0, c₂₃ = 1.0)
```

Initialise a dynamical system consisting of three coupled Lorenz attractors with a transitive causality chain where X₁ → X₂ and X₂ → X₃. In total, the three 3D-subsystems create a 9-dimensional dynamical system.

The strength of the forcing X₁ → X₂ is controlled by the parameter `c₁₂`

, and the forcing from X₂ → X₃ by `c₂₃`

. The remaining parameters are the usual parameters for the Lorenz system, where the subscript `i`

refers to the subsystem Xᵢ.

**Equations of motion**

The dynamics is generated by the following vector field

\[\begin{aligned} \dot{x_1} &= \sigma_1(y_1 - x_1) \\ \dot{y_1} &= \rho_1 x_1 - y_1 - x_1 z_1 \\ \dot{z_1} &= x_1 y_1 - \beta_1 z_1 \\ \dot{x_2} &= \sigma_2 (y_2 - x_2) + c_{12}(x_1 - x_2) \\ \dot{y_2} &= \rho_2 x_2 - y_2 - x_2 z_2 \\ \dot{z_2} &= x_2 y_2 - \beta_2 z_2 \\ \dot{x_3} &= \sigma_3 (y_3 - x_3) + c_{23} (x_2 - x_3) \\ \dot{y_3} &= \rho_3 x_3 - y_3 - x_3 z_3 \\ \dot{z_3} &= x_3 y_3 - \beta_3 z_3 \end{aligned}\]

**Usage in literature**

This system was studied by Papana et al. (2013) for coupling strengths $c_{12} = 0, 1, 3, 5$ and $c_{23} = 0, 1, 3, 5$.

`CausalityTools.MCR`

— Type```
MCR <: AssociationMeasure
MCR(; r, metric = Euclidean())
```

An association measure based on mean conditional probabilities of recurrence (MCR) introduced by Romano2007.

`r`

is mandatory keyword which specifies the recurrence threshold when constructing recurrence matrices. It can be instance of any subtype of `AbstractRecurrenceType`

from RecurrenceAnalysis.jl. To use any `r`

that is not a real number, you have to do `using RecurrenceAnalysis`

first. The `metric`

is any valid metric from Distances.jl.

**Usage**

- Use with
`independence`

to perform a formal hypothesis test for pairwise association. - Use with
`mcr`

to compute the raw MCR for pairwise association.

**Description**

For input variables `X`

and `Y`

, the conditional probability of recurrence is defined as

\[M(X | Y) = \dfrac{1}{N} \sum_{i=1}^N p(\bf{y_i} | \bf{x_i}) = \dfrac{1}{N} \sum_{i=1}^N \dfrac{\sum_{j=1}^N J_{R_{i, j}}^{X, Y}}{\sum_{j=1}^N R_{i, j}^X},\]

where $R_{i, j}^X$ is the recurrence matrix and $J_{R_{i, j}}^{X, Y}$ is the joint recurrence matrix, constructed using the given `metric`

. The measure $M(Y | X)$ is defined analogously.

Romano2007's interpretation of this quantity is that if `X`

drives `Y`

, then `M(X|Y) > M(Y|X)`

, if `Y`

drives `X`

, then `M(Y|X) > M(X|Y)`

, and if coupling is symmetric, then `M(Y|X) = M(X|Y)`

.

**Input data**

`X`

and `Y`

can be either both univariate timeseries, or both multivariate `StateSpaceSet`

s.
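
For instance, a minimal sketch on toy data (assuming `mcr(measure, x, y)` is the calling convention implied by the **Usage** section; the threshold value is an illustrative choice):

```
using CausalityTools

x, y = rand(800), rand(800)   # two independent toy time series
m = MCR(r = 0.3)              # recurrence threshold r = 0.3 (illustrative)
mcr(m, x, y)                  # raw MCR in the direction X | Y
mcr(m, y, x)                  # raw MCR in the direction Y | X
```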

`CausalityTools.MIH3`

— TypeThe supertype for all H3-type (three-entropies) decompositions of mutual information.

`CausalityTools.MIRenyiJizba`

— Type`MIRenyiJizba <: MutualInformation`

The Rényi mutual information $I_q^{R_{J}}(X; Y)$ defined in Jizba2012.

**Usage**

- Use with
`independence`

to perform a formal hypothesis test for pairwise dependence. - Use with
`mutualinfo`

to compute the raw mutual information.

**Definition**

\[I_q^{R_{J}}(X; Y) = S_q^{R}(X) + S_q^{R}(Y) - S_q^{R}(X, Y),\]

where $S_q^{R}(\cdot)$ and $S_q^{R}(\cdot, \cdot)$ are the `Rényi`

entropy and the joint Rényi entropy, respectively.

`CausalityTools.MIRenyiSarbu`

— Type```
MIRenyiSarbu <: MutualInformation
MIRenyiSarbu(; base = 2, q = 1.5)
```

The discrete Rényi mutual information from Sarbu2014.

**Usage**

- Use with
`independence`

to perform a formal hypothesis test for pairwise dependence. - Use with
`mutualinfo`

to compute the raw mutual information.

**Description**

Sarbu (2014) defines discrete Rényi mutual information as the Rényi $\alpha$-divergence between the joint probability mass function $p(x, y)$ and the product of the marginals, $p(x) \cdot p(y)$:

\[I(X, Y)^R_q = \dfrac{1}{q-1} \log \left( \sum_{x \in X, y \in Y} \dfrac{p(x, y)^q}{\left( p(x)\cdot p(y) \right)^{q-1}} \right)\]

See also: `mutualinfo`

.

`CausalityTools.MIShannon`

— Type```
MIShannon <: MutualInformation
MIShannon(; base = 2)
```

The Shannon mutual information $I^S(X; Y)$.

**Usage**

- Use with
`independence`

to perform a formal hypothesis test for pairwise dependence. - Use with
`mutualinfo`

to compute the raw mutual information.

**Discrete definition**

There are many equivalent formulations of discrete Shannon mutual information. In this package, we currently use the double-sum and the three-entropies formulations.

**Double sum formulation**

Assume we observe samples $\bar{\bf{X}}_{1:N_x} = \{\bar{\bf{X}}_1, \ldots, \bar{\bf{X}}_{N_x} \}$ and $\bar{\bf{Y}}_{1:N_y} = \{\bar{\bf{Y}}_1, \ldots, \bar{\bf{Y}}_{N_y} \}$ from two discrete random variables $X$ and $Y$ with finite supports $\mathcal{X} = \{ x_1, x_2, \ldots, x_{M_x} \}$ and $\mathcal{Y} = \{ y_1, y_2, \ldots, y_{M_y} \}$. The double-sum estimate is obtained by replacing the probabilities in the double sum

\[\hat{I}_{DS}(X; Y) = \sum_{x_i \in \mathcal{X}, y_j \in \mathcal{Y}} \hat{p}(x_i, y_j) \log \left( \dfrac{\hat{p}(x_i, y_j)}{\hat{p}(x_i)\hat{p}(y_j)} \right),\]

with the plug-in estimates $\hat{p}(x_i) = \frac{n(x_i)}{N_x}$, $\hat{p}(y_j) = \frac{n(y_j)}{N_y}$, and $\hat{p}(x_i, y_j) = \frac{n(x_i, y_j)}{N}$, where $N = N_x N_y$. This definition is used by `mutualinfo`

when called with a `ContingencyMatrix`

.

**Three-entropies formulation**

An equivalent formulation of discrete Shannon mutual information is

\[I^S(X; Y) = H^S(X) + H^S(Y) - H^S(X, Y),\]

where $H^S(\cdot)$ and $H^S(\cdot, \cdot)$ are the marginal and joint discrete Shannon entropies. This definition is used by `mutualinfo`

when called with a `ProbabilitiesEstimator`

.

**Differential mutual information**

One possible formulation of differential Shannon mutual information is

\[I^S(X; Y) = h^S(X) + h^S(Y) - h^S(X, Y),\]

where $h^S(\cdot)$ and $h^S(\cdot, \cdot)$ are the marginal and joint differential Shannon entropies. This definition is used by `mutualinfo`

when called with a `DifferentialEntropyEstimator`

.

See also: `mutualinfo`

.
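
As a small sketch of the three-entropies route (assuming `mutualinfo(measure, est, x, y)` dispatches on a `DifferentialEntropyEstimator` as described above; the data and parameter choices are illustrative):

```
using CausalityTools

x = randn(1000)
y = x .+ randn(1000)                       # y is correlated with x
measure = MIShannon(base = 2)
mutualinfo(measure, Kraskov(k = 4), x, y)  # differential MI via the Kraskov entropy estimator
```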

`CausalityTools.MITsallisFuruichi`

— Type```
MITsallisFuruichi <: MutualInformation
MITsallisFuruichi(; base = 2, q = 1.5)
```

The discrete Tsallis mutual information from Furuichi2006, which in that paper is called the *mutual entropy*.

**Usage**

- Use with
`independence`

to perform a formal hypothesis test for pairwise dependence. - Use with
`mutualinfo`

to compute the raw mutual information.

**Description**

Furuichi's Tsallis mutual entropy between variables $X \in \mathbb{R}^{d_X}$ and $Y \in \mathbb{R}^{d_Y}$ is defined as

\[I_q^T(X; Y) = H_q^T(X) - H_q^T(X | Y) = H_q^T(X) + H_q^T(Y) - H_q^T(X, Y),\]

where $H_q^T(\cdot)$ and $H_q^T(\cdot, \cdot)$ are the marginal and joint Tsallis entropies, and `q`

is the `Tsallis`

-parameter.

See also: `mutualinfo`

.

`CausalityTools.MITsallisMartin`

— Type```
MITsallisMartin <: MutualInformation
MITsallisMartin(; base = 2, q = 1.5)
```

The discrete Tsallis mutual information from Martin2004.

**Usage**

- Use with
`independence`

to perform a formal hypothesis test for pairwise dependence. - Use with
`mutualinfo`

to compute the raw mutual information.

**Description**

Martin et al.'s Tsallis mutual information between variables $X \in \mathbb{R}^{d_X}$ and $Y \in \mathbb{R}^{d_Y}$ is defined as

\[I_{\text{Martin}}^T(X, Y, q) := H_q^T(X) + H_q^T(Y) - (1 - q) H_q^T(X) H_q^T(Y) - H_q^T(X, Y),\]

where $H_q^T(\cdot)$ and $H_q^T(\cdot, \cdot)$ are the marginal and joint Tsallis entropies, and `q`

is the `Tsallis`

-parameter.

See also: `mutualinfo`

.

`CausalityTools.MMeasure`

— Type```
MMeasure <: AssociationMeasure
MMeasure(; K::Int = 2, dx = 2, dy = 2, τx = -1, τy = -1, w = 0)
```

The `MMeasure`

Andrzejak2003 is a pairwise association measure. It quantifies the probability with which close states of a target timeseries/embedding are mapped to close states of a source timeseries/embedding.

Note that `τx`

and `τy`

are negative by convention. See docstring for `SMeasure`

for an explanation.

**Usage**

- Use with
`independence`

to perform a formal hypothesis test for directional dependence. - Use with
`m_measure`

to compute the raw m-measure statistic.

**Description**

The `MMeasure`

is based on `SMeasure`

and `HMeasure`

. It is given by

\[M^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \dfrac{R_i(x) - R_i^{(k)}(x|y)}{R_i(x) - R_i^k(x)},\]

where $R_i(x)$ is computed as for `HMeasure`

, while $R_i^k(x)$ and $R_i^{(k)}(x|y)$ is computed as for `SMeasure`

. Parameters also have the same meaning as for `SMeasure`

/`HMeasure`

.

`CausalityTools.MediatedLink9`

— Type```
MediatedLink9 <: ContinuousDefinition
MediatedLink9(; xi = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
ωx = 1.0, ωy = 1.015, ωz = 0.985,
k = 0.15, l = 0.2, m = 10.0,
c = 0.06) → ContinuousDynamicalSystem
```

A three-subsystem dynamical system where `X`

and `Y`

are driven by `Z`

(Krakovská, 2018)^{[Krakovská2018]}.

**Description**

The dynamics is generated by the following vector field

\[\begin{aligned} dx_1 &= -\omega_x x_2 - x_3 + c(z_1 - x_1) \\ dx_2 &= \omega_x x_1 + k x_2 \\ dx_3 &= l + x_3(x_1 - m) \\ dy_1 &= -\omega_y y_2 - y_3 + c(z_1 - y_1) \\ dy_2 &= \omega_y y_1 + k y_2 \\ dy_3 &= l + y_3(y_1 - m) \\ dz_1 &= -\omega_z z_2 - z_3 \\ dz_2 &= \omega_z z_1 + k z_2 \\ dz_3 &= l + z_3(z_1 - m) \end{aligned}\]

At the default value of the coupling constant `c = 0.06`

, the responses `X`

and `Y`

are already synchronized to the driver `Z`

.

`CausalityTools.MesnerShalizi`

— Type```
MesnerShalizi <: ConditionalMutualInformationEstimator
MesnerShalizi(k = 1, w = 0)
```

The `MesnerShalizi`

estimator estimates conditional mutual information for data that can be mixtures of discrete and continuous variables MesnerShalizi2020.

`CausalityTools.MutualInformation`

— Type`MutualInformation <: AssociationMeasure`

The supertype of all mutual information measures. Concrete subtypes are `MIShannon`, `MIRenyiJizba`, `MIRenyiSarbu`, `MITsallisFuruichi` and `MITsallisMartin`.

`CausalityTools.MutualInformationDefinition`

— TypeThe supertype for mutual information definitions.

`CausalityTools.MutualInformationEstimator`

— Type`MutualInformationEstimator`

The supertype of all dedicated mutual information estimators.

`MutualInformationEstimator`

s can be either mixed, discrete or a combination of both. Each estimator uses a specialized technique to approximate relevant densities/integrals and/or probabilities, and is typically tailored to a specific type of `MutualInformation`

(mostly `MIShannon`

).

`CausalityTools.NeighborCloseness`

— Type`NeighborCloseness <: LocalPermutationClosenessSearch`

Determine closeness between points for `LocalPermutationTest`

using nearest neighbors searches.

`CausalityTools.Nonlinear3`

— Type```
Nonlinear3 <: DiscreteDefinition
Nonlinear3(; xi = rand(3),
σ₁ = 1.0, σ₂ = 1.0, σ₃ = 1.0,
a₁ = 3.4, a₂ = 3.4, a₃ = 3.4,
b₁ = 0.4, b₂ = 0.4, b₃ = 0.4,
c₁₂ = 0.5, c₂₃ = 0.3, c₁₃ = 0.5,
rng = Random.default_rng())
```

A 3d nonlinear system with nonlinear couplings $x_1 \to x_2$, $x_2 \to x_3$ and $x_1 \to x_3$. Modified from Gourévitch et al. (2006)[Gourévitch2006].

**Equations of motion**

\[\begin{aligned} x_1(t+1) &= a_1 x_1(t)(1-x_1(t))^2 e^{-x_1(t)^2} + \sigma_1 \xi_{1}(t) \\ x_2(t+1) &= a_2 x_2(t)(1-x_2(t))^2 e^{-x_2(t)^2} + \sigma_2 \xi_{2}(t) + c_{12} x_1(t) x_2(t) \\ x_3(t+1) &= a_3 x_3(t)(1-x_3(t))^2 e^{-x_3(t)^2} + \sigma_3 \xi_{3}(t) + c_{23} x_{2}(t) + c_{13} x_{1}(t)^2. \end{aligned}\]

[Gourévitch2006]: Gourévitch, B., Le Bouquin-Jeannès, R., & Faucon, G. (2006). Linear and nonlinear causality between signals: methods, examples and neurophysiological applications. Biological Cybernetics, 95(4), 349–369.

`CausalityTools.OCE`

— Type```
OCE <: GraphAlgorithm
OCE(; utest::IndependenceTest = SurrogateTest(MIShannon(), KSG2(k = 3, w = 3)),
ctest::C = LocalPermutationTest(CMIShannon(), MesnerShalizi(k = 3, w = 3)),
τmax::T = 1, α = 0.05)
```

The optimal causation entropy (OCE) algorithm for causal discovery Sun2015.

**Description**

The OCE algorithm has three steps to determine the parents of a variable `xᵢ`:

- Perform pairwise independence tests using `utest` and select the variable `xⱼ(-τ)` that has the highest significant association (i.e. with associated p-value below `α`) with `xᵢ(0)`. Assign it to the set of selected parents `P`.
- Perform conditional independence tests using `ctest`, finding the parent `Pₖ` that has the highest association with `xᵢ` given the already selected parents, and add it to `P`. Repeat until no more variables with significant association are found.
- Backwards elimination of parents `Pₖ` of `xᵢ(0)` for which `xᵢ(0) ⫫ Pₖ | P - {Pₖ}`, where `P` is the set of parent nodes found in the previous steps.

`τmax`

indicates the maximum lag `τ`

between the target variable `xᵢ(0)`

and its potential parents `xⱼ(-τ)`

. Sun et al. 2015's method is based on `τmax = 1`

.

**Returns**

When used with `infer_graph`

, it returns a vector `p`

, where `p[i]`

are the parents for each input variable. This result can be converted to a `SimpleDiGraph`

from Graphs.jl (see example).

**Usage**

`OCE`

is used with `infer_graph`

to infer the parents of the input data. Input data must either be a `Vector{Vector{<:Real}}`

, or a `StateSpaceSet`

.

**Examples**

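A minimal sketch on toy data with a known lagged chain `x → y → z`, using the default `utest`/`ctest` (the data generation is ad hoc and all parameter choices are illustrative):

```
using CausalityTools, Random

rng = Xoshiro(1234)
n = 500
x = randn(rng, n)
y = 0.7 .* circshift(x, 1) .+ 0.3 .* randn(rng, n)  # y driven by lagged x
z = 0.7 .* circshift(y, 1) .+ 0.3 .* randn(rng, n)  # z driven by lagged y
parents = infer_graph(OCE(τmax = 1), [x, y, z])     # Vector{OCESelectedParents}
```
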
`CausalityTools.OCESelectedParents`

— Type`OCESelectedParents`

A simple struct for storing the parents of a single variable `xᵢ`

inferred by the `OCE`

algorithm. When using `OCE`

with `infer_graph`

, a `Vector{OCESelectedParents}`

is returned - one per variable in the input data.

**Assumptions and notation**

Assumes the input `x`

is a `Vector{Vector{<:Real}}`

or a `StateSpaceSet`

(for which each column is treated as a variable). It contains the following fields, where we use the notation `xₖ(τ)`

to indicate the `k`

-th variable lagged by time-lag `τ`

. For example, `x₂(-3)`

is the variable `x[2]`

lagged by 3 time steps.

**Fields**

- `i`: The index of the target variable (i.e. `xᵢ(0)` is the target).
- `all_idxs`: The possible *variable* indices of parent variables (i.e. `1:M`, where `M` is the number of input variables).
- `parents_js`: The *variable* indices of the selected parent variables, one per selected parent.
- `parents_τs`: The *lags* for the selected parent variables, one per selected parent.
- `parents`: A vector containing the raw, time-lagged data for each selected parent variable. Let `τ = parents_τs[k]` and `j = parents_js[k]`. Then `parents[k]` is the raw data for the variable `xⱼ(-τ)`.

`CausalityTools.OptimiseTraditional`

— Type```
OptimiseTraditional(; η = 1, maxlag = 50, maxdim = 10,
method = delay_ifnn, dmethod = "mi_min")
```

Optimize embedding parameters using traditional delay embedding optimization methods with a maximum lag of `maxlag`

and a maximum dimension of `maxdim`

. `method`

can be either `delay_ifnn`

, `delay_fnn`

or `delay_f1nn`

.

`CausalityTools.PA`

— Type```
PA <: CausalityTools.AssociationMeasure
PA(ηT = 1:5, τS = 1, τC = 1)
```

The modified predictive asymmetry measure (Haaga et al., in revision).

This is an experimental measure. It is part of an ongoing paper submission revision, but is provided here for convenience.

**Usage**

- Use with
`independence`

to perform a formal hypothesis test for pairwise or conditional directional dependence. - Use with
`asymmetry`

to compute the raw asymmetry distribution.

**Keyword arguments**

- `ηT`: The prediction lags for the target variable.
- `τS`: The embedding delay(s) for the source variable.
- `τC`: The embedding delay(s) for the conditional variable(s).

All parameters are given as a single integer or multiple integers greater than zero.

**Compatible estimators**

`PA`

/`asymmetry`

uses `condmutualinfo`

under the hood. Any estimator that can be used for `ConditionalMutualInformation`

can therefore, in principle, be used with the predictive asymmetry. We recommend using `FPVP`

, or one of the other dedicated conditional mutual information estimators.

**Examples**

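A minimal sketch (assuming `asymmetry(measure, est, x, y)` is the calling convention implied by the **Usage** section; the data and parameters are illustrative):

```
using CausalityTools

x = randn(800)
y = 0.6 .* circshift(x, 1) .+ 0.4 .* randn(800)  # toy unidirectional coupling x → y
asymmetry(PA(ηT = 1:5), FPVP(k = 5), x, y)       # positive values are indicative of x → y
```
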
`CausalityTools.PATest`

— Type```
PATest <: IndependenceTest
PATest(measure::PA, est)
```

A test for directional (conditional) dependence based on the modified `PA`

measure, estimated using the given estimator `est`

. Compatible estimators are listed in the docstring for `PA`

.

When used with `independence`

, a `PATestResult`

summary is returned.

This is an experimental test. It is part of an ongoing paper submission revision, but is provided here for convenience.

`CausalityTools.PATestResult`

— Type`PATestResult(n_vars, ΔA, ttest, pvalue)`

Holds the result of a `SurrogateTest`

. `n_vars`

is the number of variables used for the test (2 for pairwise, 3 for conditional). `ΔA`

is the distribution of asymmetries, one for each `η`

. `ttest`

is a one-sample t-test, and `pvalue`

is the right-tailed p-value for the test.

`CausalityTools.PC`

— Type```
PC <: GraphAlgorithm
PC(pairwise_test, conditional_test;
α = 0.05, max_depth = Inf, maxiters_orient = Inf)
```

The PC algorithm Spirtes2000, named after the *first names* of the authors, **P**eter Spirtes and **C**lark Glymour, is implemented as described in Kalisch2008.

**Arguments**

- `pairwise_test`: An `IndependenceTest` that uses a pairwise, nondirectional `AssociationMeasure` (e.g. a parametric `CorrTest`, or `SurrogateTest` with the `MIShannon` measure).
- `conditional_test`: An `IndependenceTest` that uses a conditional, nondirectional `AssociationMeasure` (e.g. `CorrTest`, or `SurrogateTest` with the `CMIShannon` measure).

**Keyword arguments**

- `α::Real`: The significance level of the test.
- `max_depth`: The maximum level of conditional independence tests to be performed. By default, there is no limit (i.e. `max_depth = Inf`), meaning that the maximum depth is `N - 2`, where `N` is the number of input variables.
- `maxiters_orient::Real`: The maximum number of times to apply the orientation rules. By default, there is no limit (i.e. `maxiters_orient = Inf`).

During the skeleton search phase, if a significant association between two nodes is found, then a bidirectional edge is drawn between them. The generic implementation of `PC`

therefore doesn't currently handle directional measures such as `TEShannon`

. The reason is that if a directional relationship `X → Y`

exists between two nodes `X`
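A minimal sketch for conditional independence testing on toy data where `x` and `y` are associated only through `z` (assuming the `FPVP` estimator accepts a `k` keyword like the other nearest-neighbor estimators on this page; parameter values are illustrative):

```
using CausalityTools, Random

rng = Xoshiro(1234)
n = 1000
z = randn(rng, n)
x = z .+ 0.5 .* randn(rng, n)  # x depends on z
y = z .+ 0.5 .* randn(rng, n)  # y depends on z, but not directly on x
test = LocalPermutationTest(CMIShannon(), FPVP(k = 10); nshuffles = 100, rng = rng)
independence(test, x, y, z)    # returns a LocalPermutationTestResult
```
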

and `Y`

, then the algorithm would first draw a bidirectional arrow between `X`

and `Y`

when analysing the direction `X → Y`

, and then removing it again when analysing in the direction `Y → X`

(a similar situation would also occur for the conditional stage). This will be fixed in a future release. For now, use nondirectional measures, e.g. `MIShannon`

and `CMIShannon`

!

**Description**

When used with `infer_graph`

on some input data `x`

, the `PC`

algorithm performs the following steps:

- Initialize an empty fully connected graph `g` with `N` nodes, where `N` is the number of variables and `x[i]` is the data for the `i`-th node.
- Reduce the fully connected `g` to a skeleton graph by performing pairwise `independence` tests between all vertices using `pairwise_test`. Remove any edges where adjacent vertices are found to be independent according to the test (i.e. the null hypothesis of independence cannot be rejected at significance level `1 - α`).
- Thin the skeleton `g` by conditional `independence` testing. If `x[i] ⫫ x[j] | x[Z]` for some set of variables `Z` (not including `i` and `j`) according to `conditional_test` (i.e. the null hypothesis of conditional independence cannot be rejected at significance level `1 - α`), then the edge between `i` and `j` is removed, and we record the separating set `S(i, j) = Z`. Independence tests are first performed for conditioning sets of size 1, and repeated for conditioning sets of increasing size, which in most cases limits the number of tests needed. The separating sets `S(i, j)` record which variables were in the conditioning set that rendered variables `i` and `j` independent. If `max_depth` is an integer, then this procedure is performed on conditioning sets of sizes `1:max_depth`, and if `max_depth == nothing`, then all possible conditioning set sizes are potentially used.
- Create a directed graph `dg` from `g` by replacing every undirected edge `X - Y` in `g` by the bidirectional edge `X ↔ Y` (i.e. construct two directional edges `X → Y` and `Y → X`). Orientation rules 0-3 are then repeatedly applied to `dg` until no more edges can be oriented:
    - Rule 0 (orients v-structures): `X ↔ Y ↔ Z` becomes `X → Y ← Z` if `Y` is not in the separating set `S(X, Z)`.
    - Rule 1 (prevents new v-structures): `X → Y ↔ Z` becomes `X → Y → Z` if `X` and `Z` are not adjacent.
    - Rule 2 (avoids cycles): `X → Y → Z ↔ X` becomes `X → Y → Z ← X`.
    - Rule 3: To avoid creating cycles or new v-structures, whenever `X - Y → Z`, `X - W → Z`, and `X - Z` but there is no edge between `Y` and `W`, turn the undirected `X - Z` edge into the directed edge `X → Z`.

The resulting directed graph (a `SimpleDiGraph`

from Graphs.jl) is then returned.

**Examples**

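A minimal sketch using the parametric `CorrTest` for both stages (assuming `CorrTest()` can be constructed without arguments; the data are illustrative):

```
using CausalityTools

x = [randn(500) for _ in 1:4]              # four toy variables
alg = PC(CorrTest(), CorrTest(); α = 0.05)
g = infer_graph(alg, x)                    # a SimpleDiGraph from Graphs.jl
```
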
`CausalityTools.PMI`

— Type```
PMI <: AssociationMeasure
PMI(; base = 2)
```

The partial mutual information (PMI) measure of association Zhao2016.

**Definition**

PMI is defined for variables $X$, $Y$ and $Z$ as

\[PMI(X; Y | Z) = D(p(x, y, z) || p^{*}(x|z) p^{*}(y|z) p(z)),\]

where $p(x, y, z)$ is the joint distribution for $X$, $Y$ and $Z$, and $D(\cdot, \cdot)$ is the extended Kullback-Leibler divergence from $p(x, y, z)$ to $p^{*}(x|z) p^{*}(y|z) p(z)$. See Zhao et al. (2016) for details.

**Estimation**

PMI can be estimated using any `ProbabilitiesEstimator`

that implements `marginal_encodings`

. This allows estimation of 3D contingency matrices, from which relevant probabilities for the PMI formula are extracted. See also `pmi`

.

**Properties**

For the discrete case, the following identities hold in theory (when estimating PMI, they may not):

- `PMI(X, Y, Z) >= CMI(X, Y, Z)` (where CMI is the Shannon CMI). Holds in theory, but when estimating PMI, the identity may not hold.
- `PMI(X, Y, Z) >= 0`. Holds both in theory and for estimation using `ProbabilitiesEstimator`s.
- `X ⫫ Y | Z => PMI(X, Y, Z) = CMI(X, Y, Z) = 0` (in theory, but not necessarily for estimation).
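
A minimal sketch of discretization-based estimation (assuming `pmi(measure, est, x, y, z)` is the calling convention hinted at above, and that `ValueHistogram` implements `marginal_encodings`; all choices are illustrative):

```
using CausalityTools

x, y, z = rand(1000), rand(1000), rand(1000)
est = ValueHistogram(RectangularBinning(3))  # coarse binning of each marginal
pmi(PMI(base = 2), est, x, y, z)             # expected ≈ 0 for independent data
```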

`CausalityTools.PairwiseAsymmetricInference`

— Type```
PairwiseAsymmetricInference <: CrossmapMeasure
PairwiseAsymmetricInference(; d::Int = 2, τ::Int = -1, w::Int = 0,
f = Statistics.cor, embed_warn = true)
```

The pairwise asymmetric inference (PAI) cross mapping measure McCracken2014 is a version of `ConvergentCrossMapping`

that searches for neighbors in *mixed* embeddings (i.e. both source and target variables included); otherwise, the algorithms are identical.

Specifies the embedding dimension `d`

and the embedding lag `τ`

to be used, as described below, with `predict`

or `crossmap`

. The Theiler window `w`

controls how many temporal neighbors are excluded during neighbor searches (`w = 0`

means that only the point itself is excluded). `f`

is a function that computes the agreement between observations and predictions (the default, `f = Statistics.cor`

, gives the Pearson correlation coefficient).

**Embedding**

There are many possible ways of defining the embedding for PAI. Currently, we only implement the *"add one non-lagged source timeseries to an embedding of the target"* approach, which is used as an example in McCracken & Weigel's paper. Specifically: Let `S(i)`

be the source time series variable and `T(i)`

be the target time series variable. `PairwiseAsymmetricInference`

produces regular embeddings with fixed dimension `d`

and embedding lag `τ`

as follows:

\[(S(i), T(i+(d-1)\tau), \ldots, T(i+2\tau), T(i+\tau), T(i))_{i=1}^{N-(d-1)\tau}.\]

In this joint embedding, neighbor searches are performed in the subspace spanned by the first `D`

variables, while the last variable is to be predicted.

With this convention, `τ < 0`

implies "past/present values of source used to predict target", and `τ > 0`

implies "future/present values of source used to predict target". The latter case may not be meaningful for many applications, so by default, a warning will be given if `τ > 0`

(`embed_warn = false`

turns off warnings).
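
A minimal cross-mapping sketch (assuming `crossmap(measure, est, target, source)` is the calling convention, using the `RandomVectors` library estimator documented elsewhere on this page; all choices are illustrative):

```
using CausalityTools

x = randn(500)                                    # source
y = 0.8 .* circshift(x, 1) .+ 0.2 .* randn(500)   # target, driven by x
measure = PairwiseAsymmetricInference(d = 3, τ = -1)
est = RandomVectors(libsizes = 50:50:200, replace = true)
crossmap(measure, est, y, x)                      # one correlation per library size
```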

`CausalityTools.Pal`

— Type```
Pal <: DifferentialEntropyEstimator
Pal(; k = 1, w = 0, p = 1.0, n::Int = 10000)
```

A `Shannon`

and `Renyi`

differential entropy estimator (Pál et al., 2010).

`w`

is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to `0`

, meaning that only the point itself is excluded when searching for neighbours).

**Description**

Pál et al. (2010)'s estimator is based on generalized nearest neighbor graphs. It is similar to several other kNN-based estimators (e.g. `LeonenkoProzantoSavani`

). Given samples $\bf{X}_{1:n} = \{\bf{X}_1, \bf{X}_2, \ldots, \bf{X}_n \}$ where $\bf{X}_1 \in \mathbb{R}^d$ from some distribution $\mu$ over $\mathbb{R}^d$ with density function $f$, approximates the `Renyi`

differential entropy

\[h_q^R(\bf{X}) = \dfrac{1}{1-q} \log \int_{\mathbb{R}^d} f^q(\bf{x}) d\bf{x}\]

using the estimator

\[\hat{H}_q^R(\bf{X_{1:n}}) = \dfrac{1}{1-q}\log \left( \dfrac{L_p(\bf{X}_{1:n})}{\gamma n^{1-p/d}} \right),\]

where $L_p(\bf{X}_{1:n})$ is the sum of the `p`

-th powers of the Euclidean lengths of the edges of the nearest neighbor graph over $\bf{X}_{1:n}$ (see their paper for details).

The constant $\gamma$ is determined by the limit given in equation 4 in Pál et al. (2010), and is approximated on `n`

randomly generated points from the `d`

-dimensional unit cube, as they describe in the end of section 3 of their paper.

`CausalityTools.ParametricCopula`

— Type```
ParametricCopula <: MutualInformationEstimator
ParametricCopula(d = Normal())
```

A parametric copula-based mutual information estimator.

Robin et al. (2016). A statistical framework for neuroimaging data analysis based on mutual information estimated via a Gaussian copula.

`CausalityTools.PartialCorrelation`

— Type`PartialCorrelation <: AssociationMeasure`

The correlation of two variables, with the effect of a set of conditioning variables removed.

**Usage**

- Use with
`independence`

to perform a formal hypothesis test for conditional dependence. - Use with
`partial_correlation`

to compute the raw correlation coefficient.

**Description**

There are several ways of estimating the partial correlation. We follow the matrix inversion method, because for `StateSpaceSet`

s, we can very efficiently compute the required joint covariance matrix $\Sigma$ for the random variables.

Formally, let $X_1, X_2, \ldots, X_n$ be a set of $n$ real-valued random variables. Consider the joint precision matrix $P = (p_{ij}) = \Sigma^{-1}$. The partial correlation of any pair of variables $(X_i, X_j)$, given the remaining variables $\bf{Z} = \{X_k\}_{k=1, k \neq i, j}^n$, is defined as

\[\rho_{X_i X_j | \bf{Z}} = -\dfrac{p_{ij}}{\sqrt{ p_{ii} p_{jj} }}\]

In practice, we compute the estimate

\[\hat{\rho}_{X_i X_j | \bf{Z}} = -\dfrac{\hat{p}_{ij}}{\sqrt{ \hat{p}_{ii} \hat{p}_{jj} }},\]

where $\hat{P} = \hat{\Sigma}^{-1}$ is the sample precision matrix.
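
A minimal sketch (assuming `partial_correlation(measure, x, y, z)` is the calling convention implied by the **Usage** section):

```
using CausalityTools

z = randn(2000)
x = z .+ 0.5 .* randn(2000)  # x and y are correlated only through z
y = z .+ 0.5 .* randn(2000)
partial_correlation(PartialCorrelation(), x, y, z)  # ≈ 0 after conditioning on z
```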

`CausalityTools.Parzen`

— Type```
Parzen <: MultivariateKernel
Parzen(h)
```

The Parzen kernel. For a given `d`

-dimensional StateSpaceSet `x`

and a query point `xᵢ`

, it computes the number of points within a hypercube of radius `h`

centered on `xᵢ`

.

To use it, do `p = Parzen(0.2); p(x, xᵢ)`

.

`CausalityTools.PearsonCorrelation`

— Type`PearsonCorrelation`

The Pearson correlation of two variables.

**Usage**

- Use with
`independence`

to perform a formal hypothesis test for pairwise dependence. - Use with
`pearson_correlation`

to compute the raw correlation coefficient.

**Description**

The sample Pearson correlation coefficient for real-valued random variables $X$ and $Y$ with associated samples $\{x_i\}_{i=1}^N$ and $\{y_i\}_{i=1}^N$ is defined as

\[\rho_{xy} = \dfrac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y}) }{\sqrt{\sum_{i=1}^N (x_i - \bar{x})^2}\sqrt{\sum_{i=1}^N (y_i - \bar{y})^2}},\]

where $\bar{x}$ and $\bar{y}$ are the means of the observations $x_k$ and $y_k$, respectively.
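
A minimal sketch (assuming `pearson_correlation(measure, x, y)` is the calling convention implied by the **Usage** section):

```
using CausalityTools

x = randn(1000)
y = 0.7 .* x .+ 0.3 .* randn(1000)
pearson_correlation(PearsonCorrelation(), x, y)  # strongly positive for this toy model
```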

`CausalityTools.Pecuzal`

— Type`Pecuzal(; dmax = 3, τmax = 30, w = 0)`

Used with `optimize_embedding`

to find the optimal embedding lags using the Pecuzal algorithm. `w`

is the Theiler window. `dmax`

is the maximum dimension and we query lags `1:τmax`

.

`CausalityTools.Peguin2`

— Type```
Peguin2 <: DiscreteDefinition
Peguin2(; xi = [0.5, 0.4], σ₁ = 0.1, σ₂ = 0.1,
p₁ = 0.7, p₂ = 0.1, p₃ = 0.4, p₄ = 2.4, p₅ = 0.9, p₆ = 4)
```

A 2D discrete autoregressive system with nonlinear, nontrivial coupling. The system is from Péguin-Feissolle & Teräsvirta (1999)[^Péguin-Feissolle1999], and was also studied in Chávez et al. (2003)^{[Chávez2003]}.

**Description**

The system is defined by the equations

\[\begin{align*} x(t+1) &= p_2 + p_3 x(t-2) + c_{yx}\dfrac{p_4 - p_5 y(t-3)}{1 + e^{-p_6 y(t-3)}} + \xi_1(t) \\ y(t+1) &= p_1 y(t) + \xi_2(t). \end{align*}\]

Here, $\xi_{1,2}(t)$ are two independent normally distributed noise processes with zero mean and standard deviations $\sigma_1$ and $\sigma_2$. The $\xi_{1,2}(t)$ terms represent dynamical noise. The parameters of the original system are here tweakable.

[^Péguin-Feissolle1999]: Péguin-Feissolle, A., & Teräsvirta, T. (1999). A General Framework for Testing the Granger Noncausality Hypothesis. Universites d’Aix-Marseille II et III. https://www.amse-aixmarseille.fr/sites/default/files/_dt/greqam/99a42.pdf

`CausalityTools.Phase`

— Type`Phase <: InstantaneousSignalProperty`

Indicates that the instantaneous phases of a signal should be used.

`CausalityTools.PoczosSchneiderCMI`

— Type```
PoczosSchneiderCMI <: ConditionalMutualInformationEstimator
PoczosSchneiderCMI(k = 1, w = 0)
```

The `PoczosSchneiderCMI`

estimator computes various (differential) conditional mutual informations, using a `k`

-th nearest neighbor approach (Póczos & Schneider, 2012)^{[Póczos2012]}.

`CausalityTools.PredictiveAsymmetry`

— Type```
PredictiveAsymmetry <: AssociationMeasure
PredictiveAsymmetry(ηs = 1:15; normalize = false, f = 1.0,
dTf = 1, dT = 1, dS = 1, τT = -1, τS = -1, base = 2)
```

The predictive asymmetry measure Haaga2020.

This is a method that does not yet appear in a peer-reviewed scientific journal. Feel free to use, but consider it experimental for now.

`CausalityTools.PredictiveAsymmetryTest`

— Type```
PredictiveAsymmetryTest <: IndependenceTest
PredictiveAsymmetryTest(measure::PredictiveAsymmetryTest, est; f = 1.0)
```

An independence test based on the `PredictiveAsymmetry`

directional association measure. When used with `independence`

, a p-value is returned.

`CausalityTools.RMCD`

— Type```
RMCD <: AssociationMeasure
RMCD(; r, metric = Euclidean(), base = 2)
```

The recurrence measure of conditional dependence, or RMCD Ramos2017, is a recurrence-based measure that mimics the conditional mutual information, but uses recurrence probabilities.

`r`

is a mandatory keyword which specifies the recurrence threshold when constructing recurrence matrices. It can be instance of any subtype of `AbstractRecurrenceType`

from RecurrenceAnalysis.jl. To use any `r`

that is not a real number, you have to do `using RecurrenceAnalysis`

first. The `metric`

is any valid metric from Distances.jl.

Both the pairwise and conditional RMCD are non-negative, but due to round-off error, negative values may occur. If that happens, an RMCD value of `0.0`

is returned.

**Usage**

- Use with
`independence`

to perform a formal hypothesis test for pairwise or conditional association. - Use with
`rmcd`

to compute the raw RMCD for pairwise or conditional association.

**Description**

The RMCD measure is defined by

\[I_{RMCD}(X; Y | Z) = \dfrac{1}{N} \sum_{i} \left[ \dfrac{1}{N} \sum_{j} R_{ij}^{X, Y, Z} \log \left( \dfrac{\sum_{j} R_{ij}^{X, Y, Z} \sum_{j} R_{ij}^{Z} }{\sum_{j} R_{ij}^{X, Z} \sum_{j} R_{ij}^{Y, Z}} \right) \right],\]

where `base`

controls the base of the logarithm. $I_{RMCD}(X; Y | Z)$ is zero when $Z = X$, $Z = Y$ or when $X$, $Y$ and $Z$ are mutually independent.

Our implementation allows dropping the third/last argument, in which case the following mutual information-like quantity is computed (not discussed in Ramos2017).

\[ I_{RMCD}(X; Y) = \dfrac{1}{N} \sum_{i} \left[ \dfrac{1}{N} \sum_{j} R_{ij}^{X, Y} \log \left( \dfrac{\sum_{j} R_{ij}^{X} R_{ij}^{Y} }{\sum_{j} R_{ij}^{X, Y}} \right) \right]\]

`CausalityTools.Rahimzamani`

— Type```
Rahimzamani <: ConditionalMutualInformationEstimator
Rahimzamani(k = 1, w = 0)
```

The `Rahimzamani`

estimator, short for Rahimzamani-Asnani-Viswanath-Kannan, estimates Shannon conditional mutual information for data that can be mixtures of discrete and continuous variables Rahimzamani2018.

This is very similar to the `GaoKannanOhViswanath`

mutual information estimator, but has been expanded to the conditional case.

`CausalityTools.RandomSegment`

— Type```
RandomSegment <: CrossmapEstimator
RandomSegment(; libsizes::Int, rng = Random.default_rng())
```

Indicates that cross mapping is performed on contiguous time series segments/windows of length `L`

with a randomly selected starting point.

This is method 2 from Luo2015.

`CausalityTools.RandomVectors`

— Type```
RandomVectors <: CrossmapEstimator
RandomVectors(; libsizes, replace = false, rng = Random.default_rng())
```

Cross-map over `N`

different libraries, where `N = length(libsizes)`

, and the `i`

-th library has cardinality `k = libsizes[i]`

. Points within each library are randomly selected, independently of other libraries, and `replace`

controls whether or not to sample with replacement. A user-specified `rng`

may be provided for reproducibility.

This is method 3 from Luo2015.

See also: `CrossmapEstimator`

.

`CausalityTools.Repressilator6`

— Type```
Repressilator6 <: ContinuousDefinition
Repressilator6(; xi = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6], α = 10.0, α₀ = 0.0, β = 100.0,
n = 2) → ContinuousDynamicalSystem
```

A six-dimensional repressilator (or repression-driven oscillator) from Elowitz2000.

Used in Sun2014 to study the performance of the causation entropy algorithm.

**Description**

\[\begin{align*} \dfrac{dm_1}{dt} &= -m1 + \dfrac{\alpha}{1 + p_3^n} + \alpha_0 \\ \dfrac{dm_2}{dt} &= -m2 + \dfrac{\alpha}{1 + p_1^n} + \alpha_0 \\ \dfrac{dm_3}{dt} &= -m3 + \dfrac{\alpha}{1 + p_2^n} + \alpha_0 \\ \dfrac{dp_1}{dt} &= -\beta(p_1 - m_1) \\ \dfrac{dp_2}{dt} &= -\beta(p_2 - m_2) \\ \dfrac{dp_3}{dt} &= -\beta(p_3 - m_3) \\ \end{align*}\]

`CausalityTools.RosslerBidir6`

— Type```
RosslerBidir6 <: ContinuousDefinition
RosslerBidir6(; xi = [0.1, 0.1, 0.2, 0.3, 0.3, 0.4],
a = 0.1, b = 0.1, c = 14.0, ϵ₁ = 0.0, ϵ₂ = 0.0,
ω₁ = 1 + 0.015, ω₂ = 1 - 0.015)
```

A bidirectionally coupled 6D Rossler system from Krakovská et al. (2018)^{[Krakovská2018]}.

**Description**

The system consists of two separate subsystems, each being a 3D Rossler attractor. The subsystems are bidirectionally coupled, influencing each other through variables $x_1$ and $x_2$.

\[\begin{align*} \dfrac{dx_1}{dt} &= \omega_1 (-y_1) - z_1 + c_{21}(x_1 - x_2) \\ \dfrac{dy_1}{dt} &= \omega_1 x_1 + a y_1 \\ \dfrac{dz_1}{dt} &= b + z_1 (x_1 - c) \\ \dfrac{dx_2}{dt} &= \omega_2 (-y_2) - z_2 + c_{12} (x_2 - x_1) \\ \dfrac{dy_2}{dt} &= \omega_2 x_2 + a y_2 \\ \dfrac{dz_2}{dt} &= b + z_2 (x_2 - c) \\ \end{align*}\]

with $c_{12} \geq 0$ and $c_{21} \geq 0$.

`CausalityTools.RosslerForced9`

— Type```
RosslerForced9 <: ContinuousDefinition
RosslerForced9(; xi = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
ω₁ = 1.015, ω₂ = 0.985, ω₃ = 0.95,
c_xy = 0.1, c_yx = 0.1,
c_zx = 0.05, c_zy = 0.05,
a₁ = 0.15, a₂ = 0.2, a₃ = 10,
b₁ = 0.15, b₂ = 0.2, b₃ = 10,
c₁ = 0.15, c₂ = 0.2, c₃ = 10)
```

Equations of motion for a system consisting of three coupled 3D Rössler systems ($X$, $Y$, $Z$), giving a 9D system (Amigó & Hirata, 2018)^{[Amigó2018]}.

**Description**

The dynamics is generated by the following vector field

\[\begin{aligned} \dot{x_1} &= -\omega_1 (x_2 + x_3) + c_{yx}(y_1 - x_1) + c_{zx}(z_1 - x_1) \\ \dot{x_2} &= \omega_1 x_1 + a_1 x_2 \\ \dot{x_3} &= a_2 + x_3 (x_1 - a_3) \\ \dot{y_1} &= -\omega_2 (y_2 + y_3) + c_{xy}(x_1 - y_1) + c_{zy}(z_1 - y_1) \\ \dot{y_2} &= \omega_2 y_1 + b_1 y_2 \\ \dot{y_3} &= b_2 + y_3 (y_1 - b_3) \\ \dot{z_1} &= -\omega_3 (z_2 + z_3) \\ \dot{z_2} &= \omega_3 z_1 + c_1 z_2 \\ \dot{z_3} &= c_2 + z_3 (z_1 - c_3). \end{aligned}\]

The external system $Z$ influences both $X$ and $Y$ (controlled by `c_zx`

and `c_zy`

). Simultaneously, the subsystems $X$ and $Y$ bidirectionally influences each other (controlled by `c_xy`

and `c_yx`

). The $X$ and $Y$ subsystems are mostly synchronized for `c_xy > 0.1`

or `c_yx > 0.1`

.

`CausalityTools.RosslerLorenzUnidir6`

— Type```
RosslerLorenzUnidir6 <: ContinuousDefinition
RosslerLorenzUnidir6(; xi = [0.1, 0.2, 0.3, 0.05, 0.1, 0.15],
a₁ = 6, a₂ = 6, a₃ = 2.0, b₁ = 10, b₂ = 28, b₃ = 8/3, c_xy = 1.0)
```

Initialise a Rössler-Lorenz system consisting of two independent 3D subsystems: one Rössler system and one Lorenz system. They are coupled such that the second component (`x₂`

) of the Rössler system unidirectionally forces the second component (`y₂`

) of the Lorenz system.

The parameter `c_xy`

controls the coupling strength. The implementation here also allows for tuning the parameters of each subsystem by introducing the constants `a₁`

, `a₂`

, `a₃`

, `b₁`

, `b₂`

, `b₃`

. Default values for these parameters are as in [1].

**Equations of motion**

The dynamics is generated by the following vector field

\[\begin{align*} \dot x_1 &= -a_1(x_2 + x_3) \\ \dot x_2 &= a_2(x_1 + a_2x_2) \\ \dot x_3 &= a_1(a_2 + x_3(x_1 - a_3)) \\ \dot y_1 &= b_1(y_2 - y_1) \\ \dot y_2 &= y_1(b_2 - y_3) - y_2 +c_{xy}(x_2)^2 \\ \dot y_3 &= y_1 y_2 - b_3y_3 \end{align*}\]

with the coupling constant $c_{xy} \geq 0$.

`CausalityTools.SMeasure`

— Type```
SMeasure <: AssociationMeasure
SMeasure(; K::Int = 2, dx = 2, dy = 2, τx = -1, τy = -1, w = 0)
```

`SMeasure`

is a bivariate association measure from Arnhold1999 and Quiroga2000 that measures directional dependence between two input (potentially multivariate) time series.

Note that `τx`

and `τy`

are negative; see explanation below.

**Usage**

- Use with
`independence`

to perform a formal hypothesis test for directional dependence. - Use with
`s_measure`

to compute the raw s-measure statistic.

**Description**

The steps of the algorithm are:

- From input time series $x(t)$ and $y(t)$, construct the delay embeddings (note the positive sign in the embedding lags; the input parameters `τx` and `τy` are therefore by convention negative):

\[\begin{align*} \{\bf{x}_i \} &= \{(x_i, x_{i+\tau_x}, \ldots, x_{i+(d_x - 1)\tau_x}) \} \\ \{\bf{y}_i \} &= \{(y_i, y_{i+\tau_y}, \ldots, y_{i+(d_y - 1)\tau_y}) \} \\ \end{align*}\]

- Let $r_{i,j}$ and $s_{i,j}$ be the indices of the `K` nearest neighbors of $\bf{x}_i$ and $\bf{y}_i$, respectively. Neighbors closer than `w` time indices are excluded during searches (i.e. `w` is the Theiler window).
- Compute the mean squared Euclidean distance to the $K$ nearest neighbors for each $x_i$, using the indices $r_{i, j}$:

\[R_i^{(k)}(x) = \dfrac{1}{K} \sum_{j=1}^{K}(\bf{x}_i - \bf{x}_{r_{i,j}})^2\]

- Compute the y-conditioned mean squared Euclidean distance to the $K$ nearest neighbors for each $x_i$, now using the indices $s_{i,j}$:

\[R_i^{(k)}(x|y) = \dfrac{1}{K} \sum_{j=1}^{K}(\bf{x}_i - \bf{x}_{s_{i,j}})^2\]

- Define the following measure of independence, where $0 \leq S \leq 1$, with low values indicating independence and values close to one occurring for synchronized signals:

\[S^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \dfrac{R_i^{(k)}(x)}{R_i^{(k)}(x|y)}\]

**Input data**

The algorithm is slightly modified from Grassberger1999 to allow univariate timeseries as input.

- If
`x`

and`y`

are`StateSpaceSet`

s then use`x`

and`y`

as is and ignore the parameters`dx`

/`τx`

and`dy`

/`τy`

. - If
`x`

and`y`

are scalar time series, then create`dx`

and`dy`

dimensional embeddings, respectively, of both`x`

and`y`

, resulting in`N`

different`m`

-dimensional embedding points $X = \{x_1, x_2, \ldots, x_N \}$ and $Y = \{y_1, y_2, \ldots, y_N \}$.`τx`

and`τy`

control the embedding lags for`x`

and`y`

. - If
`x`

is a scalar-valued vector and`y`

is a`StateSpaceSet`

, or vice versa, then create an embedding of the scalar timeseries using parameters`dx`

/`τx`

or`dy`

/`τy`

.

In all three cases, input StateSpaceSets are length-matched by eliminating points at the end of the longest StateSpaceSet (after the embedding step, if relevant) before analysis.
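
A minimal sketch (assuming `s_measure(measure, x, y)` is the calling convention implied by the **Usage** section; parameters are illustrative):

```
using CausalityTools

x, y = randn(2000), randn(2000)   # independent toy series
m = SMeasure(K = 3, dx = 2, dy = 2, τx = -1, τy = -1)
s_measure(m, x, y)                # expected to be close to 0 for independent inputs
```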

`CausalityTools.Silverman`

— Type```
Silverman <: BandwidthRule
Silverman()
```

A rule of thumb giving the bandwidth $h$ for kernel density estimation of entropy, following the rules outlined in Silverman (1986).

`CausalityTools.SurrogateTest`

— Type```
SurrogateTest <: IndependenceTest
SurrogateTest(measure, [est];
nshuffles::Int = 100,
surrogate = RandomShuffle(),
rng = Random.default_rng(),
show_progress = false,
)
```

A generic (conditional) independence test for assessing whether two variables `X`

and `Y`

are independent, potentially conditioned on a third variable `Z`

, based on surrogate data.

When used with `independence`

, a `SurrogateTestResult`

is returned.

**Description**

This is a generic one-sided hypothesis test that checks whether `x`

and `y`

are independent (given `z`

, if provided) based on resampling from a null distribution assumed to represent independence between the variables. The null distribution is generated by repeatedly shuffling the input data in some way that is intended to break any dependence between the input variables.

There are different ways of shuffling, dictated by `surrogate`

, each representing a distinct null hypothesis. For each shuffle, the provided `measure`

is computed (using `est`

, if relevant). This procedure is repeated `nshuffles`

times, and a test summary is returned. The shuffled variable is always the first variable (`X`

). Exceptions are:

- If the measure is a `TransferEntropy` measure, such as `TEShannon`, then the source variable is always shuffled, and the target and conditional variables are left unshuffled.

**Compatible measures**

Measure | Pairwise | Conditional | Requires `est` |
---|---|---|---|
`PearsonCorrelation` | ✓ | ✖ | No |
`DistanceCorrelation` | ✓ | ✓ | No |
`SMeasure` | ✓ | ✖ | No |
`HMeasure` | ✓ | ✖ | No |
`MMeasure` | ✓ | ✖ | No |
`LMeasure` | ✓ | ✖ | No |
`PairwiseAsymmetricInference` | ✓ | ✖ | Yes |
`ConvergentCrossMapping` | ✓ | ✖ | Yes |
`MIShannon` | ✓ | ✖ | Yes |
`MIRenyiJizba` | ✓ | ✖ | Yes |
`MIRenyiSarbu` | ✓ | ✖ | Yes |
`MITsallisMartin` | ✓ | ✖ | Yes |
`MITsallisFuruichi` | ✓ | ✖ | Yes |
`PartialCorrelation` | ✖ | ✓ | Yes |
`CMIShannon` | ✖ | ✓ | Yes |
`CMIRenyiJizba` | ✖ | ✓ | Yes |
`TEShannon` | ✓ | ✓ | Yes |
`TERenyiJizba` | ✓ | ✓ | Yes |
`PMI` | ✖ | ✓ | Yes |

**Examples**

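A minimal pairwise sketch (all parameter choices are illustrative):

```
using CausalityTools, Random

rng = Xoshiro(1234)
x = randn(rng, 1000)
y = x .+ randn(rng, 1000)                               # y depends on x
test = SurrogateTest(MIShannon(), KSG2(k = 4); nshuffles = 100, rng = rng)
independence(test, x, y)                                # returns a SurrogateTestResult
```
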
`CausalityTools.SurrogateTestResult`

— Type`SurrogateTestResult(m, m_surr, pvalue)`

Holds the result of a `SurrogateTest`

. `m`

is the measure computed on the original data. `m_surr`

is a vector of the measure computed on permuted data, where `m_surr[i]`

is the measure computed on the `i`

-th permutation. `pvalue`

is the one-sided `p`

-value for the test.

`CausalityTools.SymbolicTransferEntropy`

— Type```
SymbolicTransferEntropy <: TransferEntropyEstimator
SymbolicTransferEntropy(; m = 3, τ = 1, lt = ComplexityMeasures.isless_rand)
```

A convenience estimator for symbolic transfer entropy Staniek2008.

**Description**

Symbolic transfer entropy consists of two simple steps. First, the input time series are embedded with embedding dimension `m`

and delay `τ`

. The ordinal patterns of the embedding vectors are then encoded using `SymbolicPermutation`

with `marginal_encodings`

. This transforms the input time series into integer time series using `OrdinalPatternEncoding`

.

Transfer entropy is then estimated as usual on the encoded timeseries with `transferentropy`

and the `CountOccurrences`

naive frequency estimator.

`CausalityTools.SystemDefinition`

— Type`SystemDefinition`

The abstract type of all system definitions. Abstract subtypes are `DiscreteDefinition`

and `ContinuousDefinition`

.

`CausalityTools.TERenyiJizba`

— Type`TERenyiJizba() <: TransferEntropy`

The Rényi transfer entropy from Jizba2012.

**Usage**

- Use with
`independence`

to perform a formal hypothesis test for pairwise and conditional dependence. - Use with
`transferentropy`

to compute the raw transfer entropy.

**Description**

The transfer entropy from source $S$ to target $T$, potentially conditioned on $C$ is defined as

\[\begin{align*} TE(S \to T) &:= I_q^{R_J}(T^+; S^- | T^-) \\ TE(S \to T | C) &:= I_q^{R_J}(T^+; S^- | T^-, C^-), \end{align*}\]

where $I_q^{R_J}(T^+; S^- | T^-)$ is Jizba et al. (2012)'s definition of conditional mutual information (`CMIRenyiJizba`

). The variables $T^+$, $T^-$, $S^-$ and $C^-$ are described in the docstring for `transferentropy`

.

**Compatible estimators**

Jizba's formulation of Renyi-type transfer entropy can currently be estimated using selected probabilities estimators and differential entropy estimators, which under the hood compute the transfer entropy as Jizba's formulation of Rényi conditional mutual information.

Estimator | Type | Principle | `TERenyiJizba` |
---|---|---|---|

`CountOccurrences` | `ProbabilitiesEstimator` | Frequencies | ✓ |

`ValueHistogram` | `ProbabilitiesEstimator` | Binning (histogram) | ✓ |

`LeonenkoProzantoSavani` | `DifferentialEntropyEstimator` | Nearest neighbors | ✓ |

`CausalityTools.TEShannon`

— Type```
TEShannon <: TransferEntropy
TEShannon(; base = 2, embedding = EmbeddingTE())
```

The Shannon-type transfer entropy measure.

**Usage**

- Use with
`independence`

to perform a formal hypothesis test for pairwise and conditional dependence. - Use with
`transferentropy`

to compute the raw transfer entropy.

**Description**

The transfer entropy from source $S$ to target $T$, potentially conditioned on $C$ is defined as

\[\begin{align*} TE(S \to T) &:= I^S(T^+; S^- | T^-) \\ TE(S \to T | C) &:= I^S(T^+; S^- | T^-, C^-) \end{align*}\]

where $I(T^+; S^- | T^-)$ is the Shannon conditional mutual information (`CMIShannon`

). The variables $T^+$, $T^-$, $S^-$ and $C^-$ are described in the docstring for `transferentropy`

.

**Compatible estimators**

Shannon-type transfer entropy can be estimated using a range of different estimators, which all boil down to computing conditional mutual information, except for `TransferEntropyEstimator`

s, which compute transfer entropy using a direct method.

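A minimal sketch using the `SymbolicTransferEntropy` convenience estimator documented on this page (assuming `transferentropy(measure, est, source, target)` is the calling convention implied by the **Usage** section; the data are illustrative):

```
using CausalityTools

x = randn(1000)
y = 0.5 .* circshift(x, 1) .+ 0.5 .* randn(1000)   # y driven by lagged x
transferentropy(TEShannon(base = 2), SymbolicTransferEntropy(m = 3), x, y)
```
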
`CausalityTools.TEVars`

— Type```
TEVars(Tf::Vector{Int}, T::Vector{Int}, S::Vector{Int})
TEVars(Tf::Vector{Int}, T::Vector{Int}, S::Vector{Int}, C::Vector{Int})
TEVars(;Tf = Int[], T = Int[], S = Int[], C = Int[]) -> TEVars
```

Which axes of the state space correspond to the future of the target (`Tf`

), the present/past of the target (`T`

), the present/past of the source (`S`

), and the present/past of any conditioned variables (`C`

)? This information is used by the transfer entropy estimators to ensure that marginal distributions are computed correctly.

Indices correspond to variables of the embedding, or, equivalently, columns of a `StateSpaceSet`

.

- The three-argument constructor assumes there will be no conditional variables.
- The four-argument constructor assumes there will be conditional variables.
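
For example, a sketch of how the variable assignment might look for a 1-dimensional future of the target, a 2-dimensional past of the target, and a 1-dimensional past of the source (the column indices are illustrative):

```
# column 1 → future of target, columns 2-3 → past of target, column 4 → past of source
vars = TEVars(Tf = [1], T = [2, 3], S = [4])
```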

`CausalityTools.Thomas3`

— Type```
Thomas3 <: ContinuousDefinition
Thomas3(; xi = [0.11, 0.09, 0.10], b = 0.20)
```

Thomas' cyclically symmetric attractor is a continuous dynamical system with three variables. It has a single free parameter `b`

, for which interesting behaviour occurs when `b ∈ (0, 1)`

. In particular, the system is chaotic whenever `b < 0.20`

.

**Definition**

\[\begin{align*} \dfrac{dx}{dt} &= \sin(y) - bx \\ \dfrac{dy}{dt} &= \sin(z) - by \\ \dfrac{dz}{dt} &= \sin(x) - bz \end{align*}\]

`CausalityTools.TransferEntropy`

— Type`TransferEntropy <: AssociationMeasure`

The supertype of all transfer entropy measures. Concrete subtypes are `TEShannon` and `TERenyiJizba`.

`CausalityTools.TransferEntropyEstimator`

— TypeThe supertype of all dedicated transfer entropy estimators.

`CausalityTools.UlamLattice`

— Type```
UlamLattice <: DiscreteDefinition
UlamLattice(; D::Int = 10, ui = sin.(1:10), ε::Real = 0.10)
```

A lattice of `D`

unidirectionally coupled Ulam maps Schreiber2000, defined as

\[x^{m}_{t+1} = f(\epsilon x^{m-1}_{t} + (1 - \epsilon) x_{t}^{m}),\]

where $m = 1, 2, \ldots, D$ and $f(x) = 2 - x^2$. In this system, information transfer happens only in the direction of increasing $m$.

`CausalityTools.Var1`

— Type```
Var1 <: DiscreteDefinition
Var1(; xi = [0.5, 0.5, 0.5],
a = 0.5, θ = Normal(0, 1.0), η = Normal(0, 0.2), ϵ = Normal(0, 0.3),
rng = Random.default_rng())
```

A discrete vector autoregressive system where X₁ → X₂ → X₃.

`CausalityTools.Verdes3`

— Type```
Verdes3 <: DiscreteDefinition
Verdes3(; ui = [0.1, 0.15, 0.2], ωy = 315, ωz = 80, σx = 0.0, σy = 0.0, σz = 0.0)
```

A 3D system where the response X is a highly nonlinear combination of Y and Z Verdes2005. The forcings Y and Z involve sines and cosines, and have different periods, which are controlled by `ωy`

and `ωz`

.

\[\begin{align*} x(t+1) &= \dfrac{y(t)(18y(t) - 27y(t)^2 + 10)}{2} + z(t)(1-z(t)) + \eta_x \\ y(t+1) &= \dfrac{1 - \cos\left(\dfrac{2\pi}{\omega_y} t\right)}{2} + \eta_y \\ z(t+1) &= \dfrac{1 - \sin\left(\dfrac{2\pi}{\omega_z} t\right)}{2} + \eta_z \end{align*}\]

where $\eta_x$, $\eta_y$ and $\eta_z$ are Gaussian noise terms with mean 0 and standard deviations `σx`, `σy` and `σz`, respectively.

`CausalityTools.Zhu1`

— Type```
Zhu1 <: TransferEntropyEstimator
Zhu1(k = 1, w = 0, base = MathConstants.e)
```

The `Zhu1`

transfer entropy estimator Zhu2015.

Assumes that the input data have been normalized as described in Zhu2015. The estimator can be used both for pairwise and conditional transfer entropy.

**Description**

This estimator approximates probabilities within hyperrectangles surrounding each point `xᵢ ∈ x`

using `k`

nearest neighbor searches. However, it also considers the number of neighbors falling on the borders of these hyperrectangles. This estimator is an extension to the entropy estimator in Singh2003.

`w`

is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to `0`

, meaning that only the point itself is excluded when searching for neighbours).

`CausalityTools._convert_logunit`

— Method`_convert_logunit(h_a::Real, a, b) → h_b`

Convert a number `h_a`

computed with logarithms to base `a`

to an entropy `h_b`

computed with logarithms to base `b`

. This can be used to convert the "unit" of an entropy.
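
As a quick illustration of the rule (a sketch of the underlying change-of-base identity, not the internal implementation):

```
# An entropy computed with base-a logarithms converts to base b by
# multiplying with log_b(a) (change-of-base formula).
h_nats = 1.0                 # entropy in nats (base ℯ)
h_bits = h_nats * log(2, ℯ)  # ≈ 1.4427 bits
```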

`CausalityTools.anishchenko1`

— Method`anishchenko1(;u₀ = rand(2), α =3.277, s=0.1, ω=0.5*(sqrt(5)-1)) → DiscreteDynamicalSystem`

Initialise the system defined by eq. 13 in Anishchenko1998, which can give strange, nonchaotic attractors.

**Equations of motion**

\[\begin{aligned} dx &= \alpha (1-s \cos (2 \pi \phi )) \cdot x(1-x) \\ dϕ &= (\phi + \omega ) \mod{1} \end{aligned}\]

`CausalityTools.ar1_bidir`

— Method```
ar1_bidir(;u₀ = rand(2), a₁ = 0.5, b₁ = 0.7, c_xy = 0.1, c_yx = 0.2,
σx = 0.3, σy = 0.3) → DiscreteDynamicalSystem
```

A system consisting of two mutually coupled first order autoregressive processes.

**Equations of motion**

\[\begin{aligned} x(t+1) &= a_{1}x + c_{yx}y + \epsilon_{x} \\ y(t+1) &= b_{1}y + c_{xy}x + \epsilon_{y} \end{aligned}\]

where at each time step, $\epsilon_{x}$ and $\epsilon_{y}$ are drawn from independent normal distributions with zero mean and standard deviations `σx`

and `σy`

, respectively.

`CausalityTools.ar1_unidir`

— Method```
ar1_unidir(u₀, a₁ = 0.90693, b₁ = 0.40693, c_xy = 0.5,
σ = 0.40662) → DiscreteDynamicalSystem
```

A bivariate, order one autoregressive model, where $x \to y$ (Paluš et al, 2018)^{[Paluš2018]}.

**Equations of motion**

\[\begin{aligned} x(t+1) &= a_1 x(t) + \xi_{1} \\ y(t+1) &= b_1 y(t) - c_{xy} x + \xi_{2}, \end{aligned}\]

where $\xi_{1}$ and $\xi_{2}$ are drawn from normal distributions with zero mean and standard deviation `σ`

at each iteration.

`CausalityTools.asymmetry`

— Method`CausalityTools.backwards_eliminate!`

— Method`backwards_eliminate!(alg::OCE, parents::OCESelectedParents, x, i; verbose)`

Algorithm 2.2 in Sun et al. (2015). Perform backward elimination for the `i`

-th variable in `x`

, given the previously inferred `parents`

, which were deduced using parameters in `alg`

. Modifies `parents`

in-place.

`CausalityTools.bandwidth`

— Function`CausalityTools.condmutualinfo`

— Method`condmutualinfo([measure::CMI], est::CMIEstimator, x, y, z) → cmi::Real`

Estimate a conditional mutual information (CMI) of some kind (specified by `measure`

), between `x`

and `y`

, given `z`

, using the given dedicated `ConditionalMutualInformationEstimator`

, which may be discrete, continuous or mixed.

**Estimators**

Estimator | Principle | `CMIShannon` | `CMIRenyiPoczos` |
---|---|---|---|
`FPVP` | Nearest neighbors | ✓ | x |
`MesnerShalizi` | Nearest neighbors | ✓ | x |
`Rahimzamani` | Nearest neighbors | ✓ | x |
`PoczosSchneiderCMI` | Nearest neighbors | x | ✓ |
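
For illustration, a minimal sketch (the estimator parameters are assumptions for the example):

```
using CausalityTools
# Shannon CMI between x and y given z, using the FPVP nearest-neighbor estimator.
x, y, z = rand(1000), rand(1000), rand(1000)
condmutualinfo(CMIShannon(), FPVP(k = 5), x, y, z)
```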

`CausalityTools.condmutualinfo`

— Method`condmutualinfo([measure::CMI], est::DifferentialEntropyEstimator, x, y, z) → cmi::Real`

Estimate the mutual information between `x`

and `y`

conditioned on `z`

, using the differential version of the given conditional mutual information (CMI) `measure`

. The `DifferentialEntropyEstimator`

`est`

must support multivariate data. No bias correction is performed. If `measure`

is not given, then the default is `CMIShannon()`

.

`DifferentialEntropyEstimator`

s have their own `base`

field which is not used here. Instead, this method creates a copy of `est`

internally, where `est.base`

is replaced by `measure.e.base`

. Therefore, use `measure`

to control the "unit" of the mutual information.

**Estimators**

Estimator | Principle | `CMIShannon` |
---|---|---|
`Kraskov` | Nearest neighbors | ✓ |
`Zhu` | Nearest neighbors | ✓ |
`Gao` | Nearest neighbors | ✓ |
`Goria` | Nearest neighbors | ✓ |
`Lord` | Nearest neighbors | ✓ |
`LeonenkoProzantoSavani` | Nearest neighbors | ✓ |

`CausalityTools.condmutualinfo`

— Method`condmutualinfo([measure::CMI], est::MutualInformationEstimator, x, y, z) → cmi::Real`

Estimate the mutual information between `x`

and `y`

conditioned on `z`

, using the given conditional mutual information (CMI) `measure`

, computed as a difference of mutual information terms (via the chain rule of mutual information)

\[\hat{I}(X; Y | Z) = \hat{I}(X; Y, Z) - \hat{I}(X; Z).\]

The `MutualInformationEstimator`

`est`

may be continuous/differential, discrete or mixed. No bias correction is performed, except the bias correction that occurs for each individual mutual information term. If `measure`

is not given, then the default is `CMIShannon()`

.

**Estimators**

Estimator | Type | Principle | `CMIShannon` |
---|---|---|---|
`KraskovStögbauerGrassberger1` | Continuous | Nearest neighbors | ✓ |
`KraskovStögbauerGrassberger2` | Continuous | Nearest neighbors | ✓ |
`GaoKannanOhViswanath` | Mixed | Nearest neighbors | ✓ |
`GaoOhViswanath` | Continuous | Nearest neighbors | ✓ |

`CausalityTools.condmutualinfo`

— Method`condmutualinfo([measure::CMI], est::ProbabilitiesEstimator, x, y, z) → cmi::Real ∈ [0, a)`

Estimate the conditional mutual information (CMI) `measure`

between `x`

and `y`

given `z`

using a sum of entropy terms, without any bias correction, using the provided `ProbabilitiesEstimator`

`est`

. If `measure`

is not given, then the default is `CMIShannon()`

.

With a `ProbabilitiesEstimator`

, the returned `cmi`

is guaranteed to be non-negative.

**Estimators**

Estimator | Principle | `CMIShannon` | `CMIRenyiSarbu` |
---|---|---|---|
`CountOccurrences` | Frequencies | ✓ | ✓ |
`ValueHistogram` | Binning (histogram) | ✓ | ✓ |
`SymbolicPermutation` | Ordinal patterns | ✓ | ✓ |
`Dispersion` | Dispersion patterns | ✓ | ✓ |

`CausalityTools.contingency_matrix`

— Function```
contingency_matrix(x, y, [z, ...]) → c::ContingencyMatrix
contingency_matrix(est::ProbabilitiesEstimator, x, y, [z, ...]) → c::ContingencyMatrix
```

Estimate a multidimensional contingency matrix `c`

from input data `x, y, …`

, where the input data can be of any and different types, as long as `length(x) == length(y) == …`

.

For already discretized data, use the first method. For continuous data, you want to discretize the data before computing the contingency table. You can do this manually and then use the first method. Alternatively, you can provide a `ProbabilitiesEstimator`

as the first argument to the constructor. Then the input variables `x, y, …`

are discretized *separately* according to `est`

(*enforcing the same outcome space for all variables*), by calling `marginal_encodings`

.
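
A minimal sketch (the binning choice is an assumption for the example):

```
using CausalityTools
# Discretize two continuous variables with the same outcome space, then
# estimate their joint contingency matrix.
x, y = rand(1000), rand(1000)
c = contingency_matrix(ValueHistogram(RectangularBinning(4)), x, y)
```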

`CausalityTools.cpdag`

— Method`cpdag(alg::PC, skeleton::SimpleDiGraph, separating_sets::Dict{Edge, Vector{Int}}) → dg::SimpleDiGraph`

Orient edges in the `skeleton`

graph using the given `separating_sets`

according to algorithm 2 in Kalisch2008, and return the resulting directed graph `dg`

.

**Description**

First, a directed graph `dg`

is constructed from `skeleton`

, such that every undirected edge `X - Y`

in `skeleton`

is replaced by the bidirectional edge `X ↔ Y`

in `dg`

. In practice, for each `X ↔ Y`

, we construct two directional edges `X → Y`

and `Y → X`

.

Orientation rules 0-3 are then applied to `dg`. We use the rules as stated in Colombo & Maathuis, 2014.

- Rule 0 (orients v-structures): `X ↔ Y ↔ Z` becomes `X → Y ← Z` if `Y` is not in the separating set `S(X, Z)`.
- Rule 1 (prevents new v-structures): `X → Y ↔ Z` becomes `X → Y → Z` if `X` and `Z` are not adjacent.
- Rule 2 (avoids cycles): `X → Y → Z ↔ X` becomes `X → Y → Z ← X`.
- Rule 3 (avoids cycles and new v-structures): if `X ↔ Y → Z`, `X ↔ W → Z` and `X ↔ Z`, where `Y` and `W` are not adjacent, then `X ↔ Z` becomes `X → Z`.

`CausalityTools.crossmap`

— Function```
crossmap(measure::CrossmapMeasure, t::AbstractVector, s::AbstractVector) → ρ::Real
crossmap(measure::CrossmapMeasure, est, t::AbstractVector, s::AbstractVector) → ρ::Vector
crossmap(measure::CrossmapMeasure, t̄::AbstractVector, S̄::AbstractStateSpaceSet) → ρ
```

Compute the cross map estimates between raw time series `t`

and `s`

(and return the real-valued cross-map statistic `ρ`

). If a `CrossmapEstimator`

`est`

is provided, cross mapping is done on random subsamples of the data, where subsampling is dictated by `est`

(a vector of values for `ρ`

is returned).

Alternatively, cross-map between time-aligned time series `t̄`

and source embedding `S̄`

that have been constructed by jointly pre-embedding some input data.

This is a wrapper around `predict` that returns only the correspondence measure between the source and the target.
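
For illustration, a minimal sketch (the measure and estimator parameters are assumptions for the example):

```
using CausalityTools
# Cross map from s to t with convergent cross mapping, subsampling random
# library vectors of increasing size.
t, s = rand(300), rand(300)
measure = ConvergentCrossMapping(d = 2, τ = -1)
ρs = crossmap(measure, RandomVectors(libsizes = 50:50:250), t, s)
```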

`CausalityTools.crossmap`

— Method```
crossmap(x, y, d, τ; r = 0, correspondence_measure = Statistics.cor) → Float64
crossmap(x, y, d, τ, bootstrap_method::Symbol; r = 0, correspondence_measure = Statistics.cor,
method = :segment, L = ceil(Int, (length(x)-d*τ)*0.2), nreps = 100) → Vector{Float64}
```

This syntax is deprecated. It will continue to work for CausalityTools v1.X, but will be removed in CausalityTools v2. See here for updated syntax.

Compute the cross mapping Sugihara2012 between `x`

and `y`

, which is the correspondence (computed using `correspondence_measure`

) between the values $y(t)$ and the cross-map estimated values $ỹ(t) | M_x$.

Returns the correspondence between original and cross mapped values (the default is `ρ = correspondence_measure(y(t), ỹ(t) | M_x)`

).

Here, $y(t)$ are the raw values of the time series `y`

, and $ỹ(t)$ are the predicted values computed from the out-of-sample embedding $M_X$ constructed from the time series `x`

with embedding dimension `d`

and embedding lag `τ`

.

The Theiler window `r`

indicates how many temporal neighbors of the predictee are to be excluded during the nearest neighbors search (the default `r = 0`

excludes only the predictee itself, while `r = 2`

excludes the point itself plus its two nearest neighbors in time).

If `bootstrap_method`

is specified, then `nreps`

different bootstrapped estimates of `correspondence_measure(y(t), ỹ(t) | M_x)`

are returned. The following bootstrap methods are available:

- `bootstrap_method = :random` selects training sets of length `L` consisting of randomly selected points from the embedding $M_x$ (time ordering does not matter). This is method 3 from Luo2015, which critiqued the original Sugihara2012 methodology.
- `bootstrap_method = :segment` selects training sets consisting of time-contiguous segments (each of length `L`) of embedding vectors in $M_x$ (time ordering matters). This is method 2 from Luo2015.

`CausalityTools.densities_at_points`

— Method`densities_at_points(kernel::MultivariateKernel, x::AbstractStateSpaceSet, bandwidth)`

Compute the local densities at each point `xᵢ ∈ x`

using the given multivariate `kernel`

and `bandwidth`

.

`CausalityTools.distance_correlation`

— Method```
distance_correlation(x, y) → dcor ∈ [0, 1]
distance_correlation(x, y, z) → pdcor
```

Compute the empirical/sample distance correlation (Székely et al., 2007)^{[Székely2007]}, here called `dcor`

, between StateSpaceSets `x`

and `y`

. Alternatively, compute the partial distance correlation `pdcor`

(Székely and Rizzo, 2014)^{[Székely2014]}.

See also: `DistanceCorrelation`

.

`CausalityTools.distance_covariance`

— Method```
distance_covariance(x, y) → dcov::Real
distance_covariance(x, y, z) → pdcov::Real
```

Compute the empirical/sample distance covariance (Székely et al., 2007)^{[Székely2007]} between StateSpaceSets `x`

and `y`

. Alternatively, compute the partial distance covariance `pdcov`

.

`CausalityTools.distance_variance`

— Method`distance_variance(x) → dvar::Real`

Compute the empirical/sample distance variance (Székely et al., 2007)^{[Székely2007]} for StateSpaceSet `x`

.

`CausalityTools.eliminate_loop!`

— Method`eliminate_loop!(alg::OCE, parents::OCESelectedParents, xᵢ; verbose = false)`

Inner portion of algorithm 2.2 in Sun et al. (2015). This method is called in an external while-loop that handles the variable elimination step in their line 3.

`CausalityTools.empcdf`

— Method```
empcdf(x::AbstractVector{<:Real}) → x̄::Vector
empcdf(x::AbstractStateSpaceSet) → x̄::StateSpaceSet
```

Rank each sample `xᵢ ∈ x`

, rescale the rank to the interval `[0, 1]`

and return the rescaled ranks `x̄`

.

`CausalityTools.empirical_copula_transformation`

— Method```
empirical_copula_transformation(x::AbstractVector) → empirical_copula::Vector{<:Real}
empirical_copula_transformation(x::AbstractStateSpaceSet{D, T}) → empirical_copula::StateSpaceSet{D, T}
```

Apply the empirical copula transformation (as described in Pal2010; see a summary below) to each point `xᵢ ∈ x`

, where the `xᵢ`

can be either univariate (`x::AbstractVector`

) or multivariate (`x::AbstractStateSpaceSet`

) to compute the empirical copula (here called `empirical_copula`

.

**Description**

**Empirical copula transformation**

Assume we have a length-`n`

sample of data points $\bf{X}_{1:n} = \{\bf{X}_1, \bf{X}_2, \ldots, \bf{X}_n \}$ where $\bf{X}_i \in \mathbb{R}^d$, which is assumed sampled from some distribution $\mu$ with density function $f$. Let $X_i^j \in \mathbb{R}$ denote the j-th coordinate of $\bf{X}_i$. Assume these points are represented as the `d`

-dimensional `StateSpaceSet`

which we call `S`

(indexed like a matrix where rows are samples).

The *empirical* cumulative distribution function (CDF) for the j-th column of `S`

, based on the sample $\bf{X}_{1:n}$, is defined as

\[\hat{F}_j(y) = \dfrac{\left| \{ 1 \leq i \leq n : X^j_i \leq y \} \right|}{n},\]

for any input value $y \in \mathbb{R}$ (which is in general completely unrelated to the j-th column of our sample points). Given the samples $\bf{X}_{1:n}$, we can also define a "multivariate empirical CDF" for `S`

, $\bf{\hat{F}} : \mathbb{R}^d \to [0, 1]^d$, as

\[\hat{\bf{F}}(\bf{y}) = (\hat{F}_1(y^1), \hat{F}_2(y^2), \ldots, \hat{F}_d(y^d)),\]

for any point $\bf{y} \in \mathbb{R}^d$ (which is in general completely unrelated to our sample points, except sharing the property of being `d`

-dimensional). Think of this as checking, for each coordinate $y^j \in \bf{y}$, how this coordinate ranks among values in `S[:, j]`

. The map $\hat{\bf{F}}$ is called the *empirical copula transformation*.

Sidenote: Given only our sample, we don't actually *know* what the underlying distribution $\mu$ is, nor what its cumulative distribution function $F$ is. But if we did, the analogous map (the *copula transformation*) $\bf{F} : \mathbb{R}^d \to [0, 1]^d$ would be

\[\bf{F}(\bf{y}) = (F_1(y^1), F_2(y^2), \ldots, F_d(y^d)).\]

In summary, we've defined the empirical copula *transformation* $\hat{\bf{F}}$ as a map from some `d`

-dimensional space to the `d`

-dimensional unit square. The j-th axis of $\hat{\bf{F}}$'s domain and the j-th axis of $\hat{\bf{F}}$'s codomain (i.e. the hypersquare) are linked through the *ranks* of `S[:, j]`

.

**Empirical copula**

The *copula* of $\mu$ is the joint distribution $\bf{F}(\bf{X}) = (F_1(X^1), F_2(X^2), \ldots, F_d(X^d))$. The *empirical copula* (note the lack of "transformation" here) is the set of `d`

-dimensional empirical-copula-transformed points $\hat{\bf{Z}} = \{\bf{Z}_1, \bf{Z}_2, \ldots, \bf{Z}_n \} = \{ \hat{\bf{F}}(\bf{X_1}), \hat{\bf{F}}(\bf{X_2}), \ldots, \hat{\bf{F}}(\bf{X_n}) \}$. Note that $\hat{\bf{Z}}$ is an *approximation* of a sample $\{\bf{Z}_1,\bf{Z}_2, \ldots, \bf{Z}_n\} = \{\bf{F}(\bf{X}_1), \bf{F}(\bf{X}_2), \ldots, \bf{F}(\bf{X}_n)\}$ from the true copula of $\mu$ (which we in general don't know, given only some sample points).

`CausalityTools.entropy_conditional`

— Method`entropy_conditional(measure::ConditionalEntropy, c::ContingencyMatrix{T, 2}) where T`

Estimate the discrete version of the given `ConditionalEntropy`

`measure`

from its direct (sum) definition, using the probabilities from a pre-computed `ContingencyMatrix`

, constructed from two input variables `x`

and `y`

. This estimation method works for both numerical and categorical data. If `measure`

is not given, then the default is `CEShannon()`

.

The convention is to compute the entropy of the variable in the *first* column of `c`

conditioned on the variable in the *second* column of `c`

. To do the opposite, call this function with a new contingency matrix where the order of the variables is reversed.

**Compatible measures**

Measure | `ContingencyMatrix` |
---|---|
`CEShannon` | ✓ |
`CETsallisFuruichi` | ✓ |
`CETsallisAbe` | ✓ |
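
A minimal sketch, using categorical input data:

```
using CausalityTools
# Discrete Shannon conditional entropy H(x | y) from a contingency matrix.
x = rand(["a", "b", "c"], 1000)
y = rand([0, 1], 1000)
c = contingency_matrix(x, y)
entropy_conditional(CEShannon(), c)
```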

`CausalityTools.entropy_conditional`

— Method`entropy_conditional([measure::ConditionalEntropy], est::DifferentialEntropyEstimator, x, y)`

Estimate the entropy of `x`

conditioned on `y`

, using the differential/continuous version of the given conditional entropy (CE) `measure`

. The CE is computed as the difference between the joint entropy and the marginal entropy of `y`

, using the `DifferentialEntropyEstimator`

`est`

, which must be compatible with multivariate data. No bias correction is applied. If `measure`

is not given, then the default is `CEShannon()`

.

**Estimators**

Estimator | Principle | `CEShannon` | `CETsallisAbe` | `CETsallisFuruichi` |
---|---|---|---|---|
`Kraskov` | Nearest neighbors | ✓ | x | x |
`Zhu` | Nearest neighbors | ✓ | x | x |
`ZhuSingh` | Nearest neighbors | ✓ | x | x |
`Gao` | Nearest neighbors | ✓ | x | x |
`Goria` | Nearest neighbors | ✓ | x | x |
`Lord` | Nearest neighbors | ✓ | x | x |
`LeonenkoProzantoSavani` | Nearest neighbors | ✓ | x | x |

`CausalityTools.entropy_conditional`

— Method`entropy_conditional([measure::ConditionalEntropy], est::ProbabilitiesEstimator, x, y)`

Estimate the entropy of `x`

conditioned on `y`

, using the discrete version of the given conditional entropy (CE) `measure`

. The CE is computed as the difference between the joint entropy and the marginal entropy of `y`

, using the `ProbabilitiesEstimator`

`est`

, which must be compatible with multivariate data (that is, have an implementation for `marginal_encodings`

). No bias correction is applied. If `measure`

is not given, then the default is `CEShannon()`

.

**Estimators**

Joint and marginal probabilities are computed by jointly discretizing `x`

and `y`

using the approach given by `est`

, and obtaining the marginal distribution for `y`

from the joint distribution.

Estimator | Principle | `CEShannon` | `CETsallisAbe` | `CETsallisFuruichi` |
---|---|---|---|---|
`CountOccurrences` | Frequencies | ✓ | ✓ | x |
`ValueHistogram` | Binning (histogram) | ✓ | ✓ | x |
`SymbolicPermutation` | Ordinal patterns | ✓ | ✓ | x |
`Dispersion` | Dispersion patterns | ✓ | ✓ | x |

`CausalityTools.entropy_joint`

— Method```
entropy_joint(e::EntropyDefinition, x, y)
entropy_joint(e::EntropyDefinition, c::ContingencyMatrix)
```

Compute the joint entropy of type `e`

(e.g. `Shannon`

) of the input variables `x`

and `y`

, or from the pre-computed contingency matrix `c`

(see `ContingencyMatrix`

).

**Discrete definitions**

Given two discrete random variables $X$ and $Y$ with ranges $\mathcal{X}$ and $\mathcal{Y}$, we define the following discrete joint entropies:

- `JointEntropyShannon`: $H^S(X, Y) = -\sum_{x\in \mathcal{X}, y \in \mathcal{Y}} p(x, y) \log p(x, y)$ (Cover & Thomas, 2006)CoverThomas2006.
- `JointEntropyRenyi`: $H_q^R(X, Y) = \dfrac{1}{1-q} \log \sum_{i = 1}^N p_i^q$ (Golshani et al., 2009)Golshani2009.
- `JointEntropyTsallis`: $H_q^T(X, Y) = -\sum_{x\in \mathcal{X}, y \in \mathcal{Y}} p(x, y)^q \log_q p(x, y)$ (Furuichi, 2006)Furuichi2006, where $\log_q(x) = \dfrac{x^{1-q} - 1}{1-q}$ is the q-logarithm.

The expression for the Shannon joint entropy is from CoverThomas2006, for the Rényi joint entropy from Golshani2009, and for the Tsallis joint entropy from Furuichi2006.

`CausalityTools.eom_logistic2_bidir`

— Method`logistic2_bidir(u₀, c_xy, c_yx, r₁, r₂, σ_xy, σ_yx)`

Equations of motion for a bidirectional logistic model for the chaotic population dynamics of two interacting species. This system is from [1], and is given by

\[\begin{align} x(t+1) &= r_1 f_{yx}^{t}(1 - f_{yx}^{t}) \\ y(t+1) &= r_2 f_{xy}^{t}(1 - f_{xy}^{t}) \\ f_{xy}^t &= \dfrac{y(t) + c_{xy}(x(t) + \sigma_{xy} \xi_{xy}^t )}{1 + c_{xy} (1 + \sigma_{xy} )} \\ f_{yx}^t &= \dfrac{x(t) + c_{yx}(y(t) + \sigma_{yx} \xi_{yx}^t )}{1 + c_{yx} (1 + \sigma_{yx} )}, \end{align}\]

where the coupling strength $c_{xy}$ controls how strongly species $x$ influences species $y$, and vice versa for $c_{yx}$. To simulate time-varying influence of unobserved processes, we use the dynamical noise terms $\xi_{xy}^t$ and $\xi_{yx}^t$, drawn from a uniform distribution with support on $[0, 1]$. If $\sigma_{xy} > 0$, then the influence of $x$ on $y$ is masked by dynamical noise equivalent to $\sigma_{xy} \xi_{xy}^{t}$ at the $t$-th iteration of the map, and vice versa for $\sigma_{yx}$.

**References**

- Diego, David, Kristian Agasøster Haaga, and Bjarte Hannisdal. "Transfer entropy computation using the Perron-Frobenius operator." Physical Review E 99.4 (2019): 042212.

`CausalityTools.eom_logistic3`

— Method`eom_logistic3(u, p, t)`

Equations of motion for a system consisting of three coupled logistic maps representing the response of two independent dynamical variables to the forcing from a common driver. The dynamical influence goes in the directions Z → X and Z → Y.

**Equations of motion**

The equations of motion are

\[\begin{aligned} x(t+1) &= (x(t)(r - r_1 x(t) - z(t) + \sigma_x \eta_x)) \mod 1 \\ y(t+1) &= (y(t)(r - r_2 y(t) - z(t) + \sigma_y \eta_y)) \mod 1 \\ z(t+1) &= (z(t)(r - r_3 z(t) + \sigma_z \eta_z)) \mod 1 \end{aligned}\]

Dynamical noise may be added to each of the dynamical variables by tuning the parameters `σx`, `σy` and `σz`

. Default values for the parameters `r₁`

, `r₂`

and `r₃`

are set such that the system exhibits chaotic behaviour, with `r₁ = r₂ = r₃ = 4`

.

**References**

- Runge, Jakob. Causal network reconstruction from time series: From theoretical assumptions to practical estimation, Chaos 28, 075310 (2018); doi: 10.1063/1.5025050

`CausalityTools.eom_nonlinear3d`

— Method```
eom_nonlinear3d(u₀, a₁, a₂, a₃, b₁, b₂, b₃,
c₁₂, c₂₃, c₁₃, σ₁, σ₂, σ₃) → DiscreteDynamicalSystem
```

Equations of motion for a 3d nonlinear system with nonlinear couplings $x_1 \to x_2$, $x_2 \to x_3$ and $x_1 \to x_3$. Modified from [1].

**Equations of motion**

The equations of motion are

\[\begin{aligned} x_1(t+1) &= a_1 x_1 (1-x_1(t))^2 e^{-x_2(t)^2} + 0.4 \xi_{1}(t) \\ x_2(t+1) &= a_1 x_2 (1-x_2(t))^2 e^{-x_2(t)^2} + 0.4 \xi_{2}(t) + b x_1 x_2 \\ x_3(t+1) &= a_3 x_3 (1-x_3(t))^2 e^{-x_3(t)^2} + 0.4 \xi_{3}(t) + c x_{2}(t) + d x_{1}(t)^2. \end{aligned}\]

**References**

- Gourévitch, B., Le Bouquin-Jeannès, R., & Faucon, G. (2006). Linear and nonlinear causality between signals: methods, examples and neurophysiological applications. Biological Cybernetics, 95(4), 349–369.

`CausalityTools.estimate`

— Method`estimate(e::EntropyDefinition, est::InformationMeasureEstimator, input::VectorOrStateSpaceSet...)`

Given some `input`

data, estimate some information measure using the given `InformationMeasureEstimator`

, with respect to the generalized entropy `e`

.

`CausalityTools.fishers_z`

— Method`fishers_z(p̂)`

Compute Fisher's z-transform on the sample partial correlation coefficient `p̂`

(computed as the correlation between variables `i`

and `j`

, given the remaining variables):

\[Z(V_i, V_j | \bf{S}) = \dfrac{1}{2} \log{\left( \dfrac{1 + \hat{p}(V_i, V_j | \bf{S})}{1 - \hat{p}(V_i, V_j | \bf{S})} \right) }\]
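
A minimal sketch of this transform (it is just the constant-times-log-ratio identity above):

```
# Fisher's z-transform of a sample partial correlation coefficient p̂.
fishers_z(p̂) = 0.5 * log((1 + p̂) / (1 - p̂))
fishers_z(0.3)  # ≈ 0.3095
```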

`CausalityTools.h_measure`

— Method`h_measure(measure::HMeasure, x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet)`

Compute the `HMeasure`

from source `x`

to target `y`

.

`CausalityTools.henon2`

— Method`henon2(;u₀ = [0.1, 0.2, 0.2, 0.3], c_xy = 2.0) → DiscreteDynamicalSystem`

A bivariate system consisting of two identical 1D Henon maps with unidirectional forcing $X \to Y$ (Krakovská et al., 2018)^{[Krakovská2018]}.

**Equations of motion**

The equations of motion are

\[\begin{aligned} x_1(t+1) &= 1.4 - x_1^2(t) + 0.3x_2(t) \\ x_2(t+1) &= x_1(t) \\ y_1(t+1) &= 1.4 - [c_{xy} x_1(t) y_1(t) + (1-c_{xy}) y_1^2(t)] + 0.3 y_2(t) \\ y_2(t+1) &= y_1(t) \end{aligned}\]

`CausalityTools.henon_triple`

— Method`henon_triple(x, p, n) → Function`

Iterate a 3D discrete system consisting of coupled Henon maps where the coupling is x1 → x2 → x3 [1]. This version allows for tweaking the parameters of the equations.

The difference equations are:

\[\begin{aligned} x_1(t+1) &= a - x_1(t)^2 + b x_1(t-2) \\ x_2(t+1) &= a - c x_1(t) x_2(t) - (1 - c) x_2(t)^2 + b x_2(t-1) \\ x_3(t+1) &= c x_2(t) x_3(t) - (1 - c) x_3(t)^2 + b x_3(t-1) \end{aligned}\]

Here $c$ is the coupling constant. The system becomes completely synchronized for $c \geq 0.7$.

**References**

- Papana, A., Kyrtsou, C., Kugiumtzis, D., & Diks, C. (2013). Simulation study of direct causality measures in multivariate time series. Entropy, 15(7), 2635–2661.

`CausalityTools.ikeda`

— Method```
ikeda(; u₀ = rand(2), c_xy = 1.0, c_yx = 1.0, a = 0.8, b = 12, c = 0.9,
r₁ = rand(Uniform(0.01, 0.3)), r₂ = rand(Uniform(0.01, 0.3)), σ = 0.05)
```

Initialise a discrete two-dimensional Ikeda map system, adapted from [1] by adding a noise term and allowing the influences from $x \to y$ ($c_{xy}$) and from $y \to x$ ($c_{yx}$) to be adjusted.

As a rule of thumb, the parameters `a`, `b`, and `c` can be drawn from uniform distributions on `[0.8, 1.5]`, `[10, 14]` and `[0.1, 0.9]`, respectively.

The difference equations are

\[\begin{aligned} x(t+1) &= 1 + \mu(x \cos{(\theta)} - c_{yx} y \sin{(\theta)}) - \min\left(\dfrac{\sigma \xi_{t}^{(1)}}{1-x}, \xi_{t}^{(2)}\right) \\ y(t+1) &= \mu(y \cos{(\theta)} - c_{xy} x \sin{(\theta)}) - \min\left(\dfrac{\sigma \zeta_{t}^{(1)}}{1-y}, \zeta_{t}^{(2)}\right) \end{aligned}\]

**References**

- Cao, Liangyue, Alistair Mees, and Kevin Judd. "Modeling and predicting non-stationary time series." International Journal of Bifurcation and Chaos 7.08 (1997): 1823-1831.

`CausalityTools.independence`

— Method`independence(test::IndependenceTest, x, y, [z]) → summary`

Perform the given `IndependenceTest`

`test`

on data `x`

, `y`

and `z`

. If only `x`

and `y`

are given, `test`

must provide a bivariate association measure. If `z`

is given too, then `test`

must provide a conditional association measure.

Returns a test `summary`

, whose type depends on `test`

.
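
For illustration, a minimal sketch (the test and estimator choices are assumptions for the example):

```
using CausalityTools
# Surrogate-based hypothesis test for pairwise dependence between x and y,
# using Shannon mutual information as the association measure.
x, y = rand(500), rand(500)
test = SurrogateTest(MIShannon(), KSG1())
independence(test, x, y)
```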

**Compatible independence tests**

`CausalityTools.infer_graph`

— Function`infer_graph(algorithm::GraphAlgorithm, x) → g`

Infer graph from input data `x`

using the given `algorithm`

.

Returns `g`

, whose type depends on `algorithm`

.

`CausalityTools.inv_cdf_transform`

— Method`inv_cdf_transform(x::AbstractVector, d::Distribution) → tx::Vector`

Map each value `xᵢ ∈ x`

to the transformed value `t(xᵢ) ∈ [0, 1]`

using the inverse cumulative distribution function (CDF) (i.e. the quantile function) of the distribution `d`

.

This function is meant to be used on marginal empirical copula transforms (which are uniformly distributed). Since the inverse CDF is a strictly increasing function, the marginal copulas are preserved by the transformation.

`CausalityTools.jdd`

— Method`jdd(measure::JointDistanceDistribution, source, target) → Δ`

Compute the joint distance distribution Amigo2018 from `source`

to `target`

using the given `JointDistanceDistribution`

measure.

Returns the distribution `Δ`

from the paper directly. Use `JointDistanceDistributionTest`

to perform a formal independence test.

`CausalityTools.l_measure`

— Method`l_measure(measure::LMeasure, x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet)`

Compute the `LMeasure`

from source `x`

to target `y`

.

`CausalityTools.library_indices`

— Function`library_indices(measure::CCMLike, est::CrossmapEstimator, i::Int, target, source)`

Produce (randomly, if relevant) the `i`

-th subset of indices for a `CrossmapEstimator`

that is being applied to `target`

and `source`

.

`CausalityTools.linearmap1`

— Method`linearmap1(;u₀ = [1, rand(2)], c = 0.5) → DiscreteDynamicalSystem`

Linear map from Chen2004.

`CausalityTools.logistic2_bidir`

— Method```
logistic2_bidir(;u₀ = rand(2), c_xy = 0.1, c_yx = 0.1,
r₁ = 3.78, r₂ = 3.66, σ_xy = 0.05, σ_yx = 0.05)
```

A bidirectional logistic model for the chaotic population dynamics of two interacting species [1].

**Equations of motion**

The equations of motion are

\[\begin{align} x(t+1) &= r_1 f_{yx}^{t}(1 - f_{yx}^{t}) \\ y(t+1) &= r_2 f_{xy}^{t}(1 - f_{xy}^{t}) \\ f_{xy}^t &= \dfrac{y(t) + c_{xy}(x(t) + \sigma_{xy} \xi_{xy}^t )}{1 + c_{xy} (1 + \sigma_{xy} )} \\ f_{yx}^t &= \dfrac{x(t) + c_{yx}(y(t) + \sigma_{yx} \xi_{yx}^t )}{1 + c_{yx} (1 + \sigma_{yx} )}, \end{align}\]

where the coupling strength $c_{xy}$ controls how strongly species $x$ influences species $y$, and vice versa for $c_{yx}$. To simulate time-varying influence of unobserved processes, we use the dynamical noise terms $\xi_{xy}^t$ and $\xi_{yx}^t$, drawn from a uniform distribution with support on $[0, 1]$. If $\sigma_{xy} > 0$, then the influence of $x$ on $y$ is masked by dynamical noise equivalent to $\sigma_{xy} \xi_{xy}^{t}$ at the $t$-th iteration of the map, and vice versa for $\sigma_{yx}$.

`CausalityTools.logistic2_unidir`

— Method```
logistic2_unidir(;u₀ = rand(2), c_xy = 0.1, σ = 0.05,
r₁ = 3.78, r₂ = 3.66) → DiscreteDynamicalSystem
```

Initialise a system consisting of two coupled logistic maps where X unidirectionally influences Y. By default, the parameters `r₁`

and `r₂`

are set to values yielding chaotic behaviour.

**Equations of motion**

The equations of motion are

\[\begin{aligned} x(t+1) &= r_1 x(t)(1 - x(t)) \\ y(t+1) &= r_2 f(x,y)(1 - f(x,y)), \end{aligned}\]

with

\[\begin{aligned} f(x,y) = \dfrac{y + \frac{c_{xy}(x \xi )}{2}}{1 + \frac{c_{xy}}{2}(1+ \sigma )} \end{aligned}\]

The parameter `c_xy`

controls how strong the dynamical forcing is. If `σ > 0`

, dynamical noise masking the influence of `x`

on `y`

equivalent to $\sigma \cdot \xi$ is added at each iteration. Here, $\xi$ is a draw from a flat distribution on $[0, 1]$. Thus, setting `σ = 0.05`

is equivalent to adding dynamical noise corresponding to a maximum of $5\%$ of the possible range of values of the logistic map.

**References**

- Diego, David, Kristian Agasøster Haaga, and Bjarte Hannisdal. "Transfer entropy computation using the Perron-Frobenius operator." Physical Review E 99.4 (2019): 042212.

`CausalityTools.logistic3`

— Method```
logistic3(;u₀ = rand(3), r = 4,
σx = 0.05, σy = 0.05, σz = 0.05) → DiscreteDynamicalSystem
```

Initialise a dynamical system consisting of three coupled logistic maps representing the response of two independent dynamical variables to the forcing from a common driver. The dynamical influence goes in the directions $Z \to X$ and $Z \to Y$.

**Equations of motion**

The equations of motion are

\[\begin{aligned} x(t+1) &= (x(t)(r - r_1 x(t) - z(t) + \sigma_x \eta_x)) \mod 1 \\ y(t+1) &= (y(t)(r - r_2 y(t) - z(t) + \sigma_y \eta_y)) \mod 1 \\ z(t+1) &= (z(t)(r - r_3 z(t) + \sigma_z \eta_z)) \mod 1 \end{aligned}\]

Dynamical noise may be added to each of the dynamical variables by tuning the parameters `σx`, `σy` and `σz`

. Default values for the parameters `r₁`

, `r₂`

and `r₃`

are set such that the system exhibits chaotic behaviour, with `r₁ = r₂ = r₃ = 4`

.

**References**

- Runge, Jakob. Causal network reconstruction from time series: From theoretical assumptions to practical estimation, Chaos 28, 075310 (2018); doi: 10.1063/1.5025050

`CausalityTools.logistic4`

— Method```
logistic4(;u₀ = rand(4), r₁ = 3.9, r₂ = 3.6, r₃ = 3.6, r₄ = 3.8,
c₁₂ = 0.4, c₂₃ = 0.4, c₃₄ = 0.35) → DiscreteDynamicalSystem
```

Initialise a system of a transitive chain of four unidirectionally coupled logistic maps, where $y_1 \to y_2 \to y_3 \to y_4$ [1]. Default parameters are as in [1].

*Note: With the default parameters which are as in [1], for some initial conditions, this system wanders off to $\pm \infty$ for some of the variables. Make sure that you have a good realisation before using the orbit for anything.*

**Equations of motion**

\[\begin{aligned} y_1(t+1) &= y_1(t)(r_1 - r_1 y_1) \\ y_2(t+1) &= y_2(t)(r_2 - c_{12} y_1 - r_2 y_2) \\ y_3(t+1) &= y_3(t)(r_3 - c_{23} y_2 - r_3 y_3) \\ y_4(t+1) &= y_4(t)(r_4 - c_{34} y_3 - r_4 y_4) \end{aligned}\]

**References**

- Ye, Hao, et al. "Distinguishing time-delayed causal interactions using convergent cross mapping." Scientific reports 5 (2015): 14750

`CausalityTools.m_measure`

— Method`m_measure(measure::MMeasure, x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet)`

Compute the `MMeasure`

from source `x`

to target `y`

.

`CausalityTools.marginal_encodings`

— Function`marginal_encodings(est::ProbabilitiesEstimator, x::VectorOrStateSpaceSet...)`

Encode/discretize each input vector `xᵢ ∈ x`

according to a procedure determined by `est`

. Any `xᵢ ∈ x`

that are multidimensional (`StateSpaceSet`

s) will be encoded column-wise, i.e. each column of `xᵢ`

is treated as a timeseries and is encoded separately.

This is useful for computing any discrete information theoretic quantity, and is used internally by `contingency_matrix`

.

**Supported estimators**

- `ValueHistogram`. Bin visitation frequencies are counted in the joint space `XY`, then marginal visitations are obtained from the joint bin visits. This behaviour is the same for both `FixedRectangularBinning` and `RectangularBinning` (which adapts the grid to the data). When using `FixedRectangularBinning`, the range along the first dimension is used as a template for all other dimensions.
- `SymbolicPermutation`. Each timeseries is separately `encode`d according to its ordinal pattern.
- `Dispersion`. Each timeseries is separately `encode`d according to its dispersion pattern.

Many more implementations are possible. Each new implementation gives one new way of estimating the `ContingencyMatrix`.

`CausalityTools.marginal_indices`

— Method`marginal_indices(x)`

Returns a column vector `v`

with the same number of elements as there are unique elements in `x`

. `v[i]` contains the indices of the elements in `x` that match the `i`-th unique element of `x`.

For example, if the third unique element of `x`, `u₃ = unique(x)[3]`, appears four times in `x`, then `v[3]` is a vector of four integers indicating the positions of the elements matching `u₃`.

`CausalityTools.marginal_probs_from_μ`

— Method`marginal_probs_from_μ(selected_axes, visited_bins, iv::InvariantMeasure, inds_non0measure)`

Estimate marginal probabilities from the pre-computed invariant measure `iv`, given the set of visited bins and the indices of the positive-measure bins. The indices in `selected_axes`

determine which marginals are selected.

`CausalityTools.max_inputs_vars`

— Method`max_inputs_vars(m::AssociationMeasure) → nmax::Int`

Return the maximum number of variables that the measure can be computed for.

For example, `MIShannon`

cannot be computed for more than 2 variables.

`CausalityTools.max_segmentlength`

— Function`max_segmentlength(x::AbstractVector, measure::CrossmapMeasure)`

Given an input vector `x`

, which is representative of the size of the other input vectors too, compute the maximum segment/library length that can be used for predictions.

`CausalityTools.mcr`

— Method`mcr(m::MCR, x, y)`

Compute the association between `x`

and `y`

based on conditional probabilities of recurrence using the given `MCR`

`measure`

, where `x`

and `y`

can be either univariate timeseries or multivariate `StateSpaceSet`

s.

`CausalityTools.min_inputs_vars`

— Method`min_inputs_vars(m::AssociationMeasure) → nmin::Int`

Return the minimum number of variables that the measure can be computed for.

For example, `CMIShannon`

requires 3 input variables.

`CausalityTools.mutualinfo`

— Method`mutualinfo([measure::MutualInformation], m::ContingencyMatrix) → mi::Real`

Estimate the mutual information between `x`

and `y`

, the variables corresponding to the columns and rows of the 2-dimensional contingency matrix `m`

, respectively.

Estimates the discrete version of the given `MutualInformation`

`measure`

from its direct definition (double-sum), using the probabilities from a pre-computed `ContingencyMatrix`

. If `measure`

is not given, then the default is `MIShannon()`

.
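
A minimal sketch, using categorical input data:

```
using CausalityTools
# Discrete Shannon mutual information from a 2D contingency matrix.
x = rand(["yes", "no"], 1000)
y = rand(1:3, 1000)
mutualinfo(MIShannon(), contingency_matrix(x, y))
```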

`CausalityTools.mutualinfo`

— Method`mutualinfo([measure::MutualInformation], est::DifferentialEntropyEstimator, x, y)`

Estimate the mutual information `measure`

between `x`

and `y`

by a sum of three entropy terms, without any bias correction, using any `DifferentialEntropyEstimator`

compatible with multivariate data. See examples here. If `measure`

is not given, then the default is `MIShannon()`

.

`DifferentialEntropyEstimator`

s have their own `base`

field which is not used here. Instead, this method creates a copy of `est`

internally, where `est.base`

is replaced by `measure.e.base`

. Therefore, use `measure`

to control the "unit" of the mutual information.

**Estimators**

Some `MutualInformation`

measures can be computed using a `DifferentialEntropyEstimator`

, provided it supports multivariate input data. These estimators compute mutual information as a sum of entropy terms (with different dimensions), without any bias correction.

Estimator | Principle | `MIShannon` | `MITsallisFuruichi` | `MITsallisMartin` | `MIRenyiJizba` | `MIRenyiSarbu` |
---|---|---|---|---|---|---|
`Kraskov` | Nearest neighbors | ✓ | x | x | x | x |
`Zhu` | Nearest neighbors | ✓ | x | x | x | x |
`ZhuSingh` | Nearest neighbors | ✓ | x | x | x | x |
`Gao` | Nearest neighbors | ✓ | x | x | x | x |
`Goria` | Nearest neighbors | ✓ | x | x | x | x |
`Lord` | Nearest neighbors | ✓ | x | x | x | x |
`LeonenkoProzantoSavani` | Nearest neighbors | ✓ | x | x | x | x |

`CausalityTools.mutualinfo`

— Method`mutualinfo([measure::MutualInformation], est::MutualInformationEstimator, x, y)`

Estimate the mutual information `measure`

between `x`

and `y`

using the dedicated `MutualInformationEstimator`

`est`

. See examples here. If `measure`

is not given, then the default is `MIShannon()`

.

**Estimators**

Dedicated `MutualInformationEstimator`

s are either discrete, continuous, or a mixture of both. Typically, these estimators apply bias correction.

Estimator | Type | `MIShannon` |
---|---|---|
`GaussianMI` | Parametric | ✓ |
`KSG1` | Continuous | ✓ |
`KSG2` | Continuous | ✓ |
`GaoKannanOhViswanath` | Mixed | ✓ |
`GaoOhViswanath` | Continuous | ✓ |
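
For illustration, a minimal sketch (the estimator parameters are assumptions for the example):

```
using CausalityTools
# Shannon mutual information between x and y, using the
# Kraskov-Stögbauer-Grassberger (KSG1) estimator.
x, y = randn(1000), randn(1000)
mutualinfo(MIShannon(base = 2), KSG1(k = 5), x, y)
```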

`CausalityTools.mutualinfo`

— Method`mutualinfo([measure::MutualInformation], est::ProbabilitiesEstimator, x, y) → mi::Real ∈ [0, a]`

Estimate the mutual information between `x`

and `y`

using the discrete version of the given `measure`

, using the given `ProbabilitiesEstimator`

`est`

(which must accept multivariate data and have an implementation for `marginal_encodings`

). See examples here. If `measure`

is not given, then the default is `MIShannon()`

.

**Estimators**

The mutual information is computed as sum of three entropy terms, without any bias correction. The exception is when using `Contingency`

; then the mutual information is computed using a `ContingencyMatrix`

.

Joint and marginal probabilities are computed by jointly discretizing `x`

and `y`

using the approach given by `est`

(using `marginal_encodings`

), and obtaining marginal distributions from the joint distribution.

Estimator | Principle | `MIShannon` | `MITsallisFuruichi` | `MITsallisMartin` | `MIRenyiJizba` | `MIRenyiSarbu` |
---|---|---|---|---|---|---|
`Contingency` | Contingency table | ✓ | ✓ | ✓ | ✓ | ✓ |
`CountOccurrences` | Frequencies | ✓ | ✓ | ✓ | ✓ | ✖ |
`ValueHistogram` | Binning (histogram) | ✓ | ✓ | ✓ | ✓ | ✖ |
`SymbolicPermutation` | Ordinal patterns | ✓ | ✓ | ✓ | ✓ | ✖ |
`Dispersion` | Dispersion patterns | ✓ | ✓ | ✓ | ✓ | ✖ |

`CausalityTools.noise_brownian`

— Method```
noise_brownian(n::Int; lo = - 1, hi = 1)
noise_brownian(d::Distribution, n::Int)
```

Generate a signal consisting of `n`

steps of Brownian noise, generated as the cumulative sum of noise drawn from a uniform distribution on `[lo, hi]`, normalised to zero mean and unit standard deviation

. Optionally, a distribution `d`

from which to sample can be provided.

**Examples**

```
# Based on uncorrelated uniform noise
noise_brownian(100)
noise_brownian(100, lo = -2, hi = 2)
noise_brownian(Uniform(-3, 3), 100)
# Based on uncorrelated Gaussian noise
μ, σ = 0, 2
noise_brownian(Normal(μ, σ), 100)
```

`CausalityTools.noise_ug`

— Method`noise_ug(n::Int; μ = 0, σ = 1)`

Generate a signal consisting of `n`

steps of uncorrelated Gaussian noise from a normal distribution with mean `μ`

and standard deviation `σ`

.

`CausalityTools.noise_uu`

— Method`noise_uu(n::Int, lo = - 1, hi = 1)`

Generate a signal consisting of `n`

steps of uncorrelated uniform noise from a uniform distribution on `[lo, hi]`

.

`CausalityTools.nonlinear3d`

— Method```
nonlinear3d(;u₀ = rand(3),
σ₁ = 1.0, σ₂ = 1.0, σ₃ = 1.0,
a₁ = 3.4, a₂ = 3.4, a₃ = 3.4,
b₁ = 0.4, b₂ = 0.4, b₃ = 0.4,
c₁₂ = 0.5, c₂₃ = 0.3, c₁₃ = 0.5) → DiscreteDynamicalSystem
```

A 3d nonlinear system with nonlinear couplings $x_1 \to x_2$, $x_2 \to x_3$ and $x_1 \to x_3$. Modified from [1].

**Equations of motion**

The equations of motion are

\[\begin{aligned} x_1(t+1) &= a_1 x_1 (1-x_1(t))^2 e^{-x_2(t)^2} + 0.4 \xi_{1}(t) \\ x_2(t+1) &= a_1 x_2 (1-x_2(t))^2 e^{-x_2(t)^2} + 0.4 \xi_{2}(t) + b x_1 x_2 \\ x_3(t+1) &= a_3 x_3 (1-x_3(t))^2 e^{-x_3(t)^2} + 0.4 \xi_{3}(t) + c x_{2}(t) + d x_{1}(t)^2. \end{aligned}\]

**References**

- Gourévitch, B., Le Bouquin-Jeannès, R., & Faucon, G. (2006). Linear and nonlinear causality between signals: methods, examples and neurophysiological applications. Biological Cybernetics, 95(4), 349–369.

`CausalityTools.nontrivial_pegiun`

— Method```
nontrivial_pegiun(;u₀ = rand(2), σ₁ = 0.1, σ₂ = 0.1,
p₁ = 0.7, p₂ = 0.1, p₃ = 0.4, p₄ = 2.4, p₅ = 0.9, p₆ = 4, n = 100) → StateSpaceSet
```

A 2D discrete autoregressive system with nonlinear, nontrivial coupling from [1], which was also studied in [2]. The version implemented here allows for tweaking the parameters of the equations. The difference equations are

\[\begin{aligned} x(t+1) &= p_2 + p_3 x(t-2) + c_{yx}\dfrac{p_4 - p_5 y(t-3)}{1 + e^{-p_6 y(t-3)}} + \xi_1(t) \\ y(t+1) &= p_1 y(t) + \xi_2(t). \end{aligned}\]

Here, $\xi_{1,2}(t)$ are two independent normally distributed noise processes with zero mean and standard deviations $\sigma_1$ and $\sigma_2$. The $\xi_{1,2}(t)$ terms represent dynamical noise.

**References**

[1] Péguin-Feissolle, A., & Teräsvirta, T. (1999). A General Framework for Testing the Granger Noncausality Hypothesis. Universites d'Aix-Marseille II et III. https://www.amse-aixmarseille.fr/sites/default/files/_dt/greqam/99a42.pdf

[2] Chávez, M., Martinerie, J., & Le Van Quyen, M. (2003). Statistical assessment of nonlinear causality: application to epileptic EEG signals. Journal of Neuroscience Methods, 124(2), 113–128. doi:10.1016/s0165-0270(02)00367-9 https://www.sciencedirect.com/science/article/pii/S0165027002003679

`CausalityTools.optimize_marginals_te`

— Function`optimize_marginals_te([scheme = OptimiseTraditional()], s, t, [c]) → EmbeddingTE`

Optimize marginal embeddings for transfer entropy computation from source time series `s`

to target time series `t`

, conditioned on `c`

if `c`

is given, using the provided optimization `scheme`

.

`CausalityTools.optimize_marginals_te`

— Method`optimize_marginals_te(opt::OptimiseTraditional, s, t, [c]; exclude_source = false) → EmbeddingTE`

Optimise the marginals for a transfer entropy analysis from source time series `s`

to target time series `t`

, potentially given a conditional time series `c`

.

If `exclude_source == true`

, then no optimisation is done for the source. This is useful for `SurrogateTest`

, because most surrogate methods accept univariate time series, and if we embed the source and it becomes multidimensional, then we can't create surrogates. A future optimization is to do column-wise surrogate generation.

`CausalityTools.pai`

— Method```
pai(x, y, d, τ; w = 0, correspondence_measure = Statistics.cor) → Float64
pai(x, y, d, τ, bootstrap_method::Symbol; w = 0, correspondence_measure = Statistics.cor,
method = :segment, L = ceil(Int, (length(x)-d*τ)*0.2), nreps = 100) → Vector{Float64}
```

This syntax is deprecated. It will continue to work for CausalityTools v1.X, but will be removed in CausalityTools v2. See here for updated syntax.

Compute the pairwise asymmetric inference (PAI; McCracken2014) between `x`

and `y`

. Returns the correspondence between original and cross mapped values (the default is `ρ = correspondence_measure(y(t), ỹ(t) | M_xy)`

).

PAI is a modification of Sugihara2012's CCM algorithm, where instead of using completely out-of-sample prediction when trying to predict $y(t)$, values of *both* variables are included in the embedding used to make predictions. Specifically, PAI computes the correspondence between the values $y(t)$ and the cross-map estimated values $\tilde{y}(t) | M_{xy}$, where the $\tilde{y}(t)$ are the values estimated using the embedding $M_{xy} = \{ (y_t, x_t, x_{t-\tau}, x_{t-2\tau}, \ldots, x_{t-(d - 1)\tau}) \}$. *Note: a `d+1`-dimensional embedding is used, rather than the `d`-dimensional embedding used for CCM.* Like for the CCM algorithm, the Theiler window `r` indicates how many temporal neighbors of the predictee are to be excluded during the nearest neighbors search (the default `r = 0` excludes only the predictee itself, while `r = 2` excludes the point itself plus its two nearest neighbors in time).

If `bootstrap_method` is specified, then `nreps` different bootstrapped estimates of `correspondence_measure(y(t), ỹ(t) | M_xy)` are returned. The following bootstrap methods are available:

- `bootstrap_method = :random` selects training sets of length `L` consisting of randomly selected points from the embedding $M_{xy}$ (time ordering does not matter). This is method 3 from Luo2015, which critiqued the original Sugihara2012 methodology.
- `bootstrap_method = :segment` selects training sets consisting of time-contiguous segments (each of length `L`) of embedding vectors in $M_{xy}$ (time ordering matters). This is method 2 from Luo et al. (2015)Luo2015.

`CausalityTools.pairwise_test`

— Method`pairwise_test(p::OCESelectedParents) → pairwise::Bool`

If the parent set is empty, return `true`

(a pairwise test should be performed). If the parent set is nonempty, return `false`

(a conditional test should be performed).

`CausalityTools.partial_correlation`

— Method```
partial_correlation(x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet,
z::VectorOrStateSpaceSet...)
```

Compute the `PartialCorrelation`

between `x`

and `y`

, given `z`

.

`CausalityTools.partial_correlation_from_precision`

— Method`partial_correlation_from_precision(P, i::Int, j::Int)`

Given a precision matrix `P`

, compute the partial correlation of variables `i`

and `j`

conditional on all other variables.
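
The identity used here is standard: the partial correlation of variables `i` and `j` given all others equals `-P[i, j] / sqrt(P[i, i] * P[j, j])`. A minimal sketch of that identity (not the package's implementation):

```
using LinearAlgebra, Statistics
# Partial correlation of variables 1 and 2, conditional on all others,
# from the precision matrix (inverse covariance matrix).
X = randn(1000, 4)   # 1000 samples of 4 variables (columns)
P = inv(cov(X))      # precision matrix
pcor_12 = -P[1, 2] / sqrt(P[1, 1] * P[2, 2])
```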

`CausalityTools.pearson_correlation`

— Method`pearson_correlation(x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet)`

Compute the `PearsonCorrelation`

between `x`

and `y`

, which must each be 1-dimensional.

`CausalityTools.pmi`

— Method`pmi([measure::CMI], est::ProbabilitiesEstimator, x, y, z) → pmi_est::Real ∈ [0, a)`

Estimate the part mutual information (`PMI`

; Zhao2016).

If `measure`

is not given, then the default is `PMI(; base = 2)`

. With a `ProbabilitiesEstimator`

, the returned `pmi_est`

is guaranteed to be non-negative.

**Estimators**

Estimator | Principle | `PMI` |
---|---|---|
`CountOccurrences` | Frequencies | ✓ |
`ValueHistogram` | Binning (histogram) | ✓ |
`SymbolicPermutation` | Ordinal patterns | ✓ |
`Dispersion` | Dispersion patterns | ✓ |

`CausalityTools.point_density!`

— Method```
point_density!(S₁, S₂, est::LocalLikelihood, xᵢ, bwᵢ,
neighborsᵢ::AbstractStateSpaceSet{D}) where D
```

Estimate the density around point `xᵢ`

using a local likelihood estimator, which is a generalization of kernel density estimation. This is done by fitting a local Gaussian distribution around `xᵢ`

from its local neighborhood (represented by the points `neighborsᵢ`

). The bandwidth `bwᵢ`

is given by the distance from `xᵢ`

to its `k`

-th nearest neighbor.

`S₁`

is a pre-allocated length-`D`

vector which holds the means, and `S₂`

is a pre-allocated `D`

-by-`D`

matrix which holds the covariances. Both `S₁`

and `S₂`

are zeroed every time `point_density!`

is called.

`CausalityTools.predict`

— Method```
predict(measure::CrossmapMeasure, t::AbstractVector, s::AbstractVector) → t̂ₛ, t̄, ρ
predict(measure::CrossmapMeasure, t̄::AbstractVector, S̄::AbstractStateSpaceSet) → t̂ₛ
```

Perform point-wise cross mappings between source embeddings and target time series according to the algorithm specified by the given cross-map `measure`

(e.g. `ConvergentCrossMapping`

or `PairwiseAsymmetricInference`

).

- **First method**: Jointly embeds the target `t` and source `s` time series (according to `measure`) to obtain time-index-aligned target timeseries `t̄` and source embedding `S̄` (which is now a `StateSpaceSet`). Then calls `predict(measure, t̄, S̄)` (the second method), and returns both the predictions `t̂ₛ`, the observations `t̄`, and their correspondence `ρ` according to `measure`.
- **Second method**: Returns a vector of predictions `t̂ₛ` (`t̂ₛ` := "predictions of `t̄` based on source embedding `S̄`"), where `t̂ₛ[i]` is the prediction for `t̄[i]`. It assumes pre-embedded data which have been correctly time-aligned using a joint embedding (see `embed`), i.e. such that `t̄[i]` and `S̄[i]` correspond to the same time index.

**Description**

For each `i ∈ {1, 2, …, N}`

where `N = length(t) == length(s)`

, we make the prediction `t̂[i]`

(an estimate of `t[i]`

) based on a linear combination of `D + 1`

other points in `t`

, where the selection of points and weights for the linear combination are determined by the `D+1`

nearest neighbors of the point `S̄[i]`

. The details of point selection and weights depend on `measure`

.

*Note: Some CrossmapMeasures may define more general mapping procedures. If so, the algorithm is described in their docstring*.
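
For illustration, a minimal sketch (the measure parameters are assumptions for the example):

```
using CausalityTools
# Point-wise cross-map predictions of t from the embedding of s, using
# convergent cross mapping; returns predictions, observations, and ρ.
t, s = rand(300), rand(300)
t̂ₛ, t̄, ρ = predict(ConvergentCrossMapping(d = 2, τ = -1), t, s)
```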

`CausalityTools.predictive_asymmetry`

— Function```
predictive_asymmetry(estimator::TransferEntropyEstimator, ηs; s, t, [c],
dTf = 1, dT = 1, dS = 1, τT = -1, τS = -1, [dC = 1, τC = -1],
normalize::Bool = false, f::Real = 1.0, base = 2) → Vector{Float64}
```

Compute the predictive asymmetry Haaga2020 𝔸(`s`

→ `t`

) for source time series `s`

and target time series `t`

over prediction lags `ηs`

, using the given `estimator`

and embedding parameters `dTf`

, `dT`

, `dS`

, `τT`

, `τS`

(see also `EmbeddingTE`

).

If a conditional time series `c`

is provided, compute 𝔸(`s`

→ `t`

| `c`

). Then, `dC`

and `τC`

controls the embedding dimension and embedding lag for the conditional variable.

**Returns**

Returns a vector containing the predictive asymmetry for each value of `ηs`

.

**Normalization (hypothesis test)**

If `normalize == true`

, then compute the normalized predictive asymmetry 𝒜. In this case, for each $\eta$ in `ηs`

, compute 𝒜(η) by normalizing 𝔸(η) to some fraction `f`

of the mean transfer entropy over prediction lags $-\eta, \ldots, \eta$ (excluding lag 0). Haaga2020 uses a normalization with `f=1.0`

as a built-in hypothesis test, avoiding more computationally costly surrogate testing.

**Estimators**

Any estimator that works for `transferentropy`

will also work with `predictive_asymmetry`

. Check the online documentation for compatible estimators.

**Examples**

```
using CausalityTools
# Some example time series
x, y = rand(100), rand(100)
# 𝔸(x → y) over prediction lags 1:5, using the keyword form of the signature above
𝔸reg = predictive_asymmetry(ValueHistogram(RectangularBinning(5)), 1:5; s = x, t = y)
```

This is a method that does not yet appear in a peer-reviewed scientific journal. Feel free to use, but consider it experimental for now. It will reappear in a 2.X release in new form once published in a peer-reviewed journal.

`CausalityTools.probability`

— Method`probability(k::MultivariateKernel, data::AbstractStateSpaceSet, xᵢ; h) → p̂(xᵢ)`

Compute `p̂(xᵢ)`

, the kernel density estimate of the probability `p(xᵢ)`

, given some (multivariate) `data`

and the query point `xᵢ`

.

This is fast if `xᵢ`

is an `SVector`

.

`CausalityTools.rank_transformation`

— Method```
rank_transformation(x::AbstractVector)
rank_transformation(x::AbstractStateSpaceSet) → ranks::NTuple{D, Vector}
```

Rank-transform each variable/column of the length-`n`

`D`

-dimensional StateSpaceSet `x`

and return the rank-transformed variables as a `D`

-tuple of length-`n`

vectors.

Returns the unscaled `ranks`

. Divide by `n`

to get an *approximation* to the empirical cumulative distribution function (ECDF) of `x`

.

**Description**

Modulo division by `n`

, `rank_transformation`

does *roughly* the same as naively computing the ECDF as

`[count(xᵢ .<= x) for xᵢ in x] / length(x)`

but an order of magnitude faster and with roughly three orders of magnitude fewer allocations. The increased efficiency of this function relative to naively computing the ECDF is because it uses sorting of the input data to determine ranks, arbitrarily breaking ties according to the sorting algorithm. Rank ties can therefore never occur, and equal values are assigned different but close ranks. To preserve ties, which you might want to do for example when dealing with categorical or integer-valued data, use (the much slower) `empcdf`

.
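
A minimal sketch of the relationship to the ECDF:

```
using CausalityTools
# Unscaled ranks divided by n approximate the empirical CDF of x.
x = randn(1000)
ranks = rank_transformation(x)
ecdf_approx = ranks ./ length(x)
```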

`CausalityTools.rmcd`

— Method```
rmcd(measure::RMCD, x, y)
rmcd(measure::RMCD, x, y, [z, ...])
```

Estimate the recurrence-based `measure`

of dependence between `x`

and `y`

, conditional on `z`

if given.

Parameters for recurrence matrix estimation are given as a `RMCD`

instance. Inputs `x`

, `y`

, `z`

can be either univariate timeseries or multivariate `StateSpaceSet`

s.

`CausalityTools.s_measure`

— Method`s_measure(measure::SMeasure, x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet)`

Compute the `SMeasure`

from source `x`

to target `y`

.

`CausalityTools.s_measure`

— Method`s_measure(measure::SMeasure, x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet) → s ∈ [0, 1]`

Compute the given `measure` to quantify the directional dependence between univariate/multivariate time series `x` and `y`.

Returns a scalar `s`, where `s = 0` indicates independence between `x` and `y`, and higher values indicate synchronization between `x` and `y`, with complete synchronization for `s = 1.0`.

**Example**

```
using CausalityTools
# A two-dimensional Ulam lattice map
sys = ulam(2)
# Sample 1000 points after discarding 5000 transients
orbit = trajectory(sys, 1000, Ttr = 5000)
x, y = orbit[:, 1], orbit[:, 2]
# 4-dimensional embedding for `x`, 5-dimensional embedding for `y`
m = SMeasure(dx = 4, τx = 3, dy = 5, τy = 1)
s_measure(m, x, y)
```

`CausalityTools.select_parents`

— Method`select_parents(alg::OCE, x)`

The parent selection step of the `OCE` algorithm, which identifies the parents of each `xᵢ ∈ x`. Here, `x` must be integer-indexable, i.e. `x[i]` yields the `i`-th variable.

`CausalityTools.skeleton`

— Method`skeleton(algorithm::PC, x) → (g, s)`

Infer the skeleton graph for the variables `x` using the provided `algorithm`. `x` must be some iterable where the `i`-th variable can be accessed as `x[i]`.

Return a tuple of the undirected skeleton graph `g::SimpleGraph` and the separating sets `s::Dict{SimpleEdge, Vector{Int}}`.
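A hedged sketch of usage; the `PC` constructor is assumed here to take a pairwise and a conditional independence test plus a significance level `α`, consistent with the `alg.pairwise_test` field and the `α` threshold mentioned in the methods below:

```
using CausalityTools
# Four variables, integer-indexable as x[i]
x = [rand(200) for _ in 1:4]
alg = PC(CorrTest(), CorrTest(); α = 0.05)  # constructor form is an assumption
g, s = skeleton(alg, x)   # undirected skeleton graph + separating sets
```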

`CausalityTools.skeleton_conditional!`

— Method`skeleton_conditional!(alg::PC, graph, x, conditional_test::IndependenceTest)`

Thin the skeleton `graph`, where each vertex is represented by the data `x[i]`, by using `conditional_test`. Whenever `x[i] ⫫ x[j] | x[S]` for some set of variables `S` not including `i` and `j`, the edge between `i` and `j` is removed, and `S` is stored in the separating set for `i` and `j`. This is essentially algorithm 3.2 in Colombo2014, for the cases `𝓁 >= 1`.

Modifies `graph` in-place.

`CausalityTools.skeleton_unconditional!`

— Method`skeleton_unconditional!(alg::PC, g::SimpleGraph, x)`

Perform pairwise independence tests between all vertices in the graph `g`, where each vertex is represented by the data `x[i]`, and remove any edges in `g` where adjacent vertices are found to be independent according to the given independence test. The null hypothesis of independence is rejected whenever the p-value is below `α`.

This is essentially algorithm 3.2 in Colombo2014, but only considering the case `𝓁 = 0`.

Modifies `g` in-place.

If `alg.pairwise_test` is a directed test, then edges are considered one-by-one. If `alg.pairwise_test` is not a directed test, then edges (`X → Y`, `Y → X`) are considered simultaneously.

`CausalityTools.system`

— Method```
system(definition::DiscreteDefinition) → s::DiscreteDynamicalSystem
system(definition::ContinuousDefinition) → s::ContinuousDynamicalSystem
```

Initialize a dynamical system from `definition`.
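For example, using one of the predefined system definitions from this package:

```
using CausalityTools
# A unidirectionally coupled AR1 system definition (x → y)
def = AR1Unidir(c_xy = 0.5)
sys = system(def)   # → DiscreteDynamicalSystem
```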

`CausalityTools.te_embed`

— Method```
te_embed(source::AbstractVector{T}, target::AbstractVector{T}, p::EmbeddingTE) → (points, vars, τs)
te_embed(source::AbstractVector{T}, target::AbstractVector{T}, cond::AbstractVector{T}, p::EmbeddingTE) → (points, vars, τs)
```

Generalised delay reconstruction of `source` and `target` (and `cond` if provided) for transfer entropy computation, using embedding parameters provided by the `EmbeddingTE` instance `p`.

Returns a tuple of the embedded `points`; `vars`, a `TEVars` instance that keeps track of which variables of the embedding belong to which marginals of the reconstruction (indices are: source = 1, target = 2, cond = 3); and a tuple `τs`, which stores the lags for each variable of the reconstruction.
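A minimal sketch, assuming `EmbeddingTE` accepts embedding dimension/lag keywords such as `dT`, `dS`, `τT`, `τS` (see the `EmbeddingTE` docstring for the authoritative parameter names):

```
using CausalityTools
s, t = rand(500), rand(500)
# Keyword names below are assumptions; consult `EmbeddingTE` for specifics.
p = EmbeddingTE(dT = 2, dS = 2, τT = -1, τS = -1)
points, vars, τs = te_embed(s, t, p)
```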

`CausalityTools.tolevels!`

— Method```
tolevels!(levels, x) → levels, dict
tolevels(x) → levels, dict
```

Apply the bijective map $f : \mathcal{Q} \to \mathbb{N}^+$ to each `x[i]` and store the result in `levels[i]`, where `levels` is a pre-allocated integer vector such that `length(x) == length(levels)`.

$\mathcal{Q}$ can be any space, and each $q \in \mathcal{Q}$ is mapped to a unique integer in the range `1, 2, …, length(unique(x))`. This is useful for integer-encoding categorical data such as strings, or other complex data structures.

The single-argument method allocates a `levels` vector internally.

`dict` gives the inverse mapping.
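For example, integer-encoding a categorical vector (the exact integer assigned to each category is an implementation detail):

```
using CausalityTools
x = ["low", "high", "low", "medium", "high"]
levels, dict = tolevels(x)
# `levels` is a vector of integers in 1:length(unique(x));
# `dict` maps the integers back to the original categories.
```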

`CausalityTools.transferentropy`

— Function```
transferentropy([measure::TEShannon], est, s, t, [c])
transferentropy(measure::TERenyiJizba, est, s, t, [c])
```

Estimate the transfer entropy $TE^*(S \to T)$ or $TE^*(S \to T | C)$ if `c` is given, using the provided estimator `est`, where $*$ indicates the given `measure`. If `measure` is not given, then `TEShannon(; base = 2)` is the default.

**Arguments**

- `measure`: The transfer entropy measure, e.g. `TEShannon` or `TERenyiJizba`, which dictates which formula is computed. Embedding parameters are stored in `measure.embedding`, which is an `EmbeddingTE` instance. If `transferentropy` is called without a `measure`, then the embedding is optimized by finding suitable delay embedding parameters using the "traditional" approach from DynamicalSystems.jl.
- `s`: The source timeseries.
- `t`: The target timeseries.
- `c`: Optional. A conditional timeseries.

**Description**

The Shannon transfer entropy is defined as $TE^S(S \to T | C) := I^S(T^+; S^- | T^-, C^-)$, where $I^S(T^+; S^- | T^-, C^-)$ is `CMIShannon`, and marginals for the CMI are constructed as described in `EmbeddingTE`. The definition is analogous for `TERenyiJizba`.

If `s`, `t`, and `c` are univariate timeseries, then the marginal embedding variables $T^+$ (target future), $T^-$ (target present/past), $S^-$ (source present/past) and $C^-$ (present/past of conditioning variables) are constructed by first jointly embedding `s`, `t` and `c` with relevant delay embedding parameters, then subsetting relevant columns of the embedding.

Since estimates of $TE^*(S \to T)$ and $TE^*(S \to T | C)$ are just special cases of conditional mutual information where the input data are marginals of a particular form of delay embedding, *any* combination of variables, e.g. $S = (A, B)$, $T = (C, D)$, $C = (D, E, F)$, are valid inputs (given as `StateSpaceSet`s). In practice, however, `s`, `t` and `c` are most often timeseries, and if `s`, `t` and `c` are `StateSpaceSet`s, it is assumed that the data are pre-embedded and the embedding step is skipped.

**Compatible estimators**

`transferentropy` is just a simple wrapper around `condmutualinfo` that constructs an appropriate delay embedding from the input data before CMI is estimated. Consequently, any estimator that can be used for `ConditionalMutualInformation` is, in principle, also a valid transfer entropy estimator. The exception is `TransferEntropyEstimator`s, which compute transfer entropy directly.
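**Example**

A minimal sketch of pairwise and conditional estimation; `ValueHistogram` is assumed here to be one compatible probabilities estimator (any `ConditionalMutualInformation`-compatible estimator should work, per the note above):

```
using CausalityTools
x, y, z = rand(1000), rand(1000), rand(1000)
est = ValueHistogram(RectangularBinning(4))   # assumed-compatible estimator
transferentropy(TEShannon(base = 2), est, x, y)     # TE(x → y)
transferentropy(TEShannon(base = 2), est, x, y, z)  # TE(x → y | z)
```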

`CausalityTools.ulam`

— Function`ulam(D::Int = 10; u₀ = rand(D), ε::Real = 0.10) → DiscreteDynamicalSystem`

A lattice of `D` unidirectionally coupled Ulam maps Schreiber2000, defined as

\[x^{m}_{t+1} = f(\epsilon x^{m-1}_{t} + (1 - \epsilon) x_{t}^{m}),\]

where $m = 1, 2, \ldots, D$ and $f(x) = 2 - x^2$. In this system, information transfer happens only in the direction of increasing $m$.

`CausalityTools.update_states!`

— Method`update_states!(s::LaggedDiscreteDefinition, xnew::SVector{D})`

Given `xnew` (the new current state of a system), update the past states of `s`.

`CausalityTools.var1`

— Method`var1(x, p, n) → DiscreteDynamicalSystem`

Initialise a discrete vector autoregressive system where X₁ → X₂ → X₃.

`CausalityTools.verdes`

— Method```
verdes(;u₀ = rand(3), ωy = 315, ωz = 80,
    σx = 0.0, σy = 0.0, σz = 0.0) → DiscreteDynamicalSystem
```

Initialise a 3D system where the response X is a highly nonlinear combination of Y and Z Verdes2005. The forcings Y and Z involve sines and cosines, and have different periods, which are controlled by `ωy` and `ωz`.

The equations of motion are

\[\begin{aligned} x(t+1) &= \dfrac{y(t)(18y(t) - 27y(t)^2 + 10)}{2} + z(t)(1-z(t)) + \eta_x \\ y(t+1) &= \dfrac{1 - \cos\left(\dfrac{2\pi}{\omega_y} t\right)}{2} + \eta_y \\ z(t+1) &= \dfrac{1 - \sin\left(\dfrac{2\pi}{\omega_z} t\right)}{2} + \eta_z \end{aligned}\]

where $\eta_x$, $\eta_y$ and $\eta_z$ are Gaussian noise terms with mean 0 and standard deviations `σx`, `σy` and `σz`.

`DelayEmbeddings.embed`

— Method```
embed(measure::CrossmapMeasure, target::AbstractVector,
    sources::AbstractVector...) → emb::StateSpaceSet{D}
```

Jointly embed the input vector `target`, together with one or more vectors `s ∈ sources`, according to the given `CrossmapMeasure`.

This produces `emb`, a `D`-dimensional `StateSpaceSet` where

- The last column is always the non-lagged `target` variable. Typically, this is the variable we're trying to predict.
- The `D-1` first columns are the (non)lagged versions of each source time series `s ∈ sources`. Typically, `emb[:, 1:D-1]` is the subspace in which neighborhood searches are done, which forms the basis of cross-map `predict`ions.
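A hedged sketch, assuming `ConvergentCrossMapping` (alias `CCM`) is a `CrossmapMeasure` whose embedding dimension and lag are set via `d` and `τ` keywords (keyword names are assumptions; check the measure's docstring):

```
using CausalityTools
t, s = rand(300), rand(300)
measure = CCM(d = 3, τ = -1)   # keyword names assumed
emb = embed(measure, t, s)     # last column: non-lagged `t`
```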

`StatsAPI.pvalue`

— Method`pvalue(test::CorrTest, z, c::Int, n::Int)`

Compute the two-sided p-value for the test of the partial correlation coefficient `p̂` being zero, where `c` is the cardinality of the conditioning set and `n` is the number of samples.
