CrossMappings.ccm - Method
ccm(source,
        target,
        timeseries_lengths;
        kwargs...) -> Vector{Vector{Float64}}

Algorithm

Compute the cross mapping between a source series and a target series over different timeseries_lengths.

Arguments

  • source: The data series representing the putative source process.
  • target: The data series representing the putative target process.
  • timeseries_lengths: Time series length(s) for which to compute the cross mapping(s).

Keyword arguments to crossmap

  • dim: The dimension of the state space reconstruction (delay embedding) constructed from the target series. Default is dim = 3.
  • τ: The embedding lag for the delay embedding constructed from target. Default is τ = 1.
  • η: The prediction lag to use when predicting scalar values of source from the delay embedding of target. η > 0 are forward lags (causal; source's past influences target's future), and η < 0 are backward lags (non-causal; source's future influences target's past). Adjust the prediction lag if you want to perform lagged CCM (Ye et al., 2015). Default is η = 0, as in Sugihara et al. (2012). Note: The sign convention for η follows TransferEntropy.jl and is opposite to the convention used in the rEDM package (Ye et al., 2016).
  • libsize: The number of delay embedding points among which time indices are sampled and nearest neighbours are searched at each cross mapping realization (of which there are n_reps).
  • n_reps: The number of times we draw a library of libsize points from the delay embedding of target and try to predict source values. Equivalently, how many times do we cross map for this value of libsize? Default is n_reps = 100.
  • replace: Sample delay embedding points with replacement? Default is replace = false, matching the crossmap signature.
  • theiler_window: How many temporal neighbors of the delay embedding point target_embedding(t) to exclude when searching for neighbors to determine weights for predicting the scalar point source(t + η). Default is theiler_window = 0.
  • tree_type: The type of tree to build when looking for nearest neighbors. Must be a tree type from NearestNeighbors.jl. For now, this is either BruteTree, KDTree or BallTree. Default is tree_type = KDTree.
  • distance_metric: An instance of a Metric from Distances.jl. BallTree and BruteTree work with any Metric. KDTree only works with the axis-aligned metrics Euclidean, Chebyshev, Minkowski and Cityblock. Default is distance_metric = Euclidean() (note the instantiation of the metric).
  • correspondence_measure: The function that computes the correspondence between actual values of source and predicted values. Can be any function returning a similarity measure between two vectors of values. Default is correspondence_measure = StatsBase.cor, which returns values on $[-1, 1]$. In this case, any negative values are usually filtered out (interpreted as zero coupling) and a value of $1$ means perfect prediction. Sugihara et al. (2012) also propose using the root mean square deviation, for which a value of $0$ would be perfect prediction.
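
To make the roles of dim and τ concrete, here is a minimal sketch of a delay embedding. The helper delay_embed is hypothetical and not part of CrossMappings.jl. The use of forward lags and the column-per-point layout are assumptions of this sketch, and the package may construct its embedding differently.

```julia
# Hypothetical helper, not part of CrossMappings.jl: build a delay embedding
# of `x` with dimension `dim` and lag `τ`. Column t holds the embedding point
# (x[t], x[t + τ], ..., x[t + (dim - 1)τ]). Forward lags are an assumption.
function delay_embed(x::AbstractVector{<:Real}; dim::Int = 3, τ::Int = 1)
    n = length(x) - (dim - 1) * τ          # number of embedding points
    n > 0 || throw(ArgumentError("series too short for dim = $dim, τ = $τ"))
    E = Matrix{Float64}(undef, dim, n)
    for t in 1:n, k in 1:dim
        E[k, t] = x[t + (k - 1) * τ]
    end
    return E
end

x = collect(1.0:10.0)
E = delay_embed(x; dim = 3, τ = 2)   # 3 coordinates per point, 6 points
```

Note that a series of length N yields only N - (dim - 1)τ embedding points, which bounds the library sizes that can be drawn from it.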

References

Sugihara, George, et al. "Detecting causality in complex ecosystems." Science (2012): 1227079. http://science.sciencemag.org/content/early/2012/09/19/science.1227079

Ye, Hao, et al. "Distinguishing time-delayed causal interactions using convergent cross mapping." Scientific Reports 5 (2015): 14750. https://www.nature.com/articles/srep14750

Ye, H., et al. "rEDM: Applications of empirical dynamic modeling from time series." R Package Version 0.4 7 (2016). https://cran.r-project.org/web/packages/rEDM/index.html

CrossMappings.ccm_with_summary - Method
ccm_with_summary(source,
        target,
        timeseries_lengths;
        average_measure::Symbol = :median,
        uncertainty_measure::Symbol = :quantile,
        quantiles = [0.327, 0.673],
        kwargs...)

Algorithm

Compute the cross mapping between a source series and a target series over different timeseries_lengths and return summary statistics of the results.

Arguments

  • source: The data series representing the putative source process.
  • target: The data series representing the putative target process.
  • timeseries_lengths: Time series length(s) for which to compute the cross mapping(s).

Summary keyword arguments

  • average_measure: Either :median or :mean. Default is :median.
  • uncertainty_measure: Either :quantile or :std. Default is :quantile.
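  • quantiles: The quantile(s) used as the uncertainty estimate when uncertainty_measure is :quantile. Default is [0.327, 0.673]. For normally distributed data these quantiles lie about 0.45σ on either side of the median; a 1σ interval would correspond to roughly [0.159, 0.841].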
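
The summary options above can be illustrated with a small sketch that reduces the skills from repeated realizations at one time series length to an average and an uncertainty. The helper summarise_skills and its return shape (a tuple) are assumptions for illustration. The actual return type of ccm_with_summary is not specified here.

```julia
using Statistics

# Hypothetical helper: summarise the cross map skills obtained from repeated
# realizations at a single time series length.
function summarise_skills(skills::AbstractVector{<:Real};
        average_measure::Symbol = :median,
        uncertainty_measure::Symbol = :quantile,
        quantiles = [0.327, 0.673])
    if average_measure == :median
        avg = median(skills)
    elseif average_measure == :mean
        avg = mean(skills)
    else
        throw(ArgumentError("average_measure must be :median or :mean"))
    end
    if uncertainty_measure == :quantile
        unc = quantile(skills, quantiles)   # one value per requested quantile
    elseif uncertainty_measure == :std
        unc = std(skills)                   # a single scalar
    else
        throw(ArgumentError("uncertainty_measure must be :quantile or :std"))
    end
    return avg, unc
end

skills = [0.61, 0.58, 0.65, 0.70, 0.55]
avg, unc = summarise_skills(skills)
avg2, unc2 = summarise_skills(skills; average_measure = :mean, uncertainty_measure = :std)
```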

Keyword arguments to crossmap

  • dim: The dimension of the state space reconstruction (delay embedding) constructed from the target series. Default is dim = 3.
  • τ: The embedding lag for the delay embedding constructed from target. Default is τ = 1.
  • η: The prediction lag to use when predicting scalar values of source from the delay embedding of target. η > 0 are forward lags (causal; source's past influences target's future), and η < 0 are backward lags (non-causal; source's future influences target's past). Adjust the prediction lag if you want to perform lagged CCM (Ye et al., 2015). Default is η = 0, as in Sugihara et al. (2012). Note: The sign convention for η follows TransferEntropy.jl and is opposite to the convention used in the rEDM package (Ye et al., 2016).
  • libsize: The number of delay embedding points among which time indices are sampled and nearest neighbours are searched at each cross mapping realization (of which there are n_reps).
  • n_reps: The number of times we draw a library of libsize points from the delay embedding of target and try to predict source values. Equivalently, how many times do we cross map for this value of libsize? Default is n_reps = 100.
  • replace: Sample delay embedding points with replacement? Default is replace = false, matching the crossmap signature.
  • theiler_window: How many temporal neighbors of the delay embedding point target_embedding(t) to exclude when searching for neighbors to determine weights for predicting the scalar point source(t + η). Default is theiler_window = 0.
  • tree_type: The type of tree to build when looking for nearest neighbors. Must be a tree type from NearestNeighbors.jl. For now, this is either BruteTree, KDTree or BallTree. Default is tree_type = KDTree.
  • distance_metric: An instance of a Metric from Distances.jl. BallTree and BruteTree work with any Metric. KDTree only works with the axis-aligned metrics Euclidean, Chebyshev, Minkowski and Cityblock. Default is distance_metric = Euclidean() (note the instantiation of the metric).
  • correspondence_measure: The function that computes the correspondence between actual values of source and predicted values. Can be any function returning a similarity measure between two vectors of values. Default is correspondence_measure = StatsBase.cor, which returns values on $[-1, 1]$. In this case, any negative values are usually filtered out (interpreted as zero coupling) and a value of $1$ means perfect prediction. Sugihara et al. (2012) also propose using the root mean square deviation, for which a value of $0$ would be perfect prediction.

References

Sugihara, George, et al. "Detecting causality in complex ecosystems." Science (2012): 1227079. http://science.sciencemag.org/content/early/2012/09/19/science.1227079

Ye, Hao, et al. "Distinguishing time-delayed causal interactions using convergent cross mapping." Scientific Reports 5 (2015): 14750. https://www.nature.com/articles/srep14750

Ye, H., et al. "rEDM: Applications of empirical dynamic modeling from time series." R Package Version 0.4 7 (2016). https://cran.r-project.org/web/packages/rEDM/index.html

CrossMappings.convergentcrossmap - Method
convergentcrossmap(source,
        target,
        timeseries_lengths;
        summarise::Bool = true,
        average_measure::Symbol = :median,
        uncertainty_measure::Symbol = :quantile,
        quantiles = [0.327, 0.673],
        kwargs...)

Algorithm

Compute the cross mapping between a source series and a target series over different timeseries_lengths. If summarise = true, then call ccm_with_summary. If summarise = false, then call ccm, which returns the raw cross map skills.

Arguments

  • source: The data series representing the putative source process.
  • target: The data series representing the putative target process.
  • timeseries_lengths: Time series length(s) for which to compute the cross mapping(s).

Summary keyword arguments

  • summarise: Should cross map skills be summarised for each time series length? Default is summarise = true.
  • average_measure: Either :median or :mean. Default is :median.
  • uncertainty_measure: Either :quantile or :std. Default is :quantile.
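  • quantiles: The quantile(s) used as the uncertainty estimate when uncertainty_measure is :quantile. Default is [0.327, 0.673]. For normally distributed data these quantiles lie about 0.45σ on either side of the median; a 1σ interval would correspond to roughly [0.159, 0.841].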

Keyword arguments to crossmap

  • dim: The dimension of the state space reconstruction (delay embedding) constructed from the target series. Default is dim = 3.
  • τ: The embedding lag for the delay embedding constructed from target. Default is τ = 1.
  • η: The prediction lag to use when predicting scalar values of source from the delay embedding of target. η > 0 are forward lags (causal; source's past influences target's future), and η < 0 are backward lags (non-causal; source's future influences target's past). Adjust the prediction lag if you want to perform lagged CCM (Ye et al., 2015). Default is η = 0, as in Sugihara et al. (2012). Note: The sign convention for η follows TransferEntropy.jl and is opposite to the convention used in the rEDM package (Ye et al., 2016).
  • libsize: The number of delay embedding points among which time indices are sampled and nearest neighbours are searched at each cross mapping realization (of which there are n_reps).
  • n_reps: The number of times we draw a library of libsize points from the delay embedding of target and try to predict source values. Equivalently, how many times do we cross map for this value of libsize? Default is n_reps = 100.
  • replace: Sample delay embedding points with replacement? Default is replace = false, matching the crossmap signature.
  • theiler_window: How many temporal neighbors of the delay embedding point target_embedding(t) to exclude when searching for neighbors to determine weights for predicting the scalar point source(t + η). Default is theiler_window = 0.
  • tree_type: The type of tree to build when looking for nearest neighbors. Must be a tree type from NearestNeighbors.jl. For now, this is either BruteTree, KDTree or BallTree. Default is tree_type = KDTree.
  • distance_metric: An instance of a Metric from Distances.jl. BallTree and BruteTree work with any Metric. KDTree only works with the axis-aligned metrics Euclidean, Chebyshev, Minkowski and Cityblock. Default is distance_metric = Euclidean() (note the instantiation of the metric).
  • correspondence_measure: The function that computes the correspondence between actual values of source and predicted values. Can be any function returning a similarity measure between two vectors of values. Default is correspondence_measure = StatsBase.cor, which returns values on $[-1, 1]$. In this case, any negative values are usually filtered out (interpreted as zero coupling) and a value of $1$ means perfect prediction. Sugihara et al. (2012) also propose using the root mean square deviation, for which a value of $0$ would be perfect prediction.
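
The interplay of libsize, n_reps and replace can be sketched as follows. The helper draw_libraries is hypothetical and uses only the Random standard library. How the package itself samples library points is not shown in this documentation, so treat this as one plausible reading.

```julia
using Random

# Hypothetical helper: draw `n_reps` libraries, each holding `libsize` time
# indices taken from the `n_points` available delay embedding points.
function draw_libraries(rng::AbstractRNG, n_points::Int;
        libsize::Int = 10, n_reps::Int = 100, replace::Bool = false)
    if !replace && libsize > n_points
        throw(ArgumentError("cannot draw $libsize distinct indices from $n_points points"))
    end
    # With replacement an index may repeat; without it every index is unique.
    draw() = replace ? rand(rng, 1:n_points, libsize) : randperm(rng, n_points)[1:libsize]
    return [draw() for _ in 1:n_reps]
end

libs = draw_libraries(MersenneTwister(1), 50; libsize = 10, n_reps = 5)
```

Sampling without replacement requires libsize to be no larger than the number of embedding points, which is why the sketch checks that case explicitly.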

References

Sugihara, George, et al. "Detecting causality in complex ecosystems." Science (2012): 1227079. http://science.sciencemag.org/content/early/2012/09/19/science.1227079

Ye, Hao, et al. "Distinguishing time-delayed causal interactions using convergent cross mapping." Scientific Reports 5 (2015): 14750. https://www.nature.com/articles/srep14750

Ye, H., et al. "rEDM: Applications of empirical dynamic modeling from time series." R Package Version 0.4 7 (2016). https://cran.r-project.org/web/packages/rEDM/index.html

CrossMappings.crossmap - Method
crossmap(source, target;
    dim::Int = 3,
    τ::Int = 1,
    libsize::Int = 10,
    replace::Bool = false,
    n_reps::Int = 100,
    theiler_window::Int = 0,
    tree_type = NearestNeighbors.KDTree,
    distance_metric = Distances.Euclidean(),
    correspondence_measure = StatsBase.cor,
    η::Int = 0)

Algorithm

Compute the cross mapping between a source series and a target series.

Arguments

  • source: The data series representing the putative source process.
  • target: The data series representing the putative target process.
  • dim: The dimension of the state space reconstruction (delay embedding) constructed from the target series. Default is dim = 3.
  • τ: The embedding lag for the delay embedding constructed from target. Default is τ = 1.
  • η: The prediction lag to use when predicting scalar values of source from the delay embedding of target. η > 0 are forward lags (causal; source's past influences target's future), and η < 0 are backward lags (non-causal; source's future influences target's past). Adjust the prediction lag if you want to perform lagged CCM (Ye et al., 2015). Default is η = 0, as in Sugihara et al. (2012). Note: The sign convention for η follows TransferEntropy.jl and is opposite to the convention used in the rEDM package (Ye et al., 2016).
  • libsize: The number of delay embedding points among which time indices are sampled and nearest neighbours are searched at each cross mapping realization (of which there are n_reps). Default is libsize = 10.
  • n_reps: The number of times we draw a library of libsize points from the delay embedding of target and try to predict source values. Equivalently, how many times do we cross map for this value of libsize? Default is n_reps = 100.
  • replace: Sample delay embedding points with replacement? Default is replace = false.
  • theiler_window: How many temporal neighbors of the delay embedding point target_embedding(t) to exclude when searching for neighbors to determine weights for predicting the scalar point source(t + η). Default is theiler_window = 0.
  • tree_type: The type of tree to build when looking for nearest neighbors. Must be a tree type from NearestNeighbors.jl. For now, this is either BruteTree, KDTree or BallTree. Default is tree_type = KDTree.
  • distance_metric: An instance of a Metric from Distances.jl. BallTree and BruteTree work with any Metric. KDTree only works with the axis-aligned metrics Euclidean, Chebyshev, Minkowski and Cityblock. Default is distance_metric = Euclidean() (note the instantiation of the metric).
  • correspondence_measure: The function that computes the correspondence between actual values of source and predicted values. Can be any function returning a similarity measure between two vectors of values. Default is correspondence_measure = StatsBase.cor, which returns values on $[-1, 1]$. In this case, any negative values are usually filtered out (interpreted as zero coupling) and a value of $1$ means perfect prediction. Sugihara et al. (2012) also propose using the root mean square deviation, for which a value of $0$ would be perfect prediction.
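
As an illustration of the neighbor search that theiler_window controls, the sketch below finds the nearest neighbours of one embedding point by brute force. The helper nearest_neighbours is hypothetical. It assumes embedding points are stored as matrix columns, uses the Euclidean distance, and reads theiler_window = w as excluding every point within w time steps (so w = 0 excludes only the point itself). The package delegates this search to NearestNeighbors.jl trees, which this sketch does not use.

```julia
# Hypothetical helper: indices and Euclidean distances of the `k` nearest
# neighbours of embedding point `i` (columns of `E`), skipping the point itself
# and every point within `theiler_window` time steps of it.
function nearest_neighbours(E::AbstractMatrix{<:Real}, i::Int, k::Int;
        theiler_window::Int = 0)
    candidates = [j for j in axes(E, 2) if abs(j - i) > theiler_window]
    length(candidates) >= k || throw(ArgumentError("fewer than $k admissible neighbours"))
    dists = [sqrt(sum(abs2, E[:, j] .- E[:, i])) for j in candidates]
    order = partialsortperm(dists, 1:k)   # the k smallest, in increasing order
    return candidates[order], dists[order]
end

# Five one-dimensional embedding points, one per column.
E = reshape([0.0, 1.0, 2.5, 3.0, 10.0], 1, 5)
idxs, ds = nearest_neighbours(E, 2, 2)
idxs_tw, ds_tw = nearest_neighbours(E, 2, 2; theiler_window = 1)
```

With theiler_window = 1 the immediate temporal neighbours of point 2 are no longer admissible, so the search falls back to points that are further away in time.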

References

Sugihara, George, et al. "Detecting causality in complex ecosystems." Science (2012): 1227079. http://science.sciencemag.org/content/early/2012/09/19/science.1227079

Ye, Hao, et al. "Distinguishing time-delayed causal interactions using convergent cross mapping." Scientific Reports 5 (2015): 14750. https://www.nature.com/articles/srep14750

Ye, H., et al. "rEDM: Applications of empirical dynamic modeling from time series." R Package Version 0.4 7 (2016). https://cran.r-project.org/web/packages/rEDM/index.html

CrossMappings.predict_point! - Method
predict_point!(predictions, i, source_values, u, w, dists, dim)

The prediction part of the convergent cross mapping algorithm.

Algorithm

Consider the point with time index i in the delay embedding of target. Denote the time indices of its nearest neighbors $t_1, t_2, \ldots, t_{dim+1}$. Denote the scalar values of source at those time indices $y_1, y_2, \ldots, y_{dim+1}$.

Given distances $d_1, d_2, \ldots, d_{dim+1}$ from the i-th point to its nearest neighbors, we compute the weights w from the cross mapping algorithm (Sugihara et al., 2012; supplementary material, page 4). The weights w and coefficients u are stored in pre-allocated vectors.

A prediction for the observation with time index i in the source timeseries, call it $\hat{y}(i)$, is computed as the sum

\[\hat{y}(i) = \sum_{j=1}^{dim+1} w_j y_j.\]

We store the prediction $\hat{y}(i)$ in position i of the pre-allocated vector predictions.
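
The weighted sum above can be sketched in a few lines. This is a non-mutating, hypothetical analogue of the prediction step, not the package's predict_point! itself. It assumes the exponential weighting of Sugihara et al. (2012), $u_j = \exp(-d_j / d_1)$ and $w_j = u_j / \sum_k u_k$, and it assumes that the smallest distance dists[1] is strictly positive.

```julia
# Hypothetical, non-mutating analogue of the prediction step. `dists` must be
# sorted in increasing order with dists[1] > 0 (an assumption of this sketch).
function predict_from_neighbours(source_values::AbstractVector{<:Real},
        dists::AbstractVector{<:Real})
    length(source_values) == length(dists) ||
        throw(DimensionMismatch("need one source value per neighbour distance"))
    u = exp.(-dists ./ dists[1])   # unnormalised weights u_j = exp(-d_j / d_1)
    w = u ./ sum(u)                # normalised weights, summing to one
    return sum(w .* source_values), w
end

yhat, w = predict_from_neighbours([1.0, 2.0, 3.0, 4.0], [0.5, 1.0, 1.5, 2.0])
yhat_equal, _ = predict_from_neighbours([1.0, 3.0], [2.0, 2.0])
```

Because the weights decay with distance, the closest neighbour contributes most to the prediction; when all distances are equal, the prediction reduces to the plain average of the neighbours' source values.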

Arguments

  • predictions: A pre-allocated vector in which to store the prediction for the scalar value of the source series.
  • i: The time index of the point of the source series being predicted. The prediction is stored in predictions[i].
  • source_values: Let $t_1, t_2, \ldots, t_{dim + 1}$ be the time indices of the nearest neighbors to the delay embedding point with time index i. source_values contains the scalar values of the source series at those time indices.
  • u: A pre-allocated vector of length dim + 1 that holds the normalisation coefficients for computing the weights in the cross mapping algorithm.
  • w: A pre-allocated vector of length dim + 1 that holds the computed weights for the cross mapping algorithm.
  • dists: The distances from the delay embedding point with time index i to its dim + 1 nearest neighbors, in order of increasing distance.
  • dim: The dimension of the delay embedding.

References

Sugihara, George, et al. "Detecting causality in complex ecosystems." Science (2012): 1227079. http://science.sciencemag.org/content/early/2012/09/19/science.1227079