# Non-exported functions

## Symbolization

Entropies.symbolizeFunction

Permutation symbolization

symbolize(x::AbstractVector{T}, est::SymbolicPermutation) where {T} → Vector{Int}
symbolize!(s, x::AbstractVector{T}, est::SymbolicPermutation) where {T} → Vector{Int}

If x is a univariate time series, first x create a delay reconstruction of x using embedding lag est.τ and embedding dimension est.m, then symbolizing the resulting state vectors with encode_motif.

Optionally, the in-place symbolize! can be used to put symbols in a pre-allocated integer vector s, where length(s) == length(x)-(est.m-1)*est.τ.

symbolize(x::AbstractDataset{m, T}, est::SymbolicPermutation) where {m, T} → Vector{Int}
symbolize!(s, x::AbstractDataset{m, T}, est::SymbolicPermutation) where {m, T} → Vector{Int}

If x is an m-dimensional dataset, then motif lengths are determined by the dimension of the input data, and x is symbolized by converting each m-dimensional state vector as a unique integer in the range $1, 2, \ldots, m-1$, using encode_motif.

Optionally, the in-place symbolize! can be used to put symbols in a pre-allocated integer vector s, where length(s) == length(x).

Examples

Symbolize a 7-dimensional dataset. Motif lengths (or order of the permutations) are inferred to be 7.

using DelayEmbeddings, Entropies
D = Dataset([rand(7) for i = 1:1000])
s = symbolize(D, SymbolicPermutation())

Symbolize a univariate time series by first embedding it in dimension 5 with embedding lag 2. Motif lengths (or order of the permutations) are therefore 5.

using DelayEmbeddings, Entropies
n = 5000
x = rand(n)
s = symbolize(x, SymbolicPermutation(m = 5, τ = 2))

The integer vector s now has length n-(m-1)*τ = 4992, and each s[i] contains the integer symbol for the ordinal pattern of state vector x[i].

Gaussian symbolization

symbolize(x::AbstractVector, s::GaussianSymbolization)

Map the elements of x to a symbol time series according to the Gaussian symbolization scheme s.

Examples

julia> x = [0.1, 0.4, 0.7, -2.1, 8.0, 0.9, -5.2];

julia> Entropies.symbolize(x, GaussianSymbolization(5))
7-element Vector{Int64}:
3
3
3
2
5
3
1

See also: GaussianSymbolization.

Entropies.encode_motifFunction
encode_motif(x, m::Int = length(x)) → s::Int

Encode the length-m motif x (a vector of indices that would sort some vector v in ascending order) into its unique integer symbol $s \in \{1, 2, \ldots, m - 1 \}$, using Algorithm 1 in Berger et al. (2019)[Berger2019].

Example

v = rand(5)

# The indices that would sort v in ascending order. This is now a permutation
# of the index permutation (1, 2, ..., 5)
x = sortperm(v)

# Encode this permutation as an integer.
encode_motif(x)
Entropies.encode_as_binFunction
encode_as_bin(point, reference_point, edgelengths) → Vector{Int}

Encode a point into its integer bin labels relative to some reference_point (always counting from lowest to highest magnitudes), given a set of box edgelengths (one for each axis). The first bin on the positive side of the reference point is indexed with 0, and the first bin on the negative side of the reference point is indexed with -1.

Example

using Entropies

refpoint = [0, 0, 0]
steps = [0.2, 0.2, 0.3]
encode_as_bin(rand(3), refpoint, steps)
Entropies.joint_visitsFunction
joint_visits(points, binning_scheme::RectangularBinning) → Vector{Vector{Int}}

Determine which bins are visited by points given the rectangular binning scheme ϵ. Bins are referenced relative to the axis minima, and are encoded as integers, such that each box in the binning is assigned a unique integer array (one element for each dimension).

For example, if a bin is visited three times, then the corresponding integer array will appear three times in the array returned.

Example

using DelayEmbeddings, Entropies

pts = Dataset([rand(5) for i = 1:100]);
joint_visits(pts, RectangularBinning(0.2))
Entropies.marginal_visitsFunction
marginal_visits(points, binning_scheme::RectangularBinning, dims) → Vector{Vector{Int}}

Determine which bins are visited by points given the rectangular binning scheme ϵ, but only along the desired dimensions dims. Bins are referenced relative to the axis minima, and are encoded as integers, such that each box in the binning is assigned a unique integer array (one element for each dimension in dims).

For example, if a bin is visited three times, then the corresponding integer array will appear three times in the array returned.

Example

using DelayEmbeddings, Entropies
pts = Dataset([rand(5) for i = 1:100]);

# Marginal visits along dimension 3 and 5
marginal_visits(pts, RectangularBinning(0.3), [3, 5])

# Marginal visits along dimension 2 through 5
marginal_visits(pts, RectangularBinning(0.3), 2:5)
marginal_visits(joint_visits, dims) → Vector{Vector{Int}}

If joint visits have been precomputed using joint_visits, marginal visits can be returned directly without providing the binning again using the marginal_visits(joint_visits, dims) signature.

Example

using DelayEmbeddings, Entropies
pts = Dataset([rand(5) for i = 1:100]);

# First compute joint visits, then marginal visits along dimensions 1 and 4
jv = joint_visits(pts, RectangularBinning(0.2))
marginal_visits(jv, [1, 4])

# Marginals along dimension 2
marginal_visits(jv, 2)
• Berger2019Berger, Sebastian, et al. "Teaching Ordinal Patterns to a Computer: Efficient Encoding Algorithms Based on the Lehmer Code." Entropy 21.10 (2019): 1023.
• Berger2019Berger, Sebastian, et al. "Teaching Ordinal Patterns to a Computer: Efficient Encoding Algorithms Based on the Lehmer Code." Entropy 21.10 (2019): 1023.