# Non-exported functions

## Symbolization

`Entropies.symbolize`

— Function

**Permutation symbolization**

```
symbolize(x::AbstractVector{T}, est::SymbolicPermutation) where {T} → Vector{Int}
symbolize!(s, x::AbstractVector{T}, est::SymbolicPermutation) where {T} → Vector{Int}
```

If `x` is a univariate time series, first create a delay reconstruction of `x` using embedding lag `est.τ` and embedding dimension `est.m`, then symbolize the resulting state vectors with `encode_motif`.

Optionally, the in-place `symbolize!` can be used to put symbols in a pre-allocated integer vector `s`, where `length(s) == length(x)-(est.m-1)*est.τ`.
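The length relation above can be checked with a minimal sketch that builds the delay vectors by hand in plain Julia. The helper `delay_vectors` is hypothetical (it is not part of Entropies.jl), and the forward-lag layout `(x[i], x[i+τ], …)` is an assumption made for illustration.

```julia
# Hypothetical helper, not the library's implementation: collect the
# delay vectors (x[i], x[i+τ], ..., x[i+(m-1)τ]) of a time series.
function delay_vectors(x::AbstractVector, m::Int, τ::Int)
    npts = length(x) - (m - 1) * τ   # number of complete state vectors
    return [[x[i + k * τ] for k in 0:m-1] for i in 1:npts]
end

x = rand(5000)
m, τ = 5, 2
vs = delay_vectors(x, m, τ)

# One ordinal pattern (sorting permutation) per state vector.
motifs = [sortperm(v) for v in vs]

length(vs)   # 4992, i.e. length(x) - (m - 1) * τ
```

Each element of `motifs` is a permutation of `1:m`; it is this kind of permutation that gets turned into an integer symbol.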

```
symbolize(x::AbstractDataset{m, T}, est::SymbolicPermutation) where {m, T} → Vector{Int}
symbolize!(s, x::AbstractDataset{m, T}, est::SymbolicPermutation) where {m, T} → Vector{Int}
```

If `x` is an `m`-dimensional dataset, then motif lengths are determined by the dimension of the input data, and `x` is symbolized by converting each `m`-dimensional state vector to a unique integer symbol (one for each of the $m!$ possible ordinal patterns), using `encode_motif`.

Optionally, the in-place `symbolize!` can be used to put symbols in a pre-allocated integer vector `s`, where `length(s) == length(x)`.

**Examples**

Symbolize a 7-dimensional dataset. Motif lengths (or order of the permutations) are inferred to be 7.

```
using DelayEmbeddings, Entropies
D = Dataset([rand(7) for i = 1:1000])
# `symbolize` is listed as non-exported, so qualify it with the module name.
s = Entropies.symbolize(D, SymbolicPermutation())
```

Symbolize a univariate time series by first embedding it in dimension 5 with embedding lag 2. Motif lengths (or order of the permutations) are therefore 5.

```
using DelayEmbeddings, Entropies
n = 5000
x = rand(n)
s = Entropies.symbolize(x, SymbolicPermutation(m = 5, τ = 2))
```

The integer vector `s` now has length `n-(m-1)*τ = 4992`, and each `s[i]` contains the integer symbol for the ordinal pattern of the `i`-th state vector in the delay reconstruction of `x`.

**Gaussian symbolization**

`symbolize(x::AbstractVector, s::GaussianSymbolization)`

Map the elements of `x` to a symbol time series according to the Gaussian symbolization scheme `s`.

**Examples**

```
julia> x = [0.1, 0.4, 0.7, -2.1, 8.0, 0.9, -5.2];
julia> Entropies.symbolize(x, GaussianSymbolization(5))
7-element Vector{Int64}:
3
3
3
2
5
3
1
```

See also: `GaussianSymbolization`.

`Entropies.encode_motif`

— Function

`encode_motif(x, m::Int = length(x)) → s::Int`

Encode the length-`m` motif `x` (a vector of indices that would sort some vector `v` in ascending order) into its unique integer symbol $s$ (one of $m!$ possible values, one per permutation), using Algorithm 1 in Berger et al. (2019) [Berger2019].

**Example**

```
using Entropies
v = rand(5)
# The indices that would sort `v` in ascending order. This is a permutation
# of the indices (1, 2, ..., 5).
x = sortperm(v)
# Encode this permutation as an integer (`encode_motif` is not exported).
Entropies.encode_motif(x)
```
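To illustrate the idea behind a Lehmer-code encoding, here is a minimal sketch in plain Julia. The function `lehmer_rank` is hypothetical and is not the library's `encode_motif`; it numbers the permutations from `0` to `m! - 1`, which is an assumption of this sketch and may differ from the library's convention.

```julia
# Hypothetical helper: rank a permutation via its Lehmer code.
function lehmer_rank(p::AbstractVector{<:Integer})
    m = length(p)
    rank = 0
    for i in 1:m-1
        # i-th Lehmer digit: how many later entries are smaller than p[i].
        digit = count(j -> p[j] < p[i], i+1:m)
        # Weight the digit by the number of arrangements of the remaining entries.
        rank += digit * factorial(m - i)
    end
    return rank
end

# All six permutations of (1, 2, 3), in lexicographic order.
perms = [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
ranks = [lehmer_rank(p) for p in perms]   # [0, 1, 2, 3, 4, 5]
```

Every permutation receives a distinct integer, so a length-`m` motif maps to one of `m!` symbols.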

## Binning-related

`Entropies.encode_as_bin`

— Function

`encode_as_bin(point, reference_point, edgelengths) → Vector{Int}`

Encode a point into its integer bin labels relative to some `reference_point` (always counting from lowest to highest magnitudes), given a set of box `edgelengths` (one for each axis). The first bin on the positive side of the reference point is indexed with 0, and the first bin on the negative side of the reference point is indexed with -1.

See also: `joint_visits`, `marginal_visits`.

**Example**

```
using Entropies
refpoint = [0, 0, 0]
steps = [0.2, 0.2, 0.3]
Entropies.encode_as_bin(rand(3), refpoint, steps)
```
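The indexing convention described above (0 for the first bin on the positive side, -1 for the first bin on the negative side) is what flooring the scaled offset from the reference point produces. The following is a minimal sketch in plain Julia; `bin_labels` is a hypothetical helper and not necessarily how `encode_as_bin` is implemented.

```julia
# Hypothetical helper: integer bin label along each axis, obtained by
# flooring the offset from the reference point in units of the edge length.
bin_labels(point, reference_point, edgelengths) =
    floor.(Int, (point .- reference_point) ./ edgelengths)

refpoint = [0.0, 0.0, 0.0]
steps = [1.0, 0.5, 0.5]
labels = bin_labels([0.5, -0.25, 1.25], refpoint, steps)   # [0, -1, 2]
```

The second coordinate lies just below the reference point and therefore gets label -1, while the first lies just above it and gets label 0.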

`Entropies.joint_visits`

— Function

`joint_visits(points, binning_scheme::RectangularBinning) → Vector{Vector{Int}}`

Determine which bins are visited by `points` given the rectangular binning scheme `binning_scheme`. Bins are referenced relative to the axis minima, and are encoded as integers, such that each box in the binning is assigned a unique integer array (one element for each dimension).

For example, if a bin is visited three times, then the corresponding integer array will appear three times in the array returned.

See also: `marginal_visits`, `encode_as_bin`.

**Example**

```
using DelayEmbeddings, Entropies
pts = Dataset([rand(5) for i = 1:100]);
Entropies.joint_visits(pts, RectangularBinning(0.2))
```
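As a rough illustration of what "referenced relative to the axis minima" means, the sketch below bins a few 2-dimensional points by hand in plain Julia. It assumes a fixed edge length `ϵ` along every axis and ignores any edge handling the library may apply; all names are hypothetical.

```julia
# Four 2-dimensional points and a common box edge length (illustrative values).
pts = [[0.0, 0.0], [0.25, 0.75], [0.3, 0.8], [1.0, 0.5]]
ϵ = 0.5

# Axis minima serve as the reference point for the bin labels.
mins = [minimum(p[d] for p in pts) for d in 1:2]

# One integer array per point; a bin visited twice appears twice.
visits = [floor.(Int, (p .- mins) ./ ϵ) for p in pts]
# [[0, 0], [0, 1], [0, 1], [2, 1]]
```

The second and third points fall into the same box, so the label `[0, 1]` occurs twice in the result.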

`Entropies.marginal_visits`

— Function

`marginal_visits(points, binning_scheme::RectangularBinning, dims) → Vector{Vector{Int}}`

Determine which bins are visited by `points` given the rectangular binning scheme `binning_scheme`, but only along the desired dimensions `dims`. Bins are referenced relative to the axis minima, and are encoded as integers, such that each box in the binning is assigned a unique integer array (one element for each dimension in `dims`).

For example, if a bin is visited three times, then the corresponding integer array will appear three times in the array returned.

See also: `joint_visits`, `encode_as_bin`.

**Example**

```
using DelayEmbeddings, Entropies
pts = Dataset([rand(5) for i = 1:100]);
# Marginal visits along dimensions 3 and 5
Entropies.marginal_visits(pts, RectangularBinning(0.3), [3, 5])
# Marginal visits along dimensions 2 through 5
Entropies.marginal_visits(pts, RectangularBinning(0.3), 2:5)
```

`marginal_visits(joint_visits, dims) → Vector{Vector{Int}}`

If joint visits have been precomputed using `joint_visits`, marginal visits can be returned directly without providing the binning again using the `marginal_visits(joint_visits, dims)` signature.

See also: `joint_visits`, `encode_as_bin`.

**Example**

```
using DelayEmbeddings, Entropies
pts = Dataset([rand(5) for i = 1:100]);
# First compute joint visits, then marginal visits along dimensions 1 and 4
jv = Entropies.joint_visits(pts, RectangularBinning(0.2))
Entropies.marginal_visits(jv, [1, 4])
# Marginals along dimension 2
Entropies.marginal_visits(jv, 2)
```
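Conceptually, going from joint to marginal visits amounts to keeping only the requested coordinates of each bin label. A minimal sketch in plain Julia, with made-up bin labels and a hypothetical helper `marginal` that is not the library's implementation:

```julia
# Made-up joint bin labels for four 3-dimensional points.
jv = [[0, 0, 3], [0, 1, 2], [0, 1, 2], [2, 1, 0]]

# Hypothetical helper: keep only the coordinates listed in `dims`.
marginal(visits, dims) = [v[dims] for v in visits]

m13 = marginal(jv, [1, 3])   # [[0, 3], [0, 2], [0, 2], [2, 0]]
m23 = marginal(jv, 2:3)      # [[0, 3], [1, 2], [1, 2], [1, 0]]
```

The number of entries is unchanged, so repeated visits to the same marginal bin still show up as repeated labels.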

- [Berger2019] Berger, Sebastian, et al. "Teaching Ordinal Patterns to a Computer: Efficient Encoding Algorithms Based on the Lehmer Code." Entropy 21.10 (2019): 1023.