Kernel density

Entropies.NaiveKernelType
NaiveKernel(ϵ::Real, ss = KDTree; w = 0, metric = Euclidean()) <: ProbabilitiesEstimator

Estimate probabilities/entropy using a "naive" kernel density estimation approach (KDE), as discussed in Prichard and Theiler (1995) [PrichardTheiler1995].

Probabilities $P(\mathbf{x}, \epsilon)$ are assigned to every point $\mathbf{x}$ by counting how many other points occupy the space spanned by a hypersphere of radius ϵ around $\mathbf{x}$, according to:

\[P_i( X, \epsilon) \approx \dfrac{1}{N} \sum_{s} B(||X_i - X_j|| < \epsilon),\]

where $B$ gives 1 if the argument is true. Probabilities are then normalized.

The search structure ss is any search structure supported by Neighborhood.jl. Specifically, use KDTree to use a tree-based neighbor search, or BruteForce for the direct distances between all points. KDTrees heavily outperform direct distances when the dimensionality of the data is much smaller than the data length.

The keyword w stands for the Theiler window, and excludes indices $s$ that are within $|i - s| ≤ w$ from the given point $X_i$.

Distance evaluation methods

Missing docstring.

Missing docstring for TreeDistance. Check Documenter's build log for details.

Missing docstring.

Missing docstring for DirectDistance. Check Documenter's build log for details.

Example

Here, we draw some random points from a 2D normal distribution. Then, we use kernel density estimation to associate a probability to each point p, measured by how many points are within radius 1.5 of p. Plotting the actual points, along with their associated probabilities estimated by the KDE procedure, we get the following surface plot.

using Distributions, PyPlot, DelayEmbeddings, Entropies
𝒩 = MvNormal([1, -4], 2)
N = 500
D = Dataset(sort([rand(𝒩) for i = 1:N]))
x, y = columns(D)
p = probabilities(D, NaiveKernel(1.5))
surf(x, y, p.p)
xlabel("x"); ylabel("y")
savefig("kernel_surface.png")

  • PrichardTheiler1995Prichard, D., & Theiler, J. (1995). Generalized redundancies for time series analysis. Physica D: Nonlinear Phenomena, 84(3-4), 476-493.