# Estimators

We split the estimators into two broad categories, which we call Frequentist and Bayesian. We also have a few composite estimators that either take an averaging or resampling approach to estimation.

## Frequentist Estimators

### DiscreteEntropy.maximum_likelihood

```julia
maximum_likelihood(data::CountData)::Float64
```

Compute the maximum likelihood estimation of Shannon entropy of data in nats.

$$\hat{H}_{\tiny{ML}} = - \sum_{i=1}^K p_i \log(p_i)$$

or equivalently

$$\hat{H}_{\tiny{ML}} = \log(N) - \frac{1}{N} \sum_{i=1}^{K} h_i \log(h_i)$$
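Both forms are easy to check numerically. The following is an illustrative Python sketch of the two formulas above (not the package's Julia implementation), working directly from a histogram of counts:

```python
import math

def entropy_ml(counts):
    """Plug-in (maximum likelihood) entropy in nats: -sum p_i log p_i."""
    n = sum(counts)
    return -sum((h / n) * math.log(h / n) for h in counts if h > 0)

def entropy_ml_alt(counts):
    """Equivalent form: log(N) - (1/N) * sum h_i log h_i."""
    n = sum(counts)
    return math.log(n) - sum(h * math.log(h) for h in counts if h > 0) / n
```

The equivalence follows by expanding $\log(h_i/N) = \log(h_i) - \log(N)$ inside the first sum.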
### DiscreteEntropy.jackknife_mle

```julia
jackknife_mle(data::CountData; corrected=false)::Tuple{AbstractFloat, AbstractFloat}
```

Compute the jackknifed maximum_likelihood estimate of data, together with the variance of the jackknife procedure (not the variance of the estimator itself).

If corrected is true, the variance is scaled by n-1; otherwise it is scaled by n.

Estimation of the size of a closed population when capture probabilities vary among animals
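The leave-one-out construction behind this estimator can be sketched in Python (illustrative only, not the package's Julia implementation; the exact variance scaling follows the docstring's corrected flag):

```python
import math

def entropy_ml(counts):
    n = sum(counts)
    return -sum((h / n) * math.log(h / n) for h in counts if h > 0)

def jackknife_mle(counts, corrected=False):
    """Jackknife bias correction of the plug-in entropy.

    Returns (estimate, jackknife variance). All observations in the same
    category give the same leave-one-out value, so we weight by h_i
    instead of looping over all N observations."""
    n = sum(counts)
    h_full = entropy_ml(counts)
    loo, weights = [], []
    for i, h in enumerate(counts):
        if h == 0:
            continue
        reduced = list(counts)
        reduced[i] -= 1
        loo.append(entropy_ml([c for c in reduced if c > 0]))
        weights.append(h)
    mean_loo = sum(w * e for w, e in zip(weights, loo)) / n
    # Standard jackknife bias-corrected estimate: N*H - (N-1)*mean(loo)
    estimate = n * h_full - (n - 1) * mean_loo
    denom = (n - 1) if corrected else n
    var = (n - 1) / denom * sum(w * (e - mean_loo) ** 2
                                for w, e in zip(weights, loo))
    return estimate, var
```

The jackknife pushes the (negatively biased) plug-in estimate upward, so the corrected value exceeds the plain MLE on small samples.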

### DiscreteEntropy.miller_madow

```julia
miller_madow(data::CountData)
```

Compute the Miller-Madow estimate of Shannon entropy, which adds a positive bias-correction term based on the total number of samples seen (N) and the observed support size (K).

$$\hat{H}_{\tiny{MM}} = \hat{H}_{\tiny{ML}} + \frac{K - 1}{2N}$$
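The correction is a one-liner on top of the plug-in estimate. An illustrative Python sketch of the formula (not the package's Julia code):

```python
import math

def miller_madow(counts):
    """Plug-in entropy plus the Miller-Madow bias correction (K-1)/(2N)."""
    counts = [h for h in counts if h > 0]
    n, k = sum(counts), len(counts)
    h_ml = -sum((h / n) * math.log(h / n) for h in counts)
    return h_ml + (k - 1) / (2 * n)
```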
### DiscreteEntropy.schurmann

```julia
schurmann(data::CountData, ξ::Float64 = ℯ^(-1/2))
```

Compute the Schurmann estimate of Shannon entropy of data in nats.

$$\hat{H}_{SHU} = \psi(N) - \frac{1}{N} \sum_{i=1}^{K} \, h_i \left( \psi(h_i) + (-1)^{h_i} \int_0^{\frac{1}{\xi} - 1} \frac{t^{h_i}-1}{1+t}dt \right)$$

There is no one ideal value for $\xi$; however, the paper suggests $\xi = e^{(-1/2)} \approx 0.6$.


### DiscreteEntropy.schurmann_generalised

```julia
schurmann_generalised(data::CountVector, xis::XiVector{T}) where {T<:Real}
```


$$\hat{H}_{\tiny{SHU}} = \psi(N) - \frac{1}{N} \sum_{i=1}^{K} \, h_i \left( \psi(h_i) + (-1)^{h_i} \int_0^{\frac{1}{\xi_i} - 1} \frac{t^{h_i}-1}{1+t}dt \right)$$

Compute the generalised Schurmann entropy estimate, given a count vector data and a xi vector xis, which must both be the same length.

```julia
schurmann_generalised(data::CountVector, xis::Distribution, scalar=false)
```

Compute the generalised Schurmann entropy estimate, given a count vector data and a distribution xis from which the $\xi_i$ values are drawn.

### DiscreteEntropy.chao_shen

```julia
chao_shen(data::CountData)
```

Compute the Chao-Shen estimate of the Shannon entropy of data in nats.

$$\hat{H}_{CS} = - \sum_{i=1}^{K} \frac{\hat{p}_i^{CS} \log \hat{p}_i^{CS}}{1 - (1 - \hat{p}_i^{CS})^N}$$

where

$$\hat{p}_i^{CS} = \hat{C} \, \hat{p}_i^{ML}, \qquad \hat{C} = 1 - \frac{f_1}{N}$$

with $\hat{C}$ the estimated sample coverage and $f_1$ the number of singletons in data.
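The coverage-adjusted, Horvitz-Thompson-weighted construction can be sketched in Python (an illustrative sketch using the standard Chao-Shen coverage estimate $\hat{C} = 1 - f_1/N$, not the package's Julia implementation):

```python
import math

def chao_shen(counts):
    """Chao-Shen entropy: shrink MLE probabilities by estimated coverage,
    then weight each term by its inclusion probability 1 - (1-p)^N."""
    counts = [h for h in counts if h > 0]
    n = sum(counts)
    f1 = sum(1 for h in counts if h == 1)
    if f1 == n:          # all singletons: coverage estimate degenerates
        f1 = n - 1
    c_hat = 1 - f1 / n   # estimated sample coverage
    total = 0.0
    for h in counts:
        p = c_hat * h / n
        total -= p * math.log(p) / (1 - (1 - p) ** n)
    return total
```

With no singletons the coverage is 1 and only the inclusion-probability weighting remains.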
### DiscreteEntropy.zhang

```julia
zhang(data::CountData)
```

Compute the Zhang estimate of the Shannon entropy of data in nats.

The recommended definition of Zhang's estimator is from Grabchak et al.

$$\hat{H}_Z = \sum_{i=1}^K \hat{p}_i \sum_{v=1}^{N - h_i} \frac{1}{v} \prod_{j=0}^{v-1} \left( 1 + \frac{1 - h_i}{N - 1 - j} \right)$$

The actual algorithm comes from *Fast Calculation of Entropy with Zhang's Estimator* by Lozano et al.
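A direct evaluation of the formula above (quadratic in N, before the fast-calculation optimisations of Lozano et al.) can be sketched in Python, for illustration only:

```python
import math

def zhang(counts):
    """Direct O(N^2) evaluation of the Zhang/Grabchak formula."""
    counts = [h for h in counts if h > 0]
    n = sum(counts)
    total = 0.0
    for h in counts:
        p = h / n
        inner = 0.0
        prod = 1.0
        for v in range(1, n - h + 1):
            # Incrementally extend the product up to j = v - 1
            prod *= 1 + (1 - h) / (n - 1 - (v - 1))
            inner += prod / v
        total += p * inner
    return total
```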

Entropy estimation in Turing's perspective

### DiscreteEntropy.bonachela

```julia
bonachela(data::CountData)
```

Compute the Bonachela estimator of the Shannon entropy of data in nats.

$$\hat{H}_{B} = \frac{1}{N+2} \sum_{i=1}^{K} \left( (h_i + 1) \sum_{j=h_i + 2}^{N+2} \frac{1}{j} \right)$$
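The formula is a short double sum over the histogram. An illustrative Python sketch (not the package's Julia implementation):

```python
def bonachela(counts):
    """Bonachela small-sample entropy estimator, direct from the formula."""
    counts = [h for h in counts if h > 0]
    n = sum(counts)
    total = 0.0
    for h in counts:
        # Harmonic tail: j runs from h_i + 2 to N + 2 inclusive
        tail = sum(1.0 / j for j in range(h + 2, n + 3))
        total += (h + 1) * tail
    return total / (n + 2)
```

For two singletons ([1, 1]) each tail is $1/3 + 1/4$, giving $\hat{H}_B = 7/12$.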

Entropy estimates of small data sets

### DiscreteEntropy.shrink

```julia
shrink(data::CountData)
```

Compute the Shrinkage, or James-Stein estimator of Shannon entropy for data in nats.

$$\hat{H}_{\tiny{SHR}} = - \sum_{x=1}^{K} \hat{p}_x^{\tiny{SHR}} \log(\hat{p}_x^{\tiny{SHR}})$$

where

$$\hat{p}_x^{\tiny{SHR}} = \lambda t_x + (1 - \lambda) \hat{p}_x^{\tiny{ML}}$$

and

$$\lambda = \frac{ 1 - \sum_{x=1}^{K} (\hat{p}_x^{\tiny{ML}})^2}{(N-1) \sum_{x=1}^K (t_x - \hat{p}_x^{\tiny{ML}})^2}$$

with

$$t_x = 1 / K$$

Notes

Based on the implementation in the R package entropy

Entropy Inference and the James-Stein Estimator

### DiscreteEntropy.chao_wang_jost

```julia
chao_wang_jost(data::CountData)
```

Compute the Chao Wang Jost Shannon entropy estimate of data in nats.

$$\hat{H}_{\tiny{CWJ}} = \sum_{1 \leq h_i \leq N-1} \frac{h_i}{N} \left(\sum_{k=h_i}^{N-1} \frac{1}{k} \right) + \frac{f_1}{N} (1 - A)^{-N + 1} \left\{ - \log(A) - \sum_{r=1}^{N-1} \frac{1}{r} (1 - A)^r \right\}$$

with

$$A = \begin{cases} \frac{2 f_2}{(N-1) f_1 + 2 f_2} \, & \text{if} \, f_2 > 0 \\ \frac{2}{(N-1)(f_1 - 1) + 2} \, & \text{if} \, f_2 = 0, \; f_1 \neq 0 \\ 1, & \text{if} \, f_1 = f_2 = 0 \end{cases}$$

where $f_1$ is the number of singletons and $f_2$ the number of doubletons in data.
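The two-part structure (an observed-counts term plus a singleton correction) can be sketched directly from the equations above. An illustrative Python version, not the entropart or Julia implementation:

```python
import math

def chao_wang_jost(counts):
    """Chao-Wang-Jost entropy: harmonic-sum term plus singleton correction."""
    counts = [h for h in counts if h > 0]
    n = sum(counts)
    f1 = sum(1 for h in counts if h == 1)
    f2 = sum(1 for h in counts if h == 2)
    # First term: sum over observed counts with 1 <= h_i <= N - 1
    est = sum(h / n * sum(1.0 / k for k in range(h, n))
              for h in counts if 1 <= h <= n - 1)
    # Case analysis for A (f2 = 0 case uses denominator (N-1)(f1-1) + 2)
    if f2 > 0:
        a = 2 * f2 / ((n - 1) * f1 + 2 * f2)
    elif f1 > 0:
        a = 2 / ((n - 1) * (f1 - 1) + 2)
    else:
        a = 1.0
    # Singleton correction vanishes when A = 1 (or when f1 = 0)
    if a < 1:
        tail = -math.log(a) - sum((1 - a) ** r / r for r in range(1, n))
        est += f1 / n * (1 - a) ** (1 - n) * tail
    return est
```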

Notes

The algorithm is a slightly modified port of that used in the entropart R library.

Entropy and the species accumulation curve: a novel entropy estimator via discovery rates of new species

## Bayesian Estimators

### DiscreteEntropy.bayes

```julia
bayes(data::CountData, α::AbstractFloat; K=nothing)
```

Compute an estimate of Shannon entropy given data and a concentration parameter $α$. If K is not provided, then the observed support size in data is used.

$$\hat{H}_{\text{Bayes}} = - \sum_{k=1}^{K} \hat{p}_k^{\text{Bayes}} \; \log \hat{p}_k^{\text{Bayes}}$$

where

$$\hat{p}_k^{\text{Bayes}} = \frac{h_k + α}{N + A}$$

and

$$A = \sum_{x=1}^{K} α_{x}$$

In addition to setting your own α, we have the following suggested choices

1. jeffrey : α = 0.5
2. laplace: α = 1.0
3. schurmann_grassberger: α = 1 / K
4. minimax: α = √N / K
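The Dirichlet-smoothed plug-in behind all of these choices can be sketched in Python (an illustrative sketch of the posterior-mean formula above, not the package's Julia implementation):

```python
import math

def bayes_entropy(counts, alpha, k=None):
    """Plug-in entropy of the posterior-mean probabilities
    p_k = (h_k + alpha) / (N + K * alpha)."""
    counts = [h for h in counts if h > 0]
    if k is None:
        k = len(counts)  # default: observed support size
    n = sum(counts)
    a_total = k * alpha  # A = sum of the K concentration parameters
    probs = [(h + alpha) / (n + a_total) for h in counts]
    # Symbols allowed by K but never observed each get mass alpha
    probs += [alpha / (n + a_total)] * (k - len(counts))
    return -sum(p * math.log(p) for p in probs if p > 0)

# Two of the suggested concentration parameters
def jeffrey(counts):
    return bayes_entropy(counts, 0.5)

def laplace(counts):
    return bayes_entropy(counts, 1.0)
```

With α = 0 the estimator reduces to the plain maximum likelihood estimate.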
### DiscreteEntropy.nsb

```julia
nsb(data, K=data.K)
```

Return the Bayesian estimate of the Shannon entropy of data, using the Nemenman-Shafee-Bialek (NSB) algorithm.

$$\hat{H}^{\text{NSB}} = \frac{ \int_0^{\ln(K)} d\xi \, \rho(\xi \mid \textbf{n}) \langle H^m \rangle_{\beta (\xi)} }{ \int_0^{\ln(K)} d\xi \, \rho(\xi \mid \textbf{n})}$$

where

$$\rho(\xi \mid \textbf{n}) = \mathcal{P}(\beta (\xi)) \frac{ \Gamma(\kappa(\xi))}{\Gamma(N + \kappa(\xi))} \prod_{i=1}^K \frac{\Gamma(n_i + \beta(\xi))}{\Gamma(\beta(\xi))}$$
### DiscreteEntropy.ansb

```julia
ansb(data::CountData; undersampled::Float64=0.1)::Float64
```

Return the Asymptotic NSB estimation of the Shannon entropy of data in nats.

See Asymptotic NSB estimator (equations 11 and 12)

$$\hat{H}_{\tiny{ANSB}} = (C_\gamma - \log(2)) + 2 \log(N) - \psi(\Delta)$$

where $C_\gamma$ is the Euler-Mascheroni constant ($\approx 0.57721$), $\psi$ is the digamma function and $\Delta$ is the number of coincidences in the data.

This estimator is designed for the extremely undersampled regime (K ~ N) and diverges with N when well-sampled. ANSB requires that $N/K \to 0$; by default we require $N/K < 0.1$.


## Mixed Estimators

### DiscreteEntropy.pert

```julia
pert(data::CountData, estimator::Type{T}) where {T<:AbstractEstimator}
pert(data::CountData, e1::Type{T}, e2::Type{T}) where {T<:AbstractEstimator}
```

A PERT estimate of entropy, a weighted three-point combination

$$\hat{H}_{\tiny{PERT}} = \frac{a + 4b + c}{6}$$

where

1. a = best-case estimate
2. b = most likely estimate
3. c = worst-case estimate

The default estimators are a = maximum_likelihood (best case), b = chao_shen (most likely) and c = ansb (worst case).
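The combination step itself is tiny; an illustrative Python sketch (with the plug-in estimator standing in for all three component estimators, which are passed as plain functions):

```python
import math

def entropy_ml(counts):
    n = sum(counts)
    return -sum((h / n) * math.log(h / n) for h in counts if h > 0)

def pert(counts, best, likely, worst):
    """PERT three-point combination of three entropy estimators."""
    a, b, c = best(counts), likely(counts), worst(counts)
    return (a + 4 * b + c) / 6
```

When all three component estimators agree, the combination returns their common value unchanged.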

### DiscreteEntropy.jackknife

```julia
jackknife(data::CountData, estimator::Type{T}; corrected=false) where {T<:AbstractEstimator}
```

Compute the jackknifed estimate of estimator on data.

### DiscreteEntropy.bayesian_bootstrap

```julia
bayesian_bootstrap(samples::SampleVector, estimator::Type{T}, reps, seed, concentration) where {T<:AbstractEstimator}
```

Compute a Bayesian bootstrap resampling of samples for estimation with estimator, where reps is the number of resamplings to perform, seed is the random seed and concentration is the concentration parameter of the Dirichlet distribution.

Note