# Estimators

We split the estimators into two broad categories, which we call Frequentist and Bayesian. We also have a few composite estimators that either take an averaging or resampling approach to estimation.

## Frequentist Estimators

### DiscreteEntropy.maximum_likelihood

```julia
maximum_likelihood(data::CountData)::Float64
```

Compute the maximum likelihood estimation of Shannon entropy of data in nats.

$$\hat{H}_{\tiny{ML}} = - \sum_{i=1}^K p_i \log(p_i)$$

or equivalently

$$\hat{H}_{\tiny{ML}} = \log(N) - \frac{1}{N} \sum_{i=1}^{K} h_i \log(h_i)$$
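Both forms are easy to check numerically. The following is an illustrative Python sketch of the two formulas above (not the package's Julia implementation), working directly from a histogram of counts:

```python
import math

def entropy_ml(counts):
    """Plug-in (maximum likelihood) entropy in nats: -sum p_i log p_i."""
    n = sum(counts)
    return -sum((h / n) * math.log(h / n) for h in counts if h > 0)

def entropy_ml_alt(counts):
    """Equivalent form: log(N) - (1/N) * sum h_i log h_i."""
    n = sum(counts)
    return math.log(n) - sum(h * math.log(h) for h in counts if h > 0) / n
```

The equivalence follows by expanding $\log(h_i/N) = \log(h_i) - \log(N)$ inside the first sum.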
### DiscreteEntropy.jackknife_mle

```julia
jackknife_mle(data::CountData; corrected=false)::Tuple{AbstractFloat, AbstractFloat}
```

Compute the jackknifed maximum_likelihood estimate of data, together with the variance of the jackknife procedure (not the variance of the estimator itself).

If corrected is true, the variance is scaled by n-1; otherwise it is scaled by n.

Estimation of the size of a closed population when capture probabilities vary among animals
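The leave-one-out construction behind this estimator can be sketched in Python (illustrative only, not the package's Julia implementation; the exact variance scaling follows the docstring's corrected flag):

```python
import math

def entropy_ml(counts):
    n = sum(counts)
    return -sum((h / n) * math.log(h / n) for h in counts if h > 0)

def jackknife_mle(counts, corrected=False):
    """Jackknife bias correction of the plug-in entropy.

    Returns (estimate, jackknife variance). All observations in the same
    category give the same leave-one-out value, so we weight by h_i
    instead of looping over all N observations."""
    n = sum(counts)
    h_full = entropy_ml(counts)
    loo, weights = [], []
    for i, h in enumerate(counts):
        if h == 0:
            continue
        reduced = list(counts)
        reduced[i] -= 1
        loo.append(entropy_ml([c for c in reduced if c > 0]))
        weights.append(h)
    mean_loo = sum(w * e for w, e in zip(weights, loo)) / n
    # Standard jackknife bias-corrected estimate: N*H - (N-1)*mean(loo)
    estimate = n * h_full - (n - 1) * mean_loo
    denom = (n - 1) if corrected else n
    var = (n - 1) / denom * sum(w * (e - mean_loo) ** 2
                                for w, e in zip(weights, loo))
    return estimate, var
```

The jackknife pushes the (negatively biased) plug-in estimate upward, so the corrected value exceeds the plain MLE on small samples.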

### DiscreteEntropy.miller_madow

```julia
miller_madow(data::CountData)
```

Compute the Miller-Madow estimate of Shannon entropy, which adds a positive bias-correction term based on the total number of samples seen (N) and the observed support size (K).

$$\hat{H}_{\tiny{MM}} = \hat{H}_{\tiny{ML}} + \frac{K - 1}{2N}$$
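The correction is a one-liner on top of the plug-in estimate. An illustrative Python sketch of the formula (not the package's Julia code):

```python
import math

def miller_madow(counts):
    """Plug-in entropy plus the Miller-Madow bias correction (K-1)/(2N)."""
    counts = [h for h in counts if h > 0]
    n, k = sum(counts), len(counts)
    h_ml = -sum((h / n) * math.log(h / n) for h in counts)
    return h_ml + (k - 1) / (2 * n)
```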
### DiscreteEntropy.schurmann

```julia
schurmann(data::CountData, ξ::Float64 = ℯ^(-1/2))
```

Compute the Schurmann estimate of Shannon entropy of data in nats.

$$\hat{H}_{SHU} = \psi(N) - \frac{1}{N} \sum_{i=1}^{K} \, h_i \left( \psi(h_i) + (-1)^{h_i} \int_0^{\frac{1}{\xi} - 1} \frac{t^{h_i}-1}{1+t}dt \right)$$

There is no one ideal value for $\xi$; however, the paper suggests $\xi = e^{(-1/2)} \approx 0.6$.


### DiscreteEntropy.schurmann_generalised

```julia
schurmann_generalised(data::CountVector, xis::XiVector{T}) where {T<:Real}
```


$$\hat{H}_{\tiny{SHU}} = \psi(N) - \frac{1}{N} \sum_{i=1}^{K} \, h_i \left( \psi(h_i) + (-1)^{h_i} \int_0^{\frac{1}{\xi_i} - 1} \frac{t^{h_i}-1}{1+t}dt \right)$$

Compute the generalised Schurmann entropy estimate, given a count vector data and a xi vector xis, which must both be the same length.

```julia
schurmann_generalised(data::CountVector, xis::Distribution, scalar=false)
```

Compute the generalised Schurmann entropy estimate, given a count vector data and a distribution xis from which the $\xi_i$ values are drawn.

### DiscreteEntropy.chao_shen

```julia
chao_shen(data::CountData)
```

Compute the Chao-Shen estimate of the Shannon entropy of data in nats.

$$\hat{H}_{CS} = - \sum_{i=1}^{K} \frac{\hat{p}_i^{CS} \log \hat{p}_i^{CS}}{1 - (1 - \hat{p}_i^{CS})^N}$$

where

$$\hat{p}_i^{CS} = \hat{C} \, \hat{p}_i^{ML}, \qquad \hat{C} = 1 - \frac{f_1}{N}$$

with $\hat{C}$ the estimated sample coverage and $f_1$ the number of singletons in data.
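The coverage-adjusted, Horvitz-Thompson-weighted construction can be sketched in Python (an illustrative sketch using the standard Chao-Shen coverage estimate $\hat{C} = 1 - f_1/N$, not the package's Julia implementation):

```python
import math

def chao_shen(counts):
    """Chao-Shen entropy: shrink MLE probabilities by estimated coverage,
    then weight each term by its inclusion probability 1 - (1-p)^N."""
    counts = [h for h in counts if h > 0]
    n = sum(counts)
    f1 = sum(1 for h in counts if h == 1)
    if f1 == n:          # all singletons: coverage estimate degenerates
        f1 = n - 1
    c_hat = 1 - f1 / n   # estimated sample coverage
    total = 0.0
    for h in counts:
        p = c_hat * h / n
        total -= p * math.log(p) / (1 - (1 - p) ** n)
    return total
```

With no singletons the coverage is 1 and only the inclusion-probability weighting remains.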
### DiscreteEntropy.zhang

```julia
zhang(data::CountData)
```

Compute the Zhang estimate of the Shannon entropy of data in nats.

The recommended definition of Zhang's estimator is from Grabchak et al.

$$\hat{H}_Z = \sum_{i=1}^K \hat{p}_i \sum_{v=1}^{N - h_i} \frac{1}{v} \prod_{j=0}^{v-1} \left( 1 + \frac{1 - h_i}{N - 1 - j} \right)$$

The actual algorithm comes from *Fast Calculation of Entropy with Zhang's Estimator* by Lozano et al.
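A direct evaluation of the formula above (quadratic in N, before the fast-calculation optimisations of Lozano et al.) can be sketched in Python, for illustration only:

```python
import math

def zhang(counts):
    """Direct O(N^2) evaluation of the Zhang/Grabchak formula."""
    counts = [h for h in counts if h > 0]
    n = sum(counts)
    total = 0.0
    for h in counts:
        p = h / n
        inner = 0.0
        prod = 1.0
        for v in range(1, n - h + 1):
            # Incrementally extend the product up to j = v - 1
            prod *= 1 + (1 - h) / (n - 1 - (v - 1))
            inner += prod / v
        total += p * inner
    return total
```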

Entropy estimation in Turing's perspective

### DiscreteEntropy.bonachela

```julia
bonachela(data::CountData)
```

Compute the Bonachela estimator of the Shannon entropy of data in nats.

$$\hat{H}_{B} = \frac{1}{N+2} \sum_{i=1}^{K} \left( (h_i + 1) \sum_{j=h_i + 2}^{N+2} \frac{1}{j} \right)$$
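The formula is a short double sum over the histogram. An illustrative Python sketch (not the package's Julia implementation):

```python
def bonachela(counts):
    """Bonachela small-sample entropy estimator, direct from the formula."""
    counts = [h for h in counts if h > 0]
    n = sum(counts)
    total = 0.0
    for h in counts:
        # Harmonic tail: j runs from h_i + 2 to N + 2 inclusive
        tail = sum(1.0 / j for j in range(h + 2, n + 3))
        total += (h + 1) * tail
    return total / (n + 2)
```

For two singletons ([1, 1]) each tail is $1/3 + 1/4$, giving $\hat{H}_B = 7/12$.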

Entropy estimates of small data sets

### DiscreteEntropy.shrink

```julia
shrink(data::CountData)
```

Compute the Shrinkage, or James-Stein estimator of Shannon entropy for data in nats.

$$\hat{H}_{\tiny{SHR}} = - \sum_{x=1}^{K} \hat{p}_x^{\tiny{SHR}} \log(\hat{p}_x^{\tiny{SHR}})$$

where

$$\hat{p}_x^{\tiny{SHR}} = \lambda t_x + (1 - \lambda) \hat{p}_x^{\tiny{ML}}$$

and

$$\lambda = \frac{ 1 - \sum_{x=1}^{K} (\hat{p}_x^{\tiny{ML}})^2}{(N-1) \sum_{x=1}^K (t_x - \hat{p}_x^{\tiny{ML}})^2}$$

with

$$t_x = 1 / K$$

Notes

Based on the implementation in the R package entropy

Entropy Inference and the James-Stein Estimator

### DiscreteEntropy.chao_wang_jost

```julia
chao_wang_jost(data::CountData)
```

Compute the Chao Wang Jost Shannon entropy estimate of data in nats.

$$\hat{H}_{\tiny{CWJ}} = \sum_{1 \leq h_i \leq N-1} \frac{h_i}{N} \left(\sum_{k=h_i}^{N-1} \frac{1}{k} \right) + \frac{f_1}{N} (1 - A)^{-N + 1} \left\{ - \log(A) - \sum_{r=1}^{N-1} \frac{1}{r} (1 - A)^r \right\}$$

with

$$A = \begin{cases} \frac{2 f_2}{(N-1) f_1 + 2 f_2} \, & \text{if} \, f_2 > 0 \\ \frac{2}{(N-1)(f_1 - 1) + 2} \, & \text{if} \, f_2 = 0, \; f_1 \neq 0 \\ 1, & \text{if} \, f_1 = f_2 = 0 \end{cases}$$

where $f_1$ is the number of singletons and $f_2$ the number of doubletons in data.
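The two-part structure (an observed-counts term plus a singleton correction) can be sketched directly from the equations above. An illustrative Python version, not the entropart or Julia implementation:

```python
import math

def chao_wang_jost(counts):
    """Chao-Wang-Jost entropy: harmonic-sum term plus singleton correction."""
    counts = [h for h in counts if h > 0]
    n = sum(counts)
    f1 = sum(1 for h in counts if h == 1)
    f2 = sum(1 for h in counts if h == 2)
    # First term: sum over observed counts with 1 <= h_i <= N - 1
    est = sum(h / n * sum(1.0 / k for k in range(h, n))
              for h in counts if 1 <= h <= n - 1)
    # Case analysis for A (f2 = 0 case uses denominator (N-1)(f1-1) + 2)
    if f2 > 0:
        a = 2 * f2 / ((n - 1) * f1 + 2 * f2)
    elif f1 > 0:
        a = 2 / ((n - 1) * (f1 - 1) + 2)
    else:
        a = 1.0
    # Singleton correction vanishes when A = 1 (or when f1 = 0)
    if a < 1:
        tail = -math.log(a) - sum((1 - a) ** r / r for r in range(1, n))
        est += f1 / n * (1 - a) ** (1 - n) * tail
    return est
```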

Notes

The algorithm is a slightly modified port of that used in the entropart R library.

Entropy and the species accumulation curve: a novel entropy estimator via discovery rates of new species

## Bayesian Estimators

### DiscreteEntropy.bayes

```julia
bayes(data::CountData, α::AbstractFloat; K=nothing)
```

Compute an estimate of Shannon entropy given data and a concentration parameter $α$. If K is not provided, then the observed support size in data is used.

$$\hat{H}_{\text{Bayes}} = - \sum_{k=1}^{K} \hat{p}_k^{\text{Bayes}} \; \log \hat{p}_k^{\text{Bayes}}$$

where

$$\hat{p}_k^{\text{Bayes}} = \frac{h_k + α}{N + A}$$

and

$$A = \sum_{x=1}^{K} α_{x}$$

In addition to setting your own α, we have the following suggested choices

1. jeffrey : α = 0.5
2. laplace: α = 1.0
3. schurmann_grassberger: α = 1 / K
4. minimax: α = √N / K
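The Dirichlet-smoothed plug-in behind all of these choices can be sketched in Python (an illustrative sketch of the posterior-mean formula above, not the package's Julia implementation):

```python
import math

def bayes_entropy(counts, alpha, k=None):
    """Plug-in entropy of the posterior-mean probabilities
    p_k = (h_k + alpha) / (N + K * alpha)."""
    counts = [h for h in counts if h > 0]
    if k is None:
        k = len(counts)  # default: observed support size
    n = sum(counts)
    a_total = k * alpha  # A = sum of the K concentration parameters
    probs = [(h + alpha) / (n + a_total) for h in counts]
    # Symbols allowed by K but never observed each get mass alpha
    probs += [alpha / (n + a_total)] * (k - len(counts))
    return -sum(p * math.log(p) for p in probs if p > 0)

# Two of the suggested concentration parameters
def jeffrey(counts):
    return bayes_entropy(counts, 0.5)

def laplace(counts):
    return bayes_entropy(counts, 1.0)
```

With α = 0 the estimator reduces to the plain maximum likelihood estimate.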
### DiscreteEntropy.nsb

```julia
nsb(data, K=data.K)
```

Return the Bayesian estimate of the Shannon entropy of data, using the Nemenman-Shafee-Bialek (NSB) algorithm.

$$\hat{H}^{\text{NSB}} = \frac{ \int_0^{\ln(K)} d\xi \, \rho(\xi \mid \textbf{n}) \langle H^m \rangle_{\beta (\xi)} }{ \int_0^{\ln(K)} d\xi \, \rho(\xi \mid \textbf{n})}$$

where

$$\rho(\xi \mid \textbf{n}) = \mathcal{P}(\beta (\xi)) \frac{ \Gamma(\kappa(\xi))}{\Gamma(N + \kappa(\xi))} \prod_{i=1}^K \frac{\Gamma(n_i + \beta(\xi))}{\Gamma(\beta(\xi))}$$
### DiscreteEntropy.ansb

```julia
ansb(data::CountData; undersampled::Float64=0.1)::Float64
```

Return the Asymptotic NSB estimation of the Shannon entropy of data in nats.

See Asymptotic NSB estimator (equations 11 and 12)

$$\hat{H}_{\tiny{ANSB}} = (C_\gamma - \log(2)) + 2 \log(N) - \psi(\Delta)$$

where $C_\gamma$ is the Euler-Mascheroni constant ($\approx 0.57721$), $\psi$ is the digamma function and $\Delta$ is the number of coincidences in the data.

This estimator is designed for the extremely undersampled regime (K ~ N) and diverges with N when well-sampled. ANSB requires that $N/K \to 0$; by default we require $N/K < 0.1$.


## Mixed Estimators

### DiscreteEntropy.pert

```julia
pert(data::CountData, estimator::Type{T}) where {T<:AbstractEstimator}
pert(data::CountData, e1::Type{T}, e2::Type{T}) where {T<:AbstractEstimator}
```

A PERT estimate of entropy, a weighted three-point combination

$$\hat{H}_{\tiny{PERT}} = \frac{a + 4b + c}{6}$$

where

1. a = best-case estimate
2. b = most likely estimate
3. c = worst-case estimate

The default estimators are a = maximum_likelihood (best case), b = chao_shen (most likely) and c = ansb (worst case).
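The combination step itself is tiny; an illustrative Python sketch (with the plug-in estimator standing in for all three component estimators, which are passed as plain functions):

```python
import math

def entropy_ml(counts):
    n = sum(counts)
    return -sum((h / n) * math.log(h / n) for h in counts if h > 0)

def pert(counts, best, likely, worst):
    """PERT three-point combination of three entropy estimators."""
    a, b, c = best(counts), likely(counts), worst(counts)
    return (a + 4 * b + c) / 6
```

When all three component estimators agree, the combination returns their common value unchanged.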

### DiscreteEntropy.jackknife

```julia
jackknife(data::CountData, estimator::Type{T}; corrected=false) where {T<:AbstractEstimator}
```

Compute the jackknifed estimate of estimator on data.

### DiscreteEntropy.bayesian_bootstrap

```julia
bayesian_bootstrap(samples::SampleVector, estimator::Type{T}, reps, seed, concentration) where {T<:AbstractEstimator}
```

Compute a Bayesian bootstrap resampling of samples for estimation with estimator, where reps is the number of resamplings to perform, seed is the random seed and concentration is the concentration parameter of the Dirichlet distribution.

Note