# Public Documentation

Documentation for CovarianceEstimation.jl's public interface.

See Internal Documentation for internal package docs.

## Public Interface

Statistics.covFunction
cov(lse::LinearShrinkage, X, [weights::FrequencyWeights]; dims=1, mean=nothing)

Linear shrinkage covariance estimator for matrix X along dimension dims. Computed using the method described by lse.

Optionally provide weights associated with each observation in X (see StatsBase.FrequencyWeights).

Note

Theoretical guidance for the use of weights in shrinkage estimation seems sparse. FrequencyWeights have a straightforward implementation, but support for other AbstractWeight subtypes awaits analytical justification.

cov(ans::AnalyticalNonlinearShrinkage, X, [weights]; dims=1, mean=nothing)

Nonlinear covariance estimator derived from the sample covariance estimator S and its eigenvalue decomposition (which can be given through decomp). See Ledoit and Wolf's paper http://www.econ.uzh.ch/static/wp/econwp264.pdf The keyword mean can be nothing (centering via estimated mean), zero (no centering) or a provided vector. In the first case, a rank-1 modification is applied and therefore the effective sample size is decreased by one (see analytical_nonlinear_shrinkage). In the latter two case the mean cannot have been estimated on the data (otherwise the effective sample size will be 1 larger than it should be resulting in numerical instabilities). If you are unsure, use either nothing or provide an explicit (non-estimated) vector (possibly a zero vector) and avoid the use of mean=0.

• Time complexity (including formation of S)
• (p<n): O(np^2 + n^2) with moderate constant
• (p>n): O(p^3) with low constant (dominated by eigendecomposition of S)
cov(estimator::WoodburyEstimator, X::AbstractMatrix, weights::FrequencyWeights...; dims::Int=1, mean=nothing, UsV=nothing)

Estimate the covariance matrix from the data matrix X using a "spiked" covariance model

Σ = σ²I + U * Λ * U',

where U is a low-rank matrix of eigenvectors and Λ is a diagonal matrix.

Reference: Donoho, D.L., Gavish, M. and Johnstone, I.M., 2018. Optimal shrinkage of eigenvalues in the spiked covariance model. Annals of statistics, 46(4), p.1742.

When σ² is not supplied in estimator, it is calculated from the residuals X - X̂, where is the low-rank approximation of X used to generate U and Λ.

If X is too large to manipulate in memory, you can pass UsV = (U, s, V) (a truncated SVD of X - mean(X; dims)) and then X will only be used compute the dimensionality and number of observations. This requires that you specify estimator.σ².

CovarianceEstimation.LinearShrinkageType
LinearShrinkage(target, shrinkage; corrected=false, drop_var0=false)

Linear shrinkage estimator described by equation $(1 - \lambda) S + \lambda F$ where $S$ is standard covariance matrix, $F$ is shrinkage target described by argument target and $\lambda$ is a shrinkage parameter, either given explicitly in shrinkage or automatically determined according to one of the supported methods.

The corrected estimator is used if corrected is true. drop_var0=true drops the zero-variance variables from the computation of \lambda.

CovarianceEstimation.DiagonalCommonVarianceType
DiagonalCommonVariance

Target for linear shrinkage: unit matrix multiplied by average variance of variables. A subtype of LinearShrinkageTarget where

• $F_{ij}=v$ if $i=j$ with $v=\mathrm{tr}(S)/p$ and
• $F_{ij}=0$ otherwise
CovarianceEstimation.DiagonalUnequalVarianceType
DiagonalUnequalVariance

Target for linear shrinkage: diagonal of covariance matrix. A subtype of LinearShrinkageTarget where

• $F_{ij}=s_{ij}$ if $i=j$ and
• $F_{ij}=0$ otherwise
CovarianceEstimation.CommonCovarianceType
CommonCovariance

Target for linear shrinkage: see target_C. A subtype of LinearShrinkageTarget where

• $F_{ij}=v$ if $i=j$ with $v=\mathrm{tr}(S)/p$ and
• $F_{ij}=c$ with $c=\sum_{i\neq j} S_{ij}/(p(p-1))$ otherwise
CovarianceEstimation.ConstantCorrelationType
ConstantCorrelation

Target for linear shrinkage: see target_F. A subtype of LinearShrinkageTarget where

• $F_{ij}=S_{ij}$ if $i=j$ and
• $F_{ij}=\overline{r}\sqrt{S_{ii}S_{jj}}$ otherwise where $\overline{r}$ is the average sample correlation
CovarianceEstimation.BiweightMidcovarianceType
BiweightMidcovariance(; c=9.0, modify_sample_size=false)

The biweight midcovariance is a covariance estimator that is resilient to outliers.

The technique derives originally from astrophysics [1], and is implemented in the Python module Astropy [2], as well as in NIST's Dataplot [3].

Consider two random variables $x$ and $y$, for which we have $n$ observations $\{(x_i, y_i)\}$. The biweight midcovariance is then defined to be:

$$$n_s\cdot\frac{ \sum_{|u_i|<1,|v_i|<1}(x_i - M_x)(1 - u_i^2)^2(y_i - M_y)(1 - v_i^2)^2 }{ \left(\sum_{|u_i|<1}(1 - u_i^2)(1-5u_i^2)\right) \left(\sum_{|v_i|<1}(1 - v_i^2)(1-5v_i^2)\right) }$$$

where $n_s$ is the sample size, $M_x$ and $M_y$ are the medians of $\{x_i\}$ and $\{y_i\}$ respectively, and

\begin{aligned} u_i &= \frac{x_i - M_x}{c \cdot \mathrm{MAD}_x} \\ v_i &= \frac{y_i - M_y}{c \cdot \mathrm{MAD}_y}, \end{aligned}

where $\mathrm{MAD}$ represents the median absolute deviation,

\begin{aligned} \mathrm{MAD}_x &= \mathrm{median}(\left\{\left|x_i - M_x\right|\right\}) \\ \mathrm{MAD}_y &= \mathrm{median}(\left\{\left|y_i - M_y\right|\right\}). \end{aligned}

If either $\mathrm{MAD}_x = 0$ or $\mathrm{MAD}_y = 0$, the pairwise covariance is defined to be zero.

The parameter $c$ is a tuning constant, for which the default is $9.0$. Larger values will reduce the number of outliers that are removed — i.e. reducing robustness, but increasing sample efficiency.

Fields

• c::Float64: The tuning constant corresponding to $c$ above.
• modify_sample_size::Bool: If false, then we use a sample size $n_s$ equal to the total number of observations $n$. This is consistent with the standard definition of biweight midcovariance in the literature. Otherwise, we count only those elements which are not rejected as outliers in the numerator, i.e. those for which $|u_i|<1$ and $|v_i|<1$. This follows the implementation in astropy [2].

Complexity

• Space: $O(p^2)$
• Time: $O(np^2)$

References

[1] Beers, Flynn, and Gebhardt (1990; AJ 100, 32) "Measures of Location and Scale for Velocities in Clusters of Galaxies – A Robust Approach"

CovarianceEstimation.NormLossCovType
NormLossCov(norm::Symbol, pivotidx::Int)

Specify a loss function for which the estimated covariance will be optimal. norm is one of :L1, :L2, or :Linf, and pivotidx is an integer from 1 to 7, as specified in Table 1 (p. 1755) of Donoho et al. (2018). In the table below, A and B are the target and sample covariances, respectively, and the loss function is the specified norm on the quantity in the pivot column:

pivotidxpivotNotes
1A - B
2A⁻¹ - B⁻¹
3A⁻¹ B - INot available for :L1
4B⁻¹ A - INot available for :L1
5A⁻¹ B + B⁻¹ A - 2INot supported
6sqrt(A) \ B / sqrt(A) - I
7log(sqrt(A) \ B / sqrt(A))Not supported

See also StatLossCov.

Reference: Donoho, D.L., Gavish, M. and Johnstone, I.M., 2018. Optimal shrinkage of eigenvalues in the spiked covariance model. Annals of statistics, 46(4), p.1742.

CovarianceEstimation.StatLossCovType
StatLossCov(mode::Symbol)

Specify a loss function for which the estimated covariance will be optimal. mode is one of :st, :ent, :div, :aff, or :fre, as specified in Table 2 (p. 1757) of Donoho et al. (2018). In the table below, A and B are the target and sample covariances, respectively:

modelossInterpretation
:stst(A, B) = tr(A⁻¹ B - I) - log(det(B)/det(A))Minimize `2 Dₖₗ(N(0, B)
:entst(B, A)Minimize errors in Mahalanobis distances
:divst(A, B) + st(B, A)
:aff0.5 * log(det(A + B) / (2 * sqrt(det(A*B))))Minimize Hellinger distance between N(0, A) and N(0, B)
:fretr(A + B - 2sqrt(A*B))
CovarianceEstimation.WoodburyEstimatorType
WoodburyEstimator(loss::LossFunction, rank::Integer;
σ²::Union{Real,Nothing}=nothing, corrected::Bool=false)

Specify that covariance matrices should be estimated using a "spiked" covariance model

Σ = σ²I + U * Λ * U'

loss is either a NormLossCov or StatLossCov object, which specifies the loss function for which the estimated covariance will be optimal. rank is the maximum number of eigenvalues Λ to retain in the model. Optionally, one may specify σ² directly, or it can be estimated from the data matrix (σ²=nothing). Set corrected=true to use the unbiased estimator of the variance.