# Public Documentation

Documentation for `CovarianceEstimation.jl`

's public interface.

See Internal Documentation for internal package docs.

## Contents

## Index

`CovarianceEstimation.AnalyticalNonlinearShrinkage`

`CovarianceEstimation.BiweightMidcovariance`

`CovarianceEstimation.CommonCovariance`

`CovarianceEstimation.ConstantCorrelation`

`CovarianceEstimation.DiagonalCommonVariance`

`CovarianceEstimation.DiagonalUnequalVariance`

`CovarianceEstimation.DiagonalUnitVariance`

`CovarianceEstimation.LinearShrinkage`

`CovarianceEstimation.NormLossCov`

`CovarianceEstimation.PerfectPositiveCorrelation`

`CovarianceEstimation.StatLossCov`

`CovarianceEstimation.WoodburyEstimator`

`Statistics.cov`

## Public Interface

`Statistics.cov`

— Function`cov(lse::LinearShrinkage, X, [weights::FrequencyWeights]; dims=1, mean=nothing)`

Linear shrinkage covariance estimator for matrix `X`

along dimension `dims`

. Computed using the method described by `lse`

.

Optionally provide `weights`

associated with each observation in `X`

(see `StatsBase.FrequencyWeights`

).

Theoretical guidance for the use of weights in shrinkage estimation seems sparse. `FrequencyWeights`

have a straightforward implementation, but support for other `AbstractWeight`

subtypes awaits analytical justification.

`cov(ans::AnalyticalNonlinearShrinkage, X, [weights]; dims=1, mean=nothing)`

Nonlinear covariance estimator derived from the sample covariance estimator `S`

and its eigenvalue decomposition (which can be given through `decomp`

). See Ledoit and Wolf's paper http://www.econ.uzh.ch/static/wp/econwp264.pdf The keyword `mean`

can be `nothing`

(centering via estimated mean), zero (no centering) or a provided vector. In the first case, a rank-1 modification is applied and therefore the effective sample size is decreased by one (see `analytical_nonlinear_shrinkage`

). In the latter two case the mean cannot have been estimated on the data (otherwise the effective sample size will be 1 larger than it should be resulting in numerical instabilities). If you are unsure, use either `nothing`

or provide an explicit (non-estimated) vector (possibly a zero vector) and avoid the use of `mean=0`

.

- Time complexity (including formation of
`S`

)- (p<n): O(np^2 + n^2) with moderate constant
- (p>n): O(p^3) with low constant (dominated by eigendecomposition of S)

`cov(estimator::WoodburyEstimator, X::AbstractMatrix, weights::FrequencyWeights...; dims::Int=1, mean=nothing, UsV=nothing)`

Estimate the covariance matrix from the data matrix `X`

using a "spiked" covariance model

`Σ = σ²I + U * Λ * U',`

where `U`

is a low-rank matrix of eigenvectors and `Λ`

is a diagonal matrix.

Reference: Donoho, D.L., Gavish, M. and Johnstone, I.M., 2018. Optimal shrinkage of eigenvalues in the spiked covariance model. Annals of statistics, 46(4), p.1742.

When `σ²`

is not supplied in `estimator`

, it is calculated from the residuals `X - X̂`

, where `X̂`

is the low-rank approximation of `X`

used to generate `U`

and `Λ`

.

If `X`

is too large to manipulate in memory, you can pass `UsV = (U, s, V)`

(a truncated SVD of `X - mean(X; dims)`

) and then `X`

will only be used compute the dimensionality and number of observations. This requires that you specify `estimator.σ²`

.

`CovarianceEstimation.LinearShrinkage`

— Type`LinearShrinkage(target, shrinkage; corrected=false, drop_var0=false)`

Linear shrinkage estimator described by equation $(1 - \lambda) S + \lambda F$ where $S$ is standard covariance matrix, $F$ is shrinkage target described by argument `target`

and $\lambda$ is a shrinkage parameter, either given explicitly in `shrinkage`

or automatically determined according to one of the supported methods.

The corrected estimator is used if `corrected`

is true. `drop_var0=true`

drops the zero-variance variables from the computation of `\lambda`

.

`CovarianceEstimation.DiagonalUnitVariance`

— Type`DiagonalUnitVariance`

Target for linear shrinkage: unit matrix. A subtype of `LinearShrinkageTarget`

where

- $F_{ij}=1$ if $i=j$ and
- $F_{ij}=0$ otherwise

`CovarianceEstimation.DiagonalCommonVariance`

— Type`DiagonalCommonVariance`

Target for linear shrinkage: unit matrix multiplied by average variance of variables. A subtype of `LinearShrinkageTarget`

where

- $F_{ij}=v$ if $i=j$ with $v=\mathrm{tr}(S)/p$ and
- $F_{ij}=0$ otherwise

`CovarianceEstimation.DiagonalUnequalVariance`

— Type`DiagonalUnequalVariance`

Target for linear shrinkage: diagonal of covariance matrix. A subtype of `LinearShrinkageTarget`

where

- $F_{ij}=s_{ij}$ if $i=j$ and
- $F_{ij}=0$ otherwise

`CovarianceEstimation.CommonCovariance`

— Type`CommonCovariance`

Target for linear shrinkage: see `target_C`

. A subtype of `LinearShrinkageTarget`

where

- $F_{ij}=v$ if $i=j$ with $v=\mathrm{tr}(S)/p$ and
- $F_{ij}=c$ with $c=\sum_{i\neq j} S_{ij}/(p(p-1))$ otherwise

`CovarianceEstimation.PerfectPositiveCorrelation`

— Type`PerfectPositiveCorrelation`

Target for linear shrinkage: see `target_E`

. A subtype of `LinearShrinkageTarget`

where

- $F_{ij}=S_{ij}$ if $i=j$ and
- $F_{ij}=\sqrt{S_{ii}S_{jj}}$ otherwise

`CovarianceEstimation.ConstantCorrelation`

— Type`ConstantCorrelation`

Target for linear shrinkage: see `target_F`

. A subtype of `LinearShrinkageTarget`

where

- $F_{ij}=S_{ij}$ if $i=j$ and
- $F_{ij}=\overline{r}\sqrt{S_{ii}S_{jj}}$ otherwise where $\overline{r}$ is the average sample correlation

`CovarianceEstimation.AnalyticalNonlinearShrinkage`

— Type`AnalyticalNonlinearShrinkage`

Analytical nonlinear shrinkage estimator. See docs for `analytical_nonlinear_shrinkage`

for details.

`CovarianceEstimation.BiweightMidcovariance`

— Type`BiweightMidcovariance(; c=9.0, modify_sample_size=false)`

The biweight midcovariance is a covariance estimator that is resilient to outliers.

The technique derives originally from astrophysics [1], and is implemented in the Python module Astropy [2], as well as in NIST's Dataplot [3].

Consider two random variables $x$ and $y$, for which we have $n$ observations $\{(x_i, y_i)\}$. The biweight midcovariance is then defined to be:

\[n_s\cdot\frac{ \sum_{|u_i|<1,|v_i|<1}(x_i - M_x)(1 - u_i^2)^2(y_i - M_y)(1 - v_i^2)^2 }{ \left(\sum_{|u_i|<1}(1 - u_i^2)(1-5u_i^2)\right) \left(\sum_{|v_i|<1}(1 - v_i^2)(1-5v_i^2)\right) }\]

where $n_s$ is the sample size, $M_x$ and $M_y$ are the medians of $\{x_i\}$ and $\{y_i\}$ respectively, and

\[\begin{aligned} u_i &= \frac{x_i - M_x}{c \cdot \mathrm{MAD}_x} \\ v_i &= \frac{y_i - M_y}{c \cdot \mathrm{MAD}_y}, \end{aligned}\]

where $\mathrm{MAD}$ represents the median absolute deviation,

\[\begin{aligned} \mathrm{MAD}_x &= \mathrm{median}(\left\{\left|x_i - M_x\right|\right\}) \\ \mathrm{MAD}_y &= \mathrm{median}(\left\{\left|y_i - M_y\right|\right\}). \end{aligned}\]

If either $\mathrm{MAD}_x = 0$ or $\mathrm{MAD}_y = 0$, the pairwise covariance is defined to be zero.

The parameter $c$ is a tuning constant, for which the default is $9.0$. Larger values will reduce the number of outliers that are removed — i.e. reducing robustness, but increasing sample efficiency.

**Fields**

`c::Float64`

: The tuning constant corresponding to $c$ above.`modify_sample_size::Bool`

: If`false`

, then we use a sample size $n_s$ equal to the total number of observations $n$. This is consistent with the standard definition of biweight midcovariance in the literature. Otherwise, we count only those elements which are not rejected as outliers in the numerator, i.e. those for which $|u_i|<1$ and $|v_i|<1$. This follows the implementation in astropy [2].

**Complexity**

- Space: $O(p^2)$
- Time: $O(np^2)$

**References**

[1] Beers, Flynn, and Gebhardt (1990; AJ 100, 32) "Measures of Location and Scale for Velocities in Clusters of Galaxies – A Robust Approach"

`CovarianceEstimation.NormLossCov`

— Type`NormLossCov(norm::Symbol, pivotidx::Int)`

Specify a loss function for which the estimated covariance will be optimal. `norm`

is one of `:L1`

, `:L2`

, or `:Linf`

, and `pivotidx`

is an integer from 1 to 7, as specified in Table 1 (p. 1755) of Donoho et al. (2018). In the table below, `A`

and `B`

are the target and sample covariances, respectively, and the loss function is the specified norm on the quantity in the `pivot`

column:

`pivotidx` | `pivot` | Notes |
---|---|---|

1 | `A - B` | |

2 | `A⁻¹ - B⁻¹` | |

3 | `A⁻¹ B - I` | Not available for `:L1` |

4 | `B⁻¹ A - I` | Not available for `:L1` |

5 | `A⁻¹ B + B⁻¹ A - 2I` | Not supported |

6 | `sqrt(A) \ B / sqrt(A) - I` | |

7 | `log(sqrt(A) \ B / sqrt(A))` | Not supported |

See also `StatLossCov`

.

Reference: Donoho, D.L., Gavish, M. and Johnstone, I.M., 2018. Optimal shrinkage of eigenvalues in the spiked covariance model. Annals of statistics, 46(4), p.1742.

`CovarianceEstimation.StatLossCov`

— Type`StatLossCov(mode::Symbol)`

Specify a loss function for which the estimated covariance will be optimal. `mode`

is one of `:st`

, `:ent`

, `:div`

, `:aff`

, or `:fre`

, as specified in Table 2 (p. 1757) of Donoho et al. (2018). In the table below, `A`

and `B`

are the target and sample covariances, respectively:

`mode` | loss | Interpretation |
---|---|---|

`:st` | `st(A, B) = tr(A⁻¹ B - I) - log(det(B)/det(A))` | Minimize `2 Dₖₗ(N(0, B) |

`:ent` | `st(B, A)` | Minimize errors in Mahalanobis distances |

`:div` | `st(A, B) + st(B, A)` | |

`:aff` | `0.5 * log(det(A + B) / (2 * sqrt(det(A*B))))` | Minimize Hellinger distance between `N(0, A)` and `N(0, B)` |

`:fre` | `tr(A + B - 2sqrt(A*B))` |

`CovarianceEstimation.WoodburyEstimator`

— Type```
WoodburyEstimator(loss::LossFunction, rank::Integer;
σ²::Union{Real,Nothing}=nothing, corrected::Bool=false)
```

Specify that covariance matrices should be estimated using a "spiked" covariance model

`Σ = σ²I + U * Λ * U'`

`loss`

is either a `NormLossCov`

or `StatLossCov`

object, which specifies the loss function for which the estimated covariance will be optimal. `rank`

is the maximum number of eigenvalues `Λ`

to retain in the model. Optionally, one may specify `σ²`

directly, or it can be estimated from the data matrix (`σ²=nothing`

). Set `corrected=true`

to use the unbiased estimator of the variance.