riemannianGeometry.jl

This is the fundamental unit of PosDefManifold. It contains functions for manipulating points in the Riemannian manifold of Symmetric Positive Definite (SPD) or Hermitian Positive Definite (HPD) matrices. In Julia those are Hermitian matrices, see typecasting matrices.

The functions are divided into six categories:

Category | Output
1. Geodesic equations | interpolation, extrapolation, weighted mean of two matrices, ...
2. Distances | length of geodesics
3. Graphs and Laplacians | inter-distance matrices, spectral embedding, eigenmaps, ...
4. Means | mid-points of geodesics, Fréchet means of several points, midrange, ...
5. Tangent Space operations | maps from the manifold to the tangent space and vice versa, parallel transport, ...
6. Procrustes problems | data matching, transfer learning (domain adaptation), ...

Geodesic equations

Function | Description
geodesic | Geodesic equations (weighted mean of two positive definite matrices) for any metric

PosDefManifold.geodesic - Function
(1) geodesic(metric::Metric, P::ℍ{T}, Q::ℍ{T}, a::Real) where T<:RealOrComplex
(2) geodesic(metric::Metric, D::𝔻{S}, E::𝔻{S}, a::Real) where S<:Real

(1) Move along the geodesic from point $P$ to point $Q$ (two positive definite matrices) with arclength $0<=a<=1$, using the specified metric, of type Metric::Enumerated type.

For all metrics,

  • with $a=0$ we stay at $P$,
  • with $a=1$ we move up to $Q$,
  • with $a=1/2$ we move to the mid-point of $P$ and $Q$ (mean).

Using the Fisher metric, argument $a$ can be any real number, for instance:

  • with $0<a<1$ we move toward $Q$ (attraction),
  • with $a>1$ we move over and beyond $Q$ (extrapolation),
  • with $a<0$ we move back away from $Q$ (repulsion).

$P$ and $Q$ must be flagged by Julia as Hermitian. See typecasting matrices.

The Fisher geodesic move is computed by the Cholesky-Schur algorithm given in Eq. 4.2 by Iannazzo (2016)🎓. If $Q=I$, the Fisher geodesic move is simply $P^a$ (no need to call this function).

Nota Bene

For the logdet zero and Jeffrey metrics no closed-form expression for the geodesic is available to the best of the authors' knowledge, so in these cases the geodesic is found as the weighted mean using the mean function. For the Von Neumann metric not even an expression for the mean is available, so in this case the geodesic is not provided and a warning is printed.

(2) Like in (1), but for two real positive definite diagonal matrices $D$ and $E$.

Maths

For points $P$, $Q$ and arclength $a$, letting $b=1-a$, the geodesic equations for the supported metrics are:

Metric | geodesic equation
Euclidean | $bP + aQ$
invEuclidean | $\big(bP^{-1} + aQ^{-1}\big)^{-1}$
ChoEuclidean | $TT^*$, where $T=bL_P + aL_Q$
logEuclidean | $\textrm{exp}\big(b\hspace{2pt}\textrm{log}(P) + a\hspace{2pt}\textrm{log}(Q)\big)$
logCholesky | $TT^*$, where $T=S_P+a(S_Q-S_P)+D_P\hspace{2pt}\textrm{exp}\big(a(\textrm{log}D_Q-\textrm{log}D_P)\big)$
Fisher | $P^{1/2} \big(P^{-1/2} Q P^{-1/2}\big)^a P^{1/2}$
logdet0 | uses the weighted mean algorithm logdet0Mean
Jeffrey | uses the weighted mean function mean
VonNeumann | N.A.
Wasserstein | $b^2P+a^2Q +ab\big[(PQ)^{1/2} +(QP)^{1/2}\big]$

legend: $L_X$, $S_X$ and $D_X$ are the Cholesky lower triangle of $X$, its strictly lower triangular part and diagonal part, respectively (hence, $S_X+D_X=L_X$, $L_XL_X^*=X$).

See also: mean.

Examples

using PosDefManifold
P=randP(10)
Q=randP(10)
# Wasserstein mean
M=geodesic(Wasserstein, P, Q, 0.5)
# extrapolate using the Fisher metric
E=geodesic(Fisher, P, Q, 2)
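Continuing the example above, a quick sanity check of the properties listed earlier (a minimal sketch, assuming the same P and Q):

# with a=0 we stay at P; with a=0.5 we reach the mean of P and Q
geodesic(Fisher, P, Q, 0) ≈ P
geodesic(Fisher, P, Q, 0.5) ≈ mean(Fisher, P, Q)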

Distances

Function | Description
distanceSqr, distance² | Squared distance between positive definite matrices
distance | Distance between positive definite matrices

PosDefManifold.distanceSqr - Function
(1) distanceSqr(metric::Metric, P::ℍ{T}) where T<:RealOrComplex
(2) distanceSqr(metric::Metric, P::ℍ{T}, Q::ℍ{T}) where T<:RealOrComplex
(3) distanceSqr(metric::Metric, D::𝔻{S}) where S<:Real
(4) distanceSqr(metric::Metric, D::𝔻{S}, E::𝔻{S}) where S<:Real

alias: distance²

(1) Return $δ^2(P, I)$, the square of the distance (or divergence) of positive definite matrix $P$ from the identity matrix. See distance from the origin.

(2) Return $δ^2(P, Q)$, the square of the distance (or divergence) between two positive definite matrices $P$ and $Q$. See distance.

In both cases the distance function $δ$ is induced by the argument metric of type Metric::Enumerated type.

$P$ in (1) and $P$, $Q$ in (2) must be flagged by Julia as Hermitian. See typecasting matrices.

(3) and (4) are specialized methods of (1) and (2), respectively, for real positive definite Diagonal matrices. See ℍVector type and 𝔻Vector type.

Maths

For point $P$ the squared distances from the identity for the supported metrics are:

Metric | Squared distance from the identity
Euclidean | $∥P-I∥^2$
invEuclidean | $∥P^{-1}-I∥^2$
ChoEuclidean | $∥L_P-I∥^2$
logEuclidean | $∥\textrm{log}P∥^2$
logCholesky | $∥S_P∥^2+∥\textrm{log}D_P∥^2$
Fisher | $∥\textrm{log}P∥^2$
logdet0 | $\textrm{logdet}\frac{1}{2}(P+I) - \frac{1}{2}\textrm{logdet}(P)$
Jeffrey | $\frac{1}{2}\textrm{tr}(P+P^{-1})-n$
VonNeumann | $\frac{1}{2}\textrm{tr}(P\textrm{log}P-\textrm{log}P)$
Wasserstein | $\textrm{tr}(P+I) -2\textrm{tr}(P^{1/2})$

For points $P$ and $Q$ their squared distances for the supported metrics are:

Metric | Squared distance
Euclidean | $∥P-Q∥^2$
invEuclidean | $∥P^{-1}-Q^{-1}∥^2$
ChoEuclidean | $∥ L_P - L_Q ∥^2$
logEuclidean | $∥\textrm{log}P-\textrm{log}Q∥^2$
logCholesky | $∥S_P-S_Q∥^2+∥\textrm{log}D_P-\textrm{log}D_Q∥^2$
Fisher | $∥\textrm{log}(P^{-1/2}QP^{-1/2})∥^2$
logdet0 | $\textrm{logdet}\frac{1}{2}(P+Q) - \frac{1}{2}\textrm{logdet}(PQ)$
Jeffrey | $\frac{1}{2}\textrm{tr}(Q^{-1}P+P^{-1}Q)-n$
VonNeumann | $\frac{1}{2}\textrm{tr}(P\textrm{log}P-P\textrm{log}Q+Q\textrm{log}Q-Q\textrm{log}P)$
Wasserstein | $\textrm{tr}(P+Q) -2\textrm{tr}\big[(P^{1/2}QP^{1/2})^{1/2}\big]$

legend: $L_X$, $S_X$ and $D_X$ are the Cholesky lower triangle of $X$, its strictly lower triangular part and diagonal part, respectively (hence, $S_X+D_X=L_X$, $L_XL_X^*=X$).

See also: distanceSqrMat.

Examples (1)

using PosDefManifold
P=randP(10)
d=distanceSqr(Wasserstein, P)
e=distanceSqr(Fisher, P)
metric=Metric(Int(logdet0)) # or metric=logdet0
s=string(metric) # check what is the current metric
f=distance²(metric, P) #using the alias distance²

Examples (2)

using PosDefManifold
P=randP(10)
Q=randP(10)
d=distanceSqr(logEuclidean, P, Q)
e=distance²(Jeffrey, P, Q)

PosDefManifold.distance - Function
(1) distance(metric::Metric, P::ℍ{T}) where T<:RealOrComplex
(2) distance(metric::Metric, P::ℍ{T}, Q::ℍ{T}) where T<:RealOrComplex
(3) distance(metric::Metric, D::𝔻{S}) where S<:Real
(4) distance(metric::Metric, D::𝔻{S}, E::𝔻{S}) where S<:Real

(1) Return $δ(P, I)$, the distance between positive definite matrix $P$ and the identity matrix.

(2) Return $δ(P, Q)$, the distance between positive definite matrices $P$ and $Q$.

(3) and (4) are specialized methods of (1) and (2), respectively, for real positive definite Diagonal matrices.

This is the square root of distanceSqr and is invoked with the same syntax.

See also: distanceMat.
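For example, a minimal check (assuming random SPD matrices) that the distance is the square root of the corresponding squared distance:

using PosDefManifold
P=randP(10)
Q=randP(10)
distance(Fisher, P, Q) ≈ sqrt(distanceSqr(Fisher, P, Q))
distance(Fisher, P) ≈ sqrt(distanceSqr(Fisher, P))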

Graphs and Laplacians

Function | Description
distanceSqrMat, distance²Mat | Lower triangular matrix of all squared inter-distances
distanceMat | Lower triangular matrix of all inter-distances
laplacian | Laplacian of a squared inter-distances matrix
laplacianEigenMaps, laplacianEM | Eigen maps (eigenvectors) of a Laplacian
spectralEmbedding, spEmb | Spectral embedding (the above functions run in series)

PosDefManifold.distanceSqrMat - Function
    (1) distanceSqrMat(metric::Metric, 𝐏::ℍVector;
    <⏩=true>)

    (2) distanceSqrMat(type::Type{T}, metric::Metric, 𝐏::ℍVector;
    <⏩=true>) where T<:AbstractFloat

alias: distance²Mat

Given a 1d array $𝐏$ of $k$ positive definite matrices ${P_1,...,P_k}$ of ℍVector type, create the $k⋅k$ real LowerTriangular matrix comprising elements $δ^2(P_i, P_j)\textrm{, for all }i>=j$.

This is the lower triangular matrix holding all squared inter-distances (zero on diagonal), using the specified metric, of type Metric::Enumerated type, giving rise to distance function $δ$. See distanceSqr.

Only the lower triangular part is computed in order to optimize memory use.

By default, the result matrix is of type Float32. The type can be changed to another real type using method (2).

<optional keyword arguments>:

  • if ⏩=true (default) the computation of inter-distances is multi-threaded.
Nota Bene

Multi-threading is automatically disabled if Julia is instructed to use only one thread. See Threads.

See: distance.

See also: laplacian, laplacianEigenMaps, spectralEmbedding.

Examples

using PosDefManifold
# Generate a set of 8 random 10x10 SPD matrices
Pset=randP(10, 8) # or, using unicode: 𝐏=randP(10, 8)
# Compute the squared inter-distance matrix according to the log Euclidean metric.
# This is much faster as compared to the Fisher metric and in general
# it is a good approximation.
Δ²=distanceSqrMat(logEuclidean, Pset)

# return a matrix of type Float64
Δ²64=distanceSqrMat(Float64, logEuclidean, Pset)

# Get the full matrix of inter-distances
fullΔ²=Hermitian(Δ², :L)

PosDefManifold.distanceMat - Function
    (1) distanceMat(metric::Metric, 𝐏::ℍVector;
    <⏩=true>)

    (2) distanceMat(type::Type{T}, metric::Metric, 𝐏::ℍVector;
    <⏩=true>) where T<:AbstractFloat

Given a 1d array $𝐏$ of $k$ positive definite matrices ${P_1,...,P_k}$ of ℍVector type, create the $k⋅k$ real LowerTriangular matrix comprising elements $δ(P_i, P_j)\textrm{, for all }i>=j$.

This is the lower triangular matrix holding all inter-distances (zero on diagonal), using the specified metric, of type Metric::Enumerated type, giving rise to distance $δ$. See distance.

Only the lower triangular part is computed in order to optimize memory use.

By default, the result matrix is of type Float32. The type can be changed to another real type using method (2).

The elements of this matrix are the square root of distanceSqrMat.

<optional keyword arguments>:

  • if ⏩=true the computation of inter-distances is multi-threaded.
Nota Bene

Multi-threading is automatically disabled if Julia is instructed to use only one thread. See Threads.

See: distance.

Examples

using PosDefManifold
# Generate a set of 4 random 10x10 SPD matrices
Pset=randP(10, 4) # or, using unicode: 𝐏=randP(10, 4)
Δ=distanceMat(Fisher, Pset)

# return a matrix of type Float64
Δ64=distanceMat(Float64, Fisher, Pset)

# Get the full matrix of inter-distances
fullΔ=Hermitian(Δ, :L)

PosDefManifold.laplacian - Function
laplacian(Δ²::𝕃{S}, epsilon::Real=0;
          <densityInvariant=false>) where S<:Real

Given a LowerTriangular matrix of squared inter-distances $Δ^2$, return the lower triangular part of the so-called normalized Laplacian or density-invariant normalized Laplacian, which in both cases is a symmetric Laplacian. The elements of the Laplacian are of the same type as the elements of $Δ^2$. The result is a LowerTriangular matrix.

The definition of Laplacian given by Lafon (2004)🎓 is implemented:

First, a Gaussian radial basis function, known as the Gaussian kernel or heat kernel, is applied to all elements of $Δ^2$, such that

$W_{ij} = exp\bigg(\frac{\displaystyle{-Δ^2_{ij}}}{\displaystyle{2ε}}\bigg)$,

where $ε$ is the bandwidth of the kernel.

If <optional keyword argument> densityInvariant=true is used, then the density-invariant transformation is applied

$W ← E^{-1}WE^{-1}$

where $E$ is the diagonal matrix holding on the main diagonal the sum of the rows (or columns) of $W$.

Finally, the normalized Laplacian (density-invariant or not) is defined as

$Ω = D^{-1/2}WD^{-1/2}$,

where $D$ is the diagonal matrix holding on the main diagonal the sum of the rows (or columns) of $W$.

If you do not provide argument epsilon, the bandwidth $ε$ is set to the median of the elements of squared distance matrix $Δ^2_{ij}$. Another educated guess is the dimension of the original data, that is, the data that has been used to compute the squared distance matrix. For positive definite matrices this is $n(n-1)/2$, where $n$ is the dimension of the matrices. Still another is the dimension of the ensuing spectralEmbedding space. Keep in mind that by tuning the epsilon parameter (which must be positive) you can control both the rate of compression of the embedding space and the spread of points in the embedding space. See Coifman et al. (2008)🎓 for a discussion on $ε$.
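The following sketch applies the definitions above to a small full matrix of squared inter-distances using plain LinearAlgebra; the toy matrix and the bandwidth $ε$ are arbitrary and the code only illustrates the formulas, it is not the implementation of the laplacian function (which works on the lower triangular part):

using LinearAlgebra
Δ²full=[0.0 1.0 4.0; 1.0 0.0 2.0; 4.0 2.0 0.0] # toy squared inter-distances
ε=2.0                                          # kernel bandwidth (arbitrary)
W=exp.(-Δ²full./(2ε))                          # Gaussian (heat) kernel
D=Diagonal(vec(sum(W, dims=2)))                # row (or column) sums of W
Dm=Diagonal(1 ./ sqrt.(diag(D)))               # D^(-1/2)
Ω=Dm*W*Dm                                      # normalized Laplacian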

Nota Bene

The Laplacian as here defined can be requested for any input matrix of squared inter-distances, for example, those obtained on scalars or on vectors using appropriate metrics. In any case, only the lower triangular part of the Laplacian is taken as input. See typecasting matrices.

See also: distanceSqrMat, laplacianEigenMaps, spectralEmbedding.

Examples

using PosDefManifold
# Generate a set of 4 random 10x10 SPD matrices
Pset=randP(10, 4) # or, using unicode: 𝐏=randP(10, 4)
Δ²=distanceSqrMat(Fisher, Pset)
Ω=laplacian(Δ²)

# density-invariant Laplacian
Ω=laplacian(Δ²; densityInvariant=true)

# increase the bandwidth
r=size(Δ², 1)
myεFactor=0.1
med=Statistics.median([Δ²[i, j] for j=1:r-1 for i=j+1:r])
ε=2*myεFactor*med
Ω=laplacian(Δ², ε; densityInvariant=true)

PosDefManifold.laplacianEigenMaps - Function
    laplacianEigenMaps(Ω::𝕃{S}, q::Int;
    <
    tol::Real=0.,
    maxiter::Int=300,
    verbose=false >) where S<:Real

alias: laplacianEM

Given the lower triangular part of a Laplacian $Ω$ (see laplacian ) return the eigen maps in $q$ dimensions, i.e., the $q$ eigenvectors of the Laplacian associated with the largest $q$ eigenvalues, excluding the first (which is always equal to 1.0). The eigenvectors are of the same type as $Ω$. They are all divided element-wise by the first eigenvector (see Lafon, 2004🎓).

The eigenvectors of the Laplacian are computed by the power iterations+modified Gram-Schmidt method (see powerIterations), allowing the execution of this function for Laplacian matrices of very large size.

Return the 4-tuple $(Λ, U, iterations, convergence)$, where:

  • $Λ$ is a $q⋅q$ diagonal matrix holding on diagonal the eigenvalues corresponding to the $q$ dimensions of the Laplacian eigen maps,
  • $U$ holds in columns the $q$ eigenvectors holding the $q$ coordinates of the points in the embedding space,
  • $iterations$ is the number of iterations executed by the power method,
  • $convergence$ is the convergence attained by the power method.

Using the notion of Laplacian, spectral embedding seeks a low-dimensional representation of the data emphasizing local neighborhood information while neglecting long-distance information. The embedding is non-linear, however the embedding space is Euclidean. The columns of $U$ hold the coordinates of the points in the embedding space (typically two- or three-dimensional for plotting, or more for clustering). Spectral embedding is used for plotting data in low dimension, clustering, imaging, classification, following trajectories over time or other dimensions, and much more. For examples of applications see Rodrigues et al. (2018)🎓 and references therein.

Arguments:

  • $Ω$ is a real LowerTriangular normalized Laplacian obtained by the laplacian function,
  • $q$ is the dimension of the Laplacian eigen maps;
  • The following are <optional keyword arguments> for the power iterations:
    • tol is the tolerance for convergence (see below),
    • maxiter is the maximum number of iterations allowed,
    • if verbose is true, the convergence at all iterations will be printed.
Nota Bene

The maximum value of $q$ that can be requested is $n-1$, where $n$ is the size of the Laplacian. In general, $q=2$ or $q=3$ is requested.

$tol$ defaults to the square root of Base.eps of the (real) type of $Ω$. This corresponds to requiring equality for the convergence criterion over two successive power iterations of about half of the significant digits.

See also: distanceSqrMat, laplacian, spectralEmbedding.

Examples

using PosDefManifold
# Generate a set of 4 random 10x10 SPD matrices
Pset=randP(10, 4)
Δ²=distanceSqrMat(Fisher, Pset)
Ω=laplacian(Δ²)
evalues, maps, iterations, convergence=laplacianEM(Ω, 2)
evalues, maps, iterations, convergence=laplacianEM(Ω, 2; verbose=true)
evalues, maps, iterations, convergence=laplacianEM(Ω, 2; verbose=true, maxiter=500)

PosDefManifold.spectralEmbedding - Function
    (1) spectralEmbedding(metric::Metric, 𝐏::ℍVector, q::Int, epsilon::Real=0;
    <
    tol::Real=0.,
    maxiter::Int=300,
    densityInvariant=false,
    verbose=false,
    ⏩=true >)

    (2) spectralEmbedding(type::Type{T}, metric::Metric, 𝐏::ℍVector, q::Int, epsilon::Real=0;
    < same optional keyword arguments as in (1) >) where T<:Real

alias: spEmb

Given a 1d array $𝐏$ of $k$ positive definite matrices ${P_1,...,P_k}$ (real or complex), compute its eigen maps in $q$ dimensions.

This function runs, one after the other, the functions distanceSqrMat (compute the squared inter-distance matrix), laplacian (compute the normalized Laplacian) and laplacianEigenMaps (get the eigen maps).

By default all computations above are done with Float32 precision. Another real type can be requested using method (2), where the type argument is defined.

Return the 4-tuple (Λ, U, iterations, convergence), where:

  • $Λ$ is a $q⋅q$ diagonal matrix holding on diagonal the eigenvalues corresponding to the $q$ dimensions of the Laplacian eigen maps,
  • $U$ holds in columns the $q$ eigenvectors holding the $q$ coordinates of the points in the embedding space,
  • $iterations$ is the number of iterations executed by the power method,
  • $convergence$ is the convergence attained by the power method.

Arguments:

  • metric is the metric of type Metric::Enumerated type used for computing the inter-distances,
  • $𝐏$ is a 1d array of $k$ positive matrices of ℍVector type,
  • $q$ is the dimension of the Laplacian eigen maps,
  • $epsilon$ is the bandwidth of the Laplacian (see laplacian);
  • The following <optional keyword argument> applies for computing the inter-distances:
    • if ⏩=true (default) the computation of inter-distances is multi-threaded.
  • The following <optional keyword argument> applies to the computation of the Laplacian by the laplacian function:
    • if densityInvariant=true the density-invariant Laplacian is computed (see laplacian).
  • The following are <optional keyword arguments> for the power method iterative algorithm invoked by laplacianEigenMaps:
    • tol is the tolerance for convergence of the power method (see below),
    • maxiter is the maximum number of iterations allowed for the power method,
    • if verbose=true the convergence at all iterations will be printed;
Nota Bene

$tol$ defaults to the square root of Base.eps of the Float32 type (1) or of the type passed as argument (2). This corresponds to requiring equality for the convergence criterion over two successive power iterations of about half of the significant digits.

Multi-threading is automatically disabled if Julia is instructed to use only one thread. See Threads.

See also: distanceSqrMat, laplacian, laplacianEigenMaps.

Examples

using PosDefManifold
# Generate a set of k random 10x10 SPD matrices
k=10
Pset=randP(10, k)
evalues, maps, iter, conv=spectralEmbedding(Fisher, Pset, 2)

# show convergence information
evalues, maps, iter, conv=spectralEmbedding(Fisher, Pset, 2; verbose=true)

# use Float64 precision and request 3 dimensions
evalues, maps, iter, conv=spectralEmbedding(Float64, Fisher, Pset, 3)

using Plots, LinearAlgebra
# check eigenvalues and eigenvectors
plot(diag(evalues))
plot(maps[:, 1])
plot!(maps[:, 2])
plot!(maps[:, 3])

# plot the data in the embedded space
plot(maps[:, 1], maps[:, 2], seriestype=:scatter, title="Spectral Embedding", label="Pset")

# try a different value of epsilon
evalues, maps, iter, conv=spEmb(Fisher, Pset, k-1, 0.01; maxiter=1000)
plot(maps[:, 1], maps[:, 2], seriestype=:scatter, title="Spectral Embedding", label="Pset")
# see the example in the `laplacian` function for more on this
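
The three steps chained by spectralEmbedding can also be run one at a time; a minimal sketch with default arguments:

Pset=randP(10, 8)
Δ²=distanceSqrMat(Fisher, Pset)           # 1. squared inter-distance matrix
Ω=laplacian(Δ²)                           # 2. normalized Laplacian
Λ, U, iter, conv=laplacianEigenMaps(Ω, 2) # 3. eigen maps
# spectralEmbedding(Fisher, Pset, 2) performs these three steps in one call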

Means

Function | Description
mean | Weighted Fréchet mean (wFm) of a scalar or matrix set using any metric
means | As above for several sets at once
generalizedMean | Generalized wFm of a matrix set
geometricMean, gMean | wFm of a matrix set minimizing the dispersion according to the Fisher metric (iterative)
geometricpMean, gpMean | Robust wFm of a matrix set minimizing the p-dispersion according to the Fisher metric (iterative)
logdet0Mean, ld0Mean | wFm of a matrix set according to the logdet0 metric (iterative)
wasMean | wFm of a matrix set according to the Wasserstein metric (iterative)
powerMean | Power wFm of a matrix set (iterative)
inductiveMean, indMean | Recursive Fréchet mean of a matrix set (constructive)
midrange | Geometric midrange of two matrices

Statistics.mean - Function
    (1) mean(metric::Metric, P::ℍ{T}, Q::ℍ{T}) where T<:RealOrComplex

    (2) mean(metric::Metric, D::𝔻{T}, E::𝔻{T}) where T<:Real

    (3) mean(metric::Metric, 𝐏::ℍVector;
        <
        w::Vector=[],
        ✓w=true,
        init::Union{ℍ, Nothing}=nothing,
        tol::Real=0.,
        verbose=false,
        ⏩=true >)

    (4) mean(metric::Metric, 𝐃::𝔻Vector;
    < same optional keyword arguments as in (3) >)

(1) Mean of two positive definite matrices, passed in arbitrary order as arguments $P$ and $Q$, using the specified metric of type Metric::Enumerated type. The order is arbitrary as all metrics implemented in PosDefManifold are symmetric. This is the midpoint of the geodesic. For the weighted mean of two positive definite matrices use instead the geodesic function. $P$ and $Q$ must be flagged as Hermitian. See typecasting matrices.

(2) Like in (1), but for two real diagonal positive definite matrices $D$ and $E$.

(3) Fréchet mean of a 1d array $𝐏$ of $k$ positive definite matrices $𝐏={P_1,...,P_k}$ of ℍVector type, with optional non-negative real weights $w={w_1,...,w_k}$ and using the specified metric as in (1).

(4) Fréchet mean of a 1d array $𝐃$ of $k$ real positive definite diagonal matrices $𝐃={D_1,...,D_k}$ of 𝔻Vector type, with optional non-negative real weights $w={w_1,...,w_k}$ and using the specified metric as in (1).

If you don't pass a weight vector with <optional keyword argument>$w$, return the unweighted mean.

If <optional keyword argument>✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

Adopting the Fisher, logdet0 and Wasserstein metric in (3) and the logdet0 metric in (4), the mean is computed by means of an iterative algorithm. A particular initialization for these algorithms can be provided passing an Hermitian matrix as <optional keyword argument>init. The convergence for these algorithms is required with a tolerance given by <optional keyword argument>tol. If verbose=true the convergence attained at each iteration is printed. Other information, such as whether the algorithm has diverged, is also printed. For more options in computing these means call directly functions geometricMean, logdet0Mean and wasMean, which are called hereby. For the meaning of the tol default value see the documentation of these functions. See also the robust mean function geometricpMean, which cannot be called from here. Notice that arguments init and tol have an effect only for the aforementioned metrics in methods (3) and (4).

For (3) and (4), if ⏩=true (default), the computation of the mean is multi-threaded for all metrics.

Nota Bene

Multi-threading is automatically disabled if Julia is instructed to use only one thread. See Threads.

Math

For the supported metrics with a closed-form expression, the Fréchet mean of a set of $k$ matrices ${P_1, P_2,..., P_k}$ weighted by ${w_1, w_2,..., w_k}:\sum_{i=1}^{k}w_i=1$ is:

Metric | weighted Fréchet mean
Euclidean | $\sum_{i=1}^{k}w_i P_i$
invEuclidean | $\big(\sum_{i=1}^{k}w_i P_i^{-1}\big)^{-1}$
ChoEuclidean | $TT^*$, where $T=\sum_{i=1}^{k}(w_iL_{P_i})$
logEuclidean | $\textrm{exp}\big(\sum_{i=1}^{k}w_i\hspace{1pt} \textrm{log}P_i \big)$
logCholesky | $TT^*$, where $T=\sum_{i=1}^{k}(w_iS_{P_i})+\sum_{i=1}^{k}(w_i\textrm{log}D_{P_i})$
Jeffrey | $A^{1/2}\big(A^{-1/2}HA^{-1/2}\big)^{1/2}A^{1/2}$

and for those that are found by an iterative algorithm and that verify an equation:

Metric | equation verified by the weighted Fréchet mean
Fisher | $\sum_{i=1}^{k}w_i\textrm{log}\big(G^{-1/2} P_i G^{-1/2}\big)=0$
logdet0 | $\sum_{i=1}^{k}w_i\big(\frac{1}{2}P_i+\frac{1}{2}G\big)^{-1}=G^{-1}$
VonNeumann | N.A.
Wasserstein | $G=\sum_{i=1}^{k}w_i\big( G^{1/2} P_i G^{1/2}\big)^{1/2}$

legend: $L_X$, $S_X$ and $D_X$ are the Cholesky lower triangle of $X$, its strictly lower triangular part and diagonal part, respectively (hence, $S_X+D_X=L_X$, $L_XL_X^*=X$). $A$ and $H$ are the weighted arithmetic and weighted harmonic mean, respectively.

See: geodesic, mean, Fréchet mean.

Examples

using LinearAlgebra, Statistics, PosDefManifold
# Generate 2 random 3x3 SPD matrices
P=randP(3)
Q=randP(3)
M=mean(logdet0, P, Q) # (1)
M=mean(Euclidean, P, Q) # (1)

# passing several matrices and associated weights listing them
# weights vector, does not need to be normalized
R=randP(3)
mean(Fisher, ℍVector([P, Q, R]); w=[1, 2, 3])

# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4)
weights=[1, 2, 3, 1]
# passing a vector of Hermitian matrices (ℍVector type)
M=mean(Euclidean, Pset; w=weights) # (3) weighted Euclidean mean
M=mean(Wasserstein, Pset)  # (3) unweighted Wasserstein mean
# display convergence information when using an iterative algorithm
M=mean(Fisher, Pset; verbose=true)

# run multi-threaded when the number of matrices is high
using BenchmarkTools
Pset=randP(20, 160)
@benchmark(mean(logEuclidean, Pset; ⏩=false)) # single-threaded
@benchmark(mean(logEuclidean, Pset)) # multi-threaded

    mean(metric::Metric, ν::Vector{T}) where T<:RealOrComplex

Mean of $k$ real or complex scalars, using the specified metric of type Metric::Enumerated type. Note that using the Fisher, logEuclidean and Jeffrey metrics the resulting mean is the scalar geometric mean. Note also that the code of this method is in unit statistics.jl, while the code for all the others is in unit riemannianGeometry.jl.

Examples

using PosDefManifold
# Generate 10 random numbers distributed as a chi-square with 2 df.
ν=[randχ²(2) for i=1:10]
arithmetic_mean=mean(Euclidean, ν)
geometric_mean=mean(Fisher, ν)
harmonic_mean=mean(invEuclidean, ν)
harmonic_mean<=geometric_mean<=arithmetic_mean # AGH inequality

PosDefManifold.means - Function
    (1) means(metric::Metric, 𝒫::ℍVector₂;
    <⏩=true>)

    (2) means(metric::Metric, 𝒟::𝔻Vector₂;
    <⏩=true>)

(1) Given a 2d array $𝒫$ of positive definite matrices as an ℍVector₂ type, compute the Fréchet mean for as many ℍVector type objects as are held in $𝒫$, using the specified metric of type Metric::Enumerated type. Return the means in a vector of Hermitian matrices, that is, as an ℍVector type.

(2) Given a 2d array $𝒟$ of real positive definite matrices as an 𝔻Vector₂ type, compute the Fréchet mean for as many 𝔻Vector type objects as are held in $𝒟$, using the specified metric of type Metric::Enumerated type. Return the means in a vector of Diagonal matrices, that is, as a 𝔻Vector type.

The weighted Fréchet mean is not supported in this function.

If ⏩=true (default) the computation of the means is multi-threaded.

Nota Bene

Multi-threading is automatically disabled if Julia is instructed to use only one thread. See Threads.

See also: mean.

Examples

 using PosDefManifold
 # Generate a set of 4 random 3x3 SPD matrices
 Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
 # Generate a set of 40 random 3x3 SPD matrices
 Qset=randP(3, 40) # or, using unicode: 𝐐=randP(3, 40)
 # listing directly ℍVector objects
 means(logEuclidean, ℍVector₂([Pset, Qset])) # or: means(logEuclidean, ℍVector₂([𝐏, 𝐐]))
 # note that [𝐏, 𝐐] is actually a ℍVector₂ type object

 # creating and passing an object of ℍVector₂ type
 sets=ℍVector₂(undef, 2) # or: 𝒫=ℍVector₂(undef, 2)
 sets[1]=Pset # or: 𝒫[1]=𝐏
 sets[2]=Qset # or: 𝒫[2]=𝐐
 means(logEuclidean, sets) # or: means(logEuclidean, 𝒫)

 # going multi-threaded

 # first, create 20 sets of 200 50x50 SPD matrices
 sets=ℍVector₂([randP(50, 200) for i=1:20])

 # How much computing time do we save?
 # (example min time obtained with 4 threads & 4 BLAS threads)
 using BenchmarkTools

 # non multi-threaded, mean with closed-form solution
 @benchmark(means(logEuclidean, sets; ⏩=false)) # (6.196 s)

 # multi-threaded, mean with closed-form solution
 @benchmark(means(logEuclidean, sets)) # (1.897 s)

 sets=ℍVector₂([randP(10, 200) for i=1:10])

 # non multi-threaded, mean with iterative solution
 # wait a bit
 @benchmark(means(Fisher, sets; ⏩=false)) # (4.672 s )

 # multi-threaded, mean with iterative solution
 @benchmark(means(Fisher, sets)) # (1.510 s)

PosDefManifold.generalizedMean - Function
    generalizedMean(𝐏::Union{ℍVector, 𝔻Vector}, p::Real;
    <
    w::Vector=[],
    ✓w=true,
    ⏩=true >)

Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type or real positive definite diagonal matrices of 𝔻Vector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the weighted generalized mean $G$ with real parameter $p$, that is,

$G=\big(\sum_{i=1}^{k}w_iP_i^p\big)^{1/p}$.

If you don't pass a weight vector with <optional keyword argument>$w$, return the unweighted generalized mean

$G=\big(\sum_{i=1}^{k}P_i^p\big)^{1/p}$.

If <optional keyword argument>✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the weights each time.

If <optional keyword argument> ⏩=true (default) the computation of the generalized mean is multi-threaded.

Nota Bene

Multi-threading is automatically disabled if Julia is instructed to use only one thread. See Threads.

The following special cases for parameter $p$ are noteworthy:

  • for $p=1$ the generalized mean is the Euclidean (arithmetic) mean,
  • for $p=-1$ it is the inverse Euclidean (harmonic) mean,
  • for $p=0.5$ it is the modified Bhattacharyya mean,
  • as $p$ tends to zero it tends to the log Euclidean mean.

Notice that when the matrices in 𝐏 all pair-wise commute, for instance if the matrices are diagonal, the generalized means coincide with the power means for any $p∈[-1, 1]$ and for $p=0.5$ the generalized mean coincides also with the Wasserstein mean. For this reason the generalized means are used as default initialization of both the powerMean and wasMean algorithms.
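
A quick numerical check of the first two special cases, using explicit equal weights (a minimal sketch):

using PosDefManifold
Pset=randP(3, 4)
w=[1, 1, 1, 1] # equal weights (normalized internally since ✓w=true by default)
generalizedMean(Pset, 1; w=w) ≈ mean(Euclidean, Pset)     # arithmetic mean
generalizedMean(Pset, -1; w=w) ≈ mean(invEuclidean, Pset) # harmonic mean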

See: generalized means.

See also: powerMean, wasMean, mean.

Examples

using LinearAlgebra, Statistics, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)

# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]

# unweighted mean
G = generalizedMean(Pset, 0.25) # or: G = generalizedMean(𝐏, 0.25)

# weighted mean
G = generalizedMean(Pset, 0.5; w=weights)

# with weights previously normalized we can set ✓w=false
weights=weights./sum(weights)
G = generalizedMean(Pset, 0.5; w=weights, ✓w=false)

# run multi-threaded when the number of matrices is high
using BenchmarkTools
Pset=randP(20, 160)
@benchmark(generalizedMean(Pset, 0.5; ⏩=false)) # single-threaded
@benchmark(generalizedMean(Pset, 0.5)) # multi-threaded

PosDefManifold.geometricMean - Function
    geometricMean(𝐏::Union{ℍVector, 𝔻Vector};
    <
    w::Vector=[],
    ✓w=true,
    init=nothing,
    tol::Real=0.,
    maxiter::Int=500,
    adaptStepSize::Bool=true,
    verbose=false,
    ⏩=true >)

alias: gMean

Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type or diagonal matrices of 𝔻Vector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the 3-tuple $(G, iter, conv)$, where $G$ is the mean according to the Fisher metric and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm. Mean $G$ is the unique positive definite matrix satisfying

$\sum_{i=1}^{k}w_i\textrm{log}\big(G^{-1/2} P_i G^{-1/2}\big)=0.$

For estimating it, this function implements the well-known gradient descent algorithm, but with an exponentially decaying step size $ς$, yielding iterations

$G ←G^{1/2}\textrm{exp}\big(ς\sum_{i=1}^{k}w_i\textrm{log}(G^{-1/2} P_i G^{-1/2})\big)G^{1/2}.$
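
For illustration, one unweighted iteration of this update with fixed step size $ς=1$ could be written as follows; a minimal sketch of the formula, not the package's implementation:

using LinearAlgebra, PosDefManifold
Pset=randP(3, 4)
k=length(Pset)
G=mean(logEuclidean, Pset)       # default initialization
Ghalf=sqrt(G); Gihalf=inv(Ghalf) # G^(1/2) and G^(-1/2)
S=sum(log(Hermitian(Gihalf*P*Gihalf)) for P in Pset)/k
G=Hermitian(Ghalf*exp(S)*Ghalf)  # one step of the update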

If you don't pass a weight vector with <optional keyword argument>$w$, return the unweighted geometric mean.

If <optional keyword argument>✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

The following are more <optional keyword arguments>:

  • init is a matrix to be used as initialization for the mean. If no matrix is provided, the log Euclidean mean will be used,
  • tol is the tolerance for the convergence (see below).
  • maxiter is the maximum number of iterations allowed
  • if verbose=true, the convergence attained at each iteration and the step size $ς$ are printed. Also, a warning is printed if convergence is not attained.
  • if ⏩=true the iterations are multi-threaded (see below).
  • if adaptStepSize=false the step size ς is fixed to 1 at all iterations.

If the input is a 1d array of $k$ real positive definite diagonal matrices the solution is available in closed-form as the log Euclidean mean, hence the <optional keyword arguments> init, tol and verbose have no effect and return the 3-tuple $(G, 1, 0)$. See the log Euclidean metric.

Nota Bene

Multi-threading is automatically disabled if Julia is instructed to use only one thread. See Threads.

In normal circumstances this algorithm converges monotonically. If the algorithm diverges and verbose is true a warning is printed indicating the iteration when this happened.

The exponentially decaying step size features a faster convergence rate as compared to the fixed step size $ς=1$ that is usually adopted. The decaying rate is inversely proportional to maxiter, thus increase/decrease maxiter in order to set a slower/faster decaying rate. maxiter should not be set too low though.

$tol$ defaults to the square root of Base.eps of the nearest real type of data input $𝐏$. This corresponds to requiring the norm of the satisfying matrix equation divided by the number of elements to vanish for about half the significant digits.

See: Fisher metric.

See also: geometricpMean, powerMean, wasMean, logdet0Mean, mean.

Examples

using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)

# unweighted mean
G, iter, conv = geometricMean(Pset) # or G, iter, conv = geometricMean(𝐏)

# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]

# weighted mean
G, iter, conv = geometricMean(Pset, w=weights)

# print the convergence at all iterations
G, iter, conv = geometricMean(Pset; verbose=true)

# now suppose Pset has changed a bit, initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = geometricMean(Pset; w=weights, ✓w=true, verbose=true, init=G)

# run multi-threaded when the number of matrices is high
using BenchmarkTools
Pset=randP(20, 120)
@benchmark(geometricMean(Pset; ⏩=false)) # single-threaded
@benchmark(geometricMean(Pset)) # multi-threaded

# show the mean and the input points using spectral embedding
using Plots
k=80
Pset=randP(20, k)
G, iter, conv = geometricMean(Pset)
push!(Pset, G)
Λ, U, iter, conv=spectralEmbedding(Fisher, Pset, 2; verbose=true)
plot(U[1:k, 1], U[1:k, 2], seriestype=:scatter, title="Spectral Embedding", label="Pset")
plot!(U[k+1:k+1, 1], U[k+1:k+1, 2], seriestype=:scatter, label="mean")

PosDefManifold.geometricpMean - Function
    geometricpMean(𝐏::ℍVector, p::Real=0.5;
    <
    w::Vector=[],
    ✓w=true,
    init=nothing,
    tol::Real=0.,
    maxiter::Int=500,
    adaptStepSize=true,
    verbose=false,
    ⏩=true >)

alias: gpMean

Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type, a real parameter $0<p<=1$ and optional non-negative real weights vector $w={w_1,...,w_k}$, return the 3-tuple $(G, iter, conv)$, where $G$ is the p-mean, i.e., the mean according to the Fisher metric minimizing the p-dispersion (see below) and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm.

This function implements the p-dispersion gradient descent algorithm with step-size $ς$ (to be published), yielding iterations

$G ←G^{1/2}\textrm{exp}\big(ς\sum_{i=1}^{k}pδ^2(G, P_i)^{p-1}w_i\textrm{log}(G^{-1/2} P_i G^{-1/2})\big)G^{1/2}.$

  • if $p=1$ this yields the geometric mean (implemented specifically in geometricMean).
  • if $p=0.5$ this yields the geometric median (default).

If you don't pass a weight vector with <optional keyword argument>$w$, return the unweighted geometric-p mean.

If <optional keyword argument>✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

The following are more <optional keyword arguments>:

  • init is a matrix to be used as initialization for the mean. If no matrix is provided, the log Euclidean mean will be used,
  • tol is the tolerance for the convergence (see below).
  • maxiter is the maximum number of iterations allowed.
  • if adaptStepSize=true (default) the step size $ς$ for the gradient descent is adapted at each iteration (see below).
  • if verbose=true, the step-size and convergence attained at each iteration is printed. Also, a warning is printed if convergence is not attained.
  • if ⏩=true the iterations are multi-threaded (see below).
Nota Bene

Multi-threading is automatically disabled if Julia is instructed to use only one thread. See Threads.

If the algorithm diverges and verbose is true a warning is printed indicating the iteration when this happened. This algorithm may temporarily diverge and still reach convergence. Overall, while all other iterative algorithms implemented in PosDefManifold are very stable, this one is not.

The smaller the parameter $p$ is, the slower and less likely the convergence is. If the algorithm does not converge, try increasing $p$, initializing the algorithm with the output of geometricMean and/or eliminating the outliers from the input set $𝐏$.

If adaptStepSize is true (default) the step-size $ς$ is adapted at each iteration, otherwise a fixed step size $ς=1$ is used. Adapting the step size in general hastens convergence and improves the convergence behavior.

$tol$ defaults to the square root of Base.eps of the nearest real type of data input $𝐏$. This corresponds to requiring the norm of the satisfying matrix equation divided by the number of elements to vanish for about half the significant digits.

See: Fisher metric.

See also: geometricMean, powerMean, wasMean, logdet0Mean, mean.

Examples

using LinearAlgebra, PosDefManifold, Plots

# This example shows that this algorithm is more robust to outliers
# as compared to the standard geometric mean algorithm

# Generate a set of 100 random 10x10 SPD matrices
Pset=randP(10, 100)

# Get the usual geometric mean for comparison
G, iter1, conv1 = geometricMean(Pset, verbose=true)

# change p to observe how the convergence behavior changes accordingly
# Get the median (default)
H, iter2, conv2 = geometricpMean(Pset, verbose=true)
# Get the p-mean for p=0.25
H, iter2, conv2 = geometricpMean(Pset, 0.25, verbose=true)

println(iter1, " ", iter2); println(conv1, " ", conv2)

# move the first matrix in Pset to possibly create an outlier
Pset[1]=geodesic(Fisher, G, Pset[1], 3)
G1, iter1, conv1 = geometricMean(Pset, verbose=true)
H1, iter2, conv2 = geometricpMean(Pset, 0.25, verbose=true)
println(iter1, " ", iter2); println(conv1, " ", conv2)

# collect the geometric and p-means, before and after the
# introduction of the outlier, in the vector of Hermitian matrices `S`.
S=HermitianVector([G, G1, H, H1])

# check the inter-distance matrix Δ² to verify that the introduction of the
# outlier has moved the geometric mean (from ``G`` to ``G1``) much more than
# it has moved the p-mean (from ``H`` to ``H1``), i.e., that element (2,1)
# is much larger than element (4,3).
Δ²=distance²Mat(Float64, Fisher, S)

# how far are all these matrices from all the others?
fullΔ²=Hermitian(Δ², :L)
dist=[sum(fullΔ²[:, i]) for i=1:size(fullΔ², 1)]

# plot the matrices in `S` using spectral embedding.
using Plots
Λ, U, iter, conv = laplacianEM(laplacian(Δ²), 3;  verbose=true)
plot([U[1, 1]], [U[1, 2]], seriestype=:scatter, label="g-mean")
plot!([U[2, 1]], [U[2, 2]], seriestype=:scatter, label="g-mean outlier")
plot!([U[3, 1]], [U[3, 2]], seriestype=:scatter, label="p-mean")
plot!([U[4, 1]], [U[4, 2]], seriestype=:scatter, label="p-mean outlier")

# estimate how much you gain running the algorithm in multi-threaded mode
using BenchmarkTools
Pset=randP(20, 120)
@benchmark(geometricpMean(Pset; ⏩=false)) # single-threaded
@benchmark(geometricpMean(Pset)) # multi-threaded

PosDefManifold.logdet0Mean - Function
    logdet0Mean(𝐏::Union{ℍVector, 𝔻Vector};
    <
    w::Vector=[],
    ✓w=true,
    init=nothing,
    tol::Real=0.,
    maxiter::Int=500,
    verbose=false,
    ⏩=true >)

alias: ld0Mean

Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type or real positive definite diagonal matrices of 𝔻Vector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the 3-tuple $(G, iter, conv)$, where $G$ is the mean according to the logdet zero metric and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm. Mean $G$ is the unique positive definite matrix satisfying

$\sum_{i=1}^{k}w_i\big(\frac{1}{2}P_i+\frac{1}{2}G\big)^{-1}-G^{-1}=0$.

For estimating it, this function implements the fixed-point iteration algorithm suggested by (Moakher, 2012, p315)🎓, yielding iterations

$G ← \frac{1}{2}\big(\sum_{i=1}^{k}w_i(P_i+G)^{-1}\big)^{-1}$.
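
For illustration, one unweighted iteration of this fixed-point update could be written as follows; a minimal sketch of the formula, not the package's implementation:

using PosDefManifold
Pset=randP(3, 4)
k=length(Pset)
G=mean(logEuclidean, Pset)              # default initialization
G=inv(2*sum(inv(P+G) for P in Pset)/k)  # one step of the fixed-point update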

If you don't pass a weight vector with <optional keyword argument>$w$, return the unweighted logdet zero mean.

If <optional keyword argument>✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

The following are more <optional keyword arguments>:

  • init is a matrix to be used as initialization for the mean. If no matrix is provided, the log Euclidean mean will be used,
  • tol is the tolerance for the convergence (see below).
  • maxiter is the maximum number of iterations allowed.
  • if verbose=true, the convergence attained at each iteration is printed and a warning is printed if convergence is not attained.
  • if ⏩=true the iterations are multi-threaded (see below).
Nota Bene

Multi-threading is automatically disabled if Julia is instructed to use only one thread. See Threads.

In normal circumstances this algorithm converges monotonically. If the algorithm diverges and verbose is true a warning is printed indicating the iteration when this happened.

$tol$ defaults to 100 times the square root of Base.eps of the nearest real type of data input $𝐏$. This corresponds to requiring the square root of the relative convergence criterion over two successive iterations to vanish for about half the significant digits minus 2.

See: logdet zero metric, modified Bhattacharyya mean.

See also: powerMean, wasMean, logdet0Mean, mean.

Examples

using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)

# unweighted mean
G, iter, conv = logdet0Mean(Pset) # or G, iter, conv = logdet0Mean(𝐏)

# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]

# weighted mean
G, iter, conv = logdet0Mean(Pset, w=weights)

# print the convergence at all iterations
G, iter, conv = logdet0Mean(Pset; w=weights, verbose=true)

# suppose Pset has changed a bit; initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = logdet0Mean(Pset; w=weights, verbose=true, init=G)

# estimate how much you gain running the algorithm in multi-threaded mode
using BenchmarkTools
Pset=randP(20, 120)
@benchmark(logdet0Mean(Pset; ⏩=false)) # single-threaded
@benchmark(logdet0Mean(Pset)) # multi-threaded

PosDefManifold.wasMean - Function
    wasMean(𝐏::Union{ℍVector, 𝔻Vector};
    <
    w::Vector=[],
    ✓w=true,
    init=nothing,
    tol::Real=0.,
    maxiter::Int=500,
    verbose=false,
    ⏩=true >)

Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type or real positive definite diagonal matrices of 𝔻Vector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the 3-tuple $(G, iter, conv)$, where $G$ is the mean according to the Wasserstein metric and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm. Mean $G$ is the unique positive definite matrix satisfying

$G=\sum_{i=1}^{k}w_i\big( G^{1/2} P_i G^{1/2}\big)^{1/2}$.

For estimating it, this function implements the fixed-point iterative algorithm proposed by (Álvarez-Esteban et al., 2016)🎓:

$G ← G^{-1/2}\big(\sum_{i=1}^{k} w_i(G^{1/2}P_i G^{1/2})^{1/2}\big)^2 G^{-1/2}$.
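
For illustration, one unweighted iteration of this fixed-point update could be written as follows; a minimal sketch of the formula, not the package's implementation:

using LinearAlgebra, PosDefManifold
Pset=randP(3, 4)
k=length(Pset)
G=generalizedMean(Pset, 0.5)     # default initialization
Ghalf=sqrt(G); Gihalf=inv(Ghalf) # G^(1/2) and G^(-1/2)
S=sum(sqrt(Hermitian(Ghalf*P*Ghalf)) for P in Pset)/k
G=Hermitian(Gihalf*S^2*Gihalf)   # one step of the update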

If you don't pass a weight vector with <optional keyword argument>$w$, return the unweighted Wasserstein mean.

If <optional keyword argument>✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and they should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

The following are more <optional keyword arguments>:

  • init is a matrix to be used as initialization for the mean. If no matrix is provided, the instance of generalized means with $p=0.5$ will be used,
  • tol is the tolerance for the convergence (see below).
  • maxiter is the maximum number of iterations allowed.
  • if verbose=true, the convergence attained at each iteration is printed and a warning is printed if convergence is not attained.
  • if ⏩=true the iterations are multi-threaded (see below).

If the input is a 1d array of $k$ real positive definite diagonal matrices the solution is available in closed-form as the modified Bhattacharyya mean, hence the <optional keyword arguments> init, tol and verbose have no effect and return the 3-tuple $(G, 1, 0)$. See modified Bhattacharyya mean.

Nota Bene

Multi-threading is automatically disabled if Julia is instructed to use only one thread. See Threads.

In normal circumstances this algorithm converges monotonically. If the algorithm diverges and verbose is true a warning is printed indicating the iteration when this happened.

$tol$ defaults to the square root of Base.eps of the nearest real type of data input $𝐏$. This corresponds to requiring the norm of the satisfying matrix equation divided by the number of elements to vanish for about half the significant digits.

See: Wasserstein metric.

See also: powerMean, wasMean, logdet0Mean, mean.

Examples

using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)

# unweighted mean
G, iter, conv = wasMean(Pset) # or: G, iter, conv = wasMean(𝐏)

# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]

# weighted mean
G, iter, conv = wasMean(Pset; w=weights)

# print the convergence at all iterations
G, iter, conv = wasMean(Pset; w=weights, verbose=true)

# suppose 𝐏 has changed a bit; initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = wasMean(Pset; w=weights, verbose=true, init=G)

# estimate how much you gain running the algorithm in multi-threaded mode
using BenchmarkTools
Pset=randP(20, 120)
@benchmark(wasMean(Pset; ⏩=false)) # single-threaded
@benchmark(wasMean(Pset)) # multi-threaded

PosDefManifold.powerMean - Function
    powerMean(𝐏::Union{ℍVector, 𝔻Vector}, p::Real;
    <
    w::Vector=[],
    ✓w=true,
    init=nothing,
    tol::Real=0.,
    maxiter::Int=500,
    verbose=false,
    ⏩=true >)

Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type or real positive definite diagonal matrices of 𝔻Vector type, an optional non-negative real weights vector $w={w_1,...,w_k}$ and a real parameter $p\in[-1, 1]$, return the 3-tuple $(G, iter, conv)$, where $G$ is Lim and Palfia (2012)'s power mean of order $p$ and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm, respectively. Mean $G$ is the unique positive definite matrix satisfying

$G=\sum_{i=1}^{k}(w_iG\textrm{#}_pP_i)$,

where $G\textrm{#}_pP_i$ is the Fisher geodesic equation. In particular:

  • with $p=-1$ this is the harmonic mean (see the inverse Euclidean metric),
  • with $p=+1$ this is the arithmetic mean (see the Euclidean metric),
  • at the limit of $p$ evaluated at zero from both sides this is the geometric mean (see Fisher metric).

For estimating power means for $p\in(-1, 1)$, this function implements the fixed-point iterative algorithm of (Congedo et al., 2017b)🎓. For $p=0$ (geometric mean) this algorithm is run two times with a small positive and negative value of $p$ and the geometric mean of the two resulting means is returned, as suggested in (Congedo et al., 2017b)🎓. This way of estimating the geometric mean of a set of matrices is faster as compared to the usual gradient descent algorithm.
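
As a quick numerical check of the two extreme cases listed above (a minimal sketch; equality holds up to the convergence tolerance):

using PosDefManifold
Pset=randP(3, 4)
G, iter, conv = powerMean(Pset, 1)
G ≈ mean(Euclidean, Pset)     # p=1: arithmetic mean
H, iter, conv = powerMean(Pset, -1)
H ≈ mean(invEuclidean, Pset)  # p=-1: harmonic mean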

If you don't pass a weight vector with <optional keyword argument>$w$, return the unweighted power mean.

If <optional keyword argument>✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

The following are more <optional keyword arguments>:

  • init is a matrix to be used as initialization for the mean. If no matrix is provided, the instance of generalized means with parameter $p$ will be used.
  • tol is the tolerance for the convergence (see below).
  • maxiter is the maximum number of iterations allowed.
  • if verbose=true, the convergence attained at each iteration is printed and a warning is printed if convergence is not attained.
  • if ⏩=true the iterations are multi-threaded.

If the input is a 1d array of $k$ real positive definite diagonal matrices the solution is available in closed-form as the generalized mean of order p, hence the <optional keyword arguments> init, tol and verbose have no effect and return the 3-tuple $(G, 1, 0)$. See generalized means.

Nota Bene

Multi-threading is automatically disabled if Julia is instructed to use only one thread. See Threads.

In normal circumstances this algorithm converges monotonically. If the algorithm diverges and verbose is true a warning is printed indicating the iteration when this happened.

$tol$ defaults to the square root of Base.eps of the nearest real type of data input $𝐏$. This corresponds to requiring the norm of the difference of the matrix solution over two successive iterations divided by the number of elements in the matrix to vanish for about half the significant digits.

See: power means, generalized means, modified Bhattacharyya mean.

See also: generalizedMean, wasMean, logdet0Mean, mean.

Examples

using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)

# unweighted mean
G, iter, conv = powerMean(Pset, 0.5) # or G, iter, conv = powerMean(𝐏, 0.5)

# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]

# weighted mean
G, iter, conv = powerMean(Pset, 0.5; w=weights)

# print the convergence at all iterations
G, iter, conv = powerMean(Pset, 0.5; w=weights, verbose=true)

# suppose 𝐏 has changed a bit; initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = powerMean(Pset, 0.5; w=weights, verbose=true, init=G)

# estimate how much you gain running the algorithm in multi-threaded mode
using BenchmarkTools
Pset=randP(20, 120)
@benchmark(powerMean(Pset, 0.5; ⏩=false)) # single-threaded
@benchmark(powerMean(Pset, 0.5)) # multi-threaded

PosDefManifold.inductiveMean - Function
(1) inductiveMean(metric::Metric, 𝐏::ℍVector)

(2) inductiveMean(metric::Metric, 𝐏::ℍVector, q::Int, Q::ℍ)

alias: indMean

(1) Compute the Fréchet mean of 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type with a law of large numbers inductive procedure (Ho et al., 2013; Lim and Palfia, 2019; Massart et al., 2018)🎓, such as

$G_1=P_1,$

$G_i=γ(i^{-1}, G_{(i-1)}, P_i), i=2,...,k,$

where $γ(i^{-1}, G_{(i-1)}, P_i)$ is a step on the geodesic joining $G_{(i-1)}$ to $P_i$ with arclength $i^{-1}$, using the specified metric, of type Metric::Enumerated type.

(2) Like (1), but for the set of matrices $𝐐 ∪ 𝐏$, where it is assumed knowledge only of the set $𝐏$, the mean of $𝐐$ (Hermitian matrix argument Q) and the number of matrices in $𝐐$ (integer argument q). This method can be used, for example, for updating a block on-line algorithm, where $𝐏$ is the incoming block, Q the previous mean estimation and q the cumulative number of matrices on which the mean has been computed on-line.

For Fréchet means that do not have a closed form expression, this procedure features a computational complexity amounting to less than two iterations of gradient descent or fixed-point algorithms. This comes at the price of an approximation. In fact, the solution is not invariant to permutations of the matrices in array 𝐏 and convergence to the Fréchet mean for finite samples is not ensured (see Lim and Palfia, 2019; Massart et al., 2018)🎓.

Since the inductive mean uses the geodesic function, it is not available for the Von Neumann metric.
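
For illustration, the inductive procedure in (1) amounts to the following loop; a minimal sketch, not the package's implementation (the name inductiveMeanSketch is hypothetical):

using PosDefManifold
function inductiveMeanSketch(metric::Metric, 𝐏::ℍVector)
    G=𝐏[1]
    for i=2:length(𝐏)
        G=geodesic(metric, G, 𝐏[i], 1/i) # a step of arclength 1/i towards 𝐏[i]
    end
    return G
end
inductiveMeanSketch(Fisher, randP(10, 100))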

Examples

using PosDefManifold, LinearAlgebra

# A set of 100 matrices for which we want to compute the mean
𝐏=randP(10, 100)

𝐏1=ℍVector(collect(𝐏[i] for i=1:50)) # first 50
𝐏2=ℍVector(collect(𝐏[i] for i=51:100)) # last 50

# inductive mean of the whole set 𝐏
G=inductiveMean(Fisher, 𝐏)

# mean using the usual gradient descent algorithm
H, iter, conv=geometricMean(𝐏)

# inductive mean of 𝐏 given only 𝐏2,
# the number of matrices in 𝐏1 and the mean of 𝐏1
G2=inductiveMean(Fisher, 𝐏2, length(𝐏1), mean(Fisher, 𝐏1))

# average error
norm(G-H)/(dim(G, 1)^2)
norm(G2-H)/(dim(G, 1)^2)

PosDefManifold.midrange - Function
midrange(metric::Metric, P::ℍ{T}, Q::ℍ{T}) where T<:RealOrComplex

Midrange (average of extremal values) of positive definite matrices $P$ and $Q$. Only the Fisher metric is supported, allowing the so-called geometric midrange. This has been defined in Mostajeran et al. (2019) 🎓 as

$P * Q = \frac{1}{\sqrt{\lambda_{min}}+\sqrt{\lambda_{max}}}\Big(Q+\sqrt{\lambda_{min}\lambda_{max}}P\Big)$,

where $\lambda_{min}$ and $\lambda_{max}$ are the extremal generalized eigenvalues of $P$ and $Q$.

Examples

using PosDefManifold
P=randP(3)
Q=randP(3)
M=midrange(Fisher, P, Q)

Tangent Space operations

Function | Description
logMap | Logarithmic map (from manifold to tangent space)
expMap | Exponential map (from tangent space to manifold)
vecP | Vectorization of matrices in the tangent space
matP | Matrization of matrices in the tangent space (inverse of vecP)
parallelTransport, pt | Parallel transport of tangent vectors and matrices

PosDefManifold.logMap - Function
(1) logMap(metric::Metric, P::ℍ{T}, G::ℍ{T})

(2) logMap(metric::Metric, 𝐏::ℍVector, G::ℍ{T})
for all the above: where T<:RealOrComplex

(1) Logarithmic map: map a positive definite matrix $P$ from the SPD or Hermitian manifold into the tangent space at base-point $G$ using the Fisher metric.

$P$ and $G$ must be flagged as Hermitian. See typecasting matrices.

The map is defined as

$Log_G(P)=S=G^{1/2}\textrm{log}\big(G^{-1/2}PG^{-1/2}\big)G^{1/2}$.

metric is a metric of type Metric::Enumerated type.

The result is an Hermitian matrix.

(2) Logarithmic map (1) at base-point $G$ at once for $k$ positive definite matrices in 1d array $𝐏={P_1,...,P_k}$ of ℍVector type.

The result is an ℍVector.

Nota Bene

Currently only the Fisher metric is supported for tangent space operations.

The inverse operation is expMap.

See also: vecP, parallelTransport.

Examples

using PosDefManifold
(1)
P=randP(3)
Q=randP(3)
metric=Fisher
G=mean(metric, P, Q)
# projecting P at the base point given by the geometric mean of P and Q
S=logMap(metric, P, G)

(2)
Pset=randP(3, 4)
# projecting all matrices in Pset at the base point given by their geometric mean.
Sset=logMap(Fisher, Pset, mean(Fisher, Pset))

PosDefManifold.expMap - Function
(1) expMap(metric::Metric, S::ℍ{T}, G::ℍ{T})

(2) expMap(metric::Metric, 𝐒::ℍVector, G::ℍ{T})
for all the above: where T<:RealOrComplex

(1) Exponential Map: map a tangent vector (a matrix) $S$ from the tangent space at base-point $G$ into the SPD or Hermitian manifold (using the Fisher metric).

$S$ and $G$ must be flagged as Hermitian. See typecasting matrices.

The map is defined as

$Exp_G(S)=P=G^{1/2}\textrm{exp}\big(G^{-1/2}SG^{-1/2}\big)G^{1/2}$.

metric is a metric of type Metric::Enumerated type.

The result is an Hermitian matrix.

(2) Exponential map (1) at base-point $G$ at once for $k$ tangent vectors (matrices) in 1d array $𝐒={S_1,...,S_k}$ of ℍVector type.

The result is an ℍVector.

Nota Bene

Currently only the Fisher metric is supported for tangent space operations.

The inverse operation is logMap.

Examples

(1)
using PosDefManifold, LinearAlgebra
P=randP(3)
Q=randP(3)
G=mean(Fisher, P, Q)
# projecting P on the tangent space at the Fisher mean base point G
S=logMap(Fisher, P, G)
# projecting back onto the manifold
P2=expMap(Fisher, S, G)

(2)
Pset=randP(3, 4)
# projecting all matrices in Pset at the base point given by their geometric mean.
G=mean(Fisher, Pset)
Sset=logMap(Fisher, Pset, G)
# projecting back onto the manifold
Pset2=expMap(Fisher, Sset, G)
PosDefManifold.vecPFunction
vecP(S::Union{ℍ{T}, Symmetric{R}};
     range::UnitRange=1:size(S, 2)) where T<:RealOrComplex where R<:Real

Vectorize a tangent vector (which is an Hermitian or Symmetric matrix) $S$: mat ↦ vec.

It gives weight $1$ to diagonal elements and $√2$ to off-diagonal elements so as to preserve the norm (Barachant et al., 2012)🎓, such that

$∥S∥_F=∥vecP(S)∥_F$.

The result is a vector holding $n(n+1)/2$ elements, where $n$ is the size of $S$.

$S$ must be flagged as Hermitian or Symmetric. See typecasting matrices.

The reverse operation is provided by matP, which always returns an Hermitian matrix.

If the optional keyword argument range is provided, the vectorization concerns only the rows (or columns, since the input matrix is symmetric or Hermitian) in the range. Note that in this case the operation cannot be reverted by matP, that is, the matrix is 'stuck' in the tangent space.

Examples

using PosDefManifold
P=randP(3)
Q=randP(3)
G=mean(Fisher, P, Q)
# projecting P at the base point given by the geometric mean of P and Q
S=logMap(Fisher, P, G)
# vectorize S
v=vecP(S)
# vectorize only the first two columns of S
v=vecP(S; range=1:2)
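
# A quick numerical check of the norm-preservation property stated above (a minimal sketch)
using PosDefManifold, LinearAlgebra
P=randP(3)
G=mean(Fisher, P, randP(3))
S=logMap(Fisher, P, G)
# the Frobenius norm of S equals the Euclidean norm of its vectorization
norm(S) ≈ norm(vecP(S)) ? println(" ⭐ ") : println(" ⛔ ")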
PosDefManifold.matPFunction
matP(ς::Vector{T}) where T<:RealOrComplex

Matricize a tangent vector (a vector) $ς$: vec ↦ mat.

This is the function reversing the vecP function, thus the weighting applied therein is reversed as well.

If $ς=vecP(S)$ and $S$ is an $n⋅n$ Hermitian or Symmetric matrix, $ς$ is a tangent vector of size $n(n+1)/2$. The result of calling matP(ς) is then the $n⋅n$ matrix $S$. $S$ is always returned flagged as Hermitian.

To Do: This function may be rewritten more efficiently.

Examples

using PosDefManifold
P=randP(3)
Q=randP(3)
G=mean(Fisher, P, Q)
# projecting P onto the tangent space at the Fisher mean base point G
S=logMap(Fisher, P, G)
# vectorize S
v=vecP(S)
# Rotate the vector by an orthogonal matrix
n=Int(size(S, 1)*(size(S, 1)+1)/2)
U=randU(n) # random orthogonal matrix
z=U*v
# Get the point in the tangent space
S=matP(z)
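
# A round-trip check of the statement that matP reverses vecP (a minimal sketch)
using PosDefManifold, LinearAlgebra
P=randP(3)
G=mean(Fisher, P, randP(3))
S=logMap(Fisher, P, G)
# matP undoes vecP, including the √2 weighting of the off-diagonal elements
matP(vecP(S)) ≈ S ? println(" ⭐ ") : println(" ⛔ ")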
PosDefManifold.parallelTransportFunction
(1) parallelTransport(S::ℍ{T}, P::ℍ{T}, Q::ℍ{T})

(2) parallelTransport(S::ℍ{T}, P::ℍ{T})

(3) parallelTransport(𝐒::ℍVector, P::ℍ{T}, Q::ℍ{T})

(4) parallelTransport(𝐒::ℍVector, P::ℍ{T})
for all the above: where T<:RealOrComplex

alias: pt

(1) Parallel transport of tangent vector $S$ (a matrix) lying on the tangent space at base-point $P$ to the tangent space at base-point $Q$.

$S$, $P$ and $Q$ must all be Hermitian matrices. Return an Hermitian matrix. The transport is defined as:

$∥_{(P→Q)}(S)=\big(QP^{-1}\big)^{1/2}S\big(QP^{-1}\big)^{H/2}$.

If $S$ is a positive definite matrix in the manifold (and not a tangent vector) it will be 'transported' from $P$ to $Q$, amounting to (Yair et al., 2019🎓):

  • project $S$ onto the tangent space at base-point $P$,
  • parallel transport it to the tangent space at base-point $Q$,
  • project it back onto the manifold at base-point $Q$.

(2) Parallel transport as in (1), but to the tangent space at the identity matrix as base-point.

The transport reduces in this case to:

$∥_{(P→I)}(S)=P^{-1/2}SP^{-1/2}$.

(3) Parallel transport as in (1) at once for $k$ tangent vectors (matrices) in 1d array $𝐒={S_1,...,S_k}$ of ℍVector type.

(4) Parallel transport as in (2) at once for $k$ tangent vectors (matrices) in 1d array $𝐒={S_1,...,S_k}$ of ℍVector type.

Nota Bene

Currently only the Fisher metric is supported for parallel transport.

See also: logMap, expMap, vecP, matP.

Examples

using PosDefManifold

(1)
P=randP(3)
Q=randP(3)
G=mean(Fisher, P, Q)

# i. projecting P onto the tangent space at base-point G
S=logMap(Fisher, P, G)
# ii. parallel transport S to the tangent space at base-point Q
S_=parallelTransport(S, G, Q)
# iii. projecting back into the manifold at base-point Q
P_=expMap(Fisher, S_, Q)

# i., ii. and iii. can be done simply by
PP_=parallelTransport(P, G, Q)
# check
P_≈PP_ ? println(" ⭐ ") : println(" ⛔ ")

(2)
P=randP(3)
Q=randP(3)
G=mean(Fisher, P, Q)
# transport to the tangent space at the identity base-point
PP_=parallelTransport(P, G)

(3)
Pset=randP(3, 4)
Q=randP(3)
G=mean(Fisher, Pset)
# transport at once all matrices in Pset
Pset2=parallelTransport(Pset, G, Q)

(4)
Pset=randP(3, 4)
G=mean(Fisher, Pset)
# recenter all matrices so that their mean is the identity
Pset2=parallelTransport(Pset, G)
# check
mean(Fisher, Pset2) ≈ I ? println(" ⭐ ") : println(" ⛔ ")

Procrustes problems

FunctionDescription
procrustesSolution to the Procrustes problem in the manifold of positive definite matrices

PosDefManifold.procrustesFunction
procrustes(P::ℍ{T}, Q::ℍ{T}, extremum="min") where T<:RealOrComplex

Given two positive definite matrices $P$ and $Q$, return by default the solution of the problem

$\textrm{argmin}_Uδ(P,U^HQU)$,

where $U$ varies over the set of unitary matrices and $δ(.,.)$ is a distance or divergence function.

In physics, $U^HQU$ is called the unitary orbit of $Q$.

If the argument extremum is passed as "max", it returns instead the solution of

$\textrm{argmax}_Uδ(P,U^HQU)$.

$P$ and $Q$ must be flagged as Hermitian. See typecasting matrices.

As it has been shown in Bhatia and Congedo (2019)🎓, using each of the Fisher, logdet zero and Wasserstein metrics, as well as the Kullback-Leibler divergence (see logdet α), the best approximant to $P$ from the unitary orbit of $Q$ commutes with $P$ and, surprisingly, has the same closed-form expression, namely

$U_Q^↓U_P^{↓H}$ for the argmin and $U_Q^↑U_P^{↓H}$ for the argmax,

where $U^↓$ denotes the eigenvector matrix of the subscript argument with eigenvectors (in columns) sorted by decreasing order of the corresponding eigenvalues, and $U^↑$ the same matrix with eigenvectors sorted by increasing order of the corresponding eigenvalues.

The same solutions have long been known from solving the extremal problem above using the Euclidean metric (Umeyama, 1988).

The generalized Procrustes problem

$\textrm{argmin}_U\sum_{i=1}^{k}δ(P_i,U^HQ_iU)$

can be solved using Julia package Manopt.

Examples

using PosDefManifold
P=randP(3)
Q=randP(3)
# argmin problem
U=procrustes(P, Q)
# argmax problem
V=procrustes(P, Q, "max")