# CompressiveLearning.jl
A Julia package for compressive (a.k.a. sketched) learning, i.e., efficiently performing large-scale learning tasks by first compressing the whole dataset into a single small vector of generalized random moments (the sketch). Currently supported learning tasks are clustering, PCA, and Gaussian mixture modeling with diagonal covariances.
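To fix ideas, here is a minimal, self-contained illustration of the underlying idea using random Fourier features; this is conceptual only and does not use the package's API (all names and sizes below are made up):

```julia
# Illustrative computation of a sketch of random Fourier moments,
# s = (1/n) Σᵢ exp(i Ω xᵢ), for a Gaussian kernel (NOT the package's API).
d, n, m = 10, 10_000, 200        # data dimension, number of samples, sketch size
X = randn(d, n)                  # toy dataset, one sample per column
σ = 1.0                          # kernel scale (square root of the kernel variance)
Ω = randn(m, d) ./ σ             # random frequencies drawn for a Gaussian kernel
s = vec(sum(cis.(Ω * X), dims=2)) ./ n   # the m-dimensional complex sketch
```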
## Supported features
- Sketching with:
  - random, quantized, or squared Fourier features;
  - Nyström features with uniform, approximate leverage score (ALS), and DPP sampling.
- Generic `CLOMPR` function to solve the inverse problem (learning from the sketch) [5].
- Practical implementations of CLOMPR for clustering (working well [6]) and GMM fitting (not extensively tested).
- CL-AMP algorithm (for clustering with random features only) [3].
- Various optimization routines for PCA.
- Fast transforms for high-dimensional settings [2] (at least for k-means/PCA).
- Differentially-private sketching operators [4].
## Installation
To install the package, run `] add CompressiveLearning` from the Julia REPL.
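Equivalently, from within a Julia session:

```julia
using Pkg
Pkg.add("CompressiveLearning")   # same as `] add CompressiveLearning` in the REPL
```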
Some functionalities of the package rely on additional dependencies:
- Nyström features with leverage score sampling rely on the BLESS python library (https://github.com/LCSL/bless). `PyCall` must be manually loaded, and the python dependencies `sklearn` and `numpy` installed for the python version used by PyCall (BLESS itself is already included); see the sketch after this list.
- Nyström features with DPP sampling require `PyCall` and the python dependencies `dppy`, `sklearn` and `falkon` (to install from https://github.com/FalkonML/falkon).
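For example, a session using these optional features might start as follows (a minimal sketch; it assumes the python packages are already installed for the python version used by PyCall):

```julia
using PyCall              # must be loaded manually before using these features
pyimport("numpy")         # quick check that the python dependencies are visible
pyimport("sklearn")
using CompressiveLearning
```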
## Usage
Main wrappers (see the respective docstrings) are `CKM`, `CPCA` and `CGMM`. All of these methods support the keyword arguments of `skops_pair`, which control the usage of differential privacy, fast transforms, the choice of the kernel, etc.

The most critical hyperparameter is the `kernel_variance`, which should ideally be chosen on the order of the squared minimum inter-cluster distance.
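For instance, compressive k-means might be invoked as follows; this is a hedged sketch, and the exact positional arguments, data layout, and defaults of `CKM` should be checked against its docstring:

```julia
using CompressiveLearning

X = randn(5, 10_000)   # toy dataset (one sample per column is assumed here)
k = 10                 # number of clusters

# Assumed call pattern; `kernel_variance` is the hyperparameter discussed above.
centroids = CKM(X, k; kernel_variance = 1.0)
```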
Main algorithms to learn from the sketch: `CLOMPR` (continuous orthogonal matching pursuit, more stable) and `CLAMP` (approximate message passing; for clustering only, works with smaller sketch sizes but is less stable).
## Documentation
Go to `docs/` and run `make.jl`. The documentation will be generated in a `build` folder, which can e.g. be rendered using `LiveServer` as follows: `using LiveServer; serve(dir="build")`.
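Putting both steps together, and assuming `make.jl` is a standard Documenter.jl build script that can be included directly:

```julia
# Run from within the docs/ folder:
include("make.jl")     # builds the documentation into docs/build

using LiveServer
serve(dir="build")     # serve the generated pages locally
```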
## Tests
Unit tests are contained in the `test/` folder. The main file is `runtests.jl`, and tests relying on python dependencies are located in the `test_with_deps.jl` file.
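They can be run with the standard Julia package mechanism:

```julia
using Pkg
Pkg.test("CompressiveLearning")   # executes test/runtests.jl
```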
## Support
Please use the GitLab repository directly for support and pull requests.
## Relevant publications
[1] A. Chatalic, *Efficient and Privacy-Preserving Compressive Learning*, PhD Thesis, Université de Rennes 1, 2020.

[2] A. Chatalic, R. Gribonval, N. Keriven, *Large-Scale High-Dimensional Clustering with Fast Sketching*, ICASSP, 2018.

[3] E. Byrne, A. Chatalic, R. Gribonval, P. Schniter, *Sketched Clustering via Hybrid Approximate Message Passing*, IEEE Transactions on Signal Processing, 2019.

[4] V. Schellekens, A. Chatalic, F. Houssiau, Y.-A. De Montjoye, L. Jacques, R. Gribonval, *Differentially Private Compressive K-Means*, ICASSP, 2019.

[5] N. Keriven, A. Bourrier, R. Gribonval, P. Pérez, *Sketching for Large-Scale Learning of Mixture Models*, Information and Inference: A Journal of the IMA, 2017.

[6] N. Keriven, N. Tremblay, Y. Traonmilin, R. Gribonval, *Compressive K-Means*, ICASSP, 2017.
## TODOs
- Set up documentation online