## CanDecomp.jl: Candecomp/Parafac Tensor Decomposition

**CanDecomp** is a module required by NTFk.
For more information, visit tensors.lanl.gov

### Installation

After starting Julia, execute:

```julia
import Pkg; Pkg.add("CanDecomp")
```

to access the latest released version. To use the latest updates (commits), execute:

```julia
import Pkg; Pkg.add(Pkg.PackageSpec(name="CanDecomp", rev="master"))
```

## Docker

```
docker run --interactive --tty montyvesselinov/tensors
```

### Examples

A simple problem demonstrating **CanDecomp** can be executed as follows.
First, generate a random CP (Candecomp/Parafac) tensor:

```julia
import CanDecomp
A = rand(2, 3)
B = rand(5, 3)
C = rand(10, 3)
T_orig = CanDecomp.totensor(A, B, C)
```
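
For readers unfamiliar with the CP model, the sketch below shows what a rank-3 CP tensor looks like when built by hand from three factor matrices. It assumes the standard CP definition, `T[i, j, k] = sum over r of A[i, r] * B[j, r] * C[k, r]`; the helper `cp_tensor` is hypothetical and is not part of CanDecomp, and whether `CanDecomp.totensor` uses exactly this index ordering is an assumption.

```julia
# Hypothetical helper (not part of CanDecomp): build a CP tensor from three
# factor matrices that share the same number of columns (the CP rank).
function cp_tensor(A::AbstractMatrix, B::AbstractMatrix, C::AbstractMatrix)
    size(A, 2) == size(B, 2) == size(C, 2) ||
        throw(DimensionMismatch("all factors need the same number of columns"))
    T = zeros(size(A, 1), size(B, 1), size(C, 1))
    # Sum of rank-one terms: one outer product per factor column r
    for r in axes(A, 2), k in axes(C, 1), j in axes(B, 1), i in axes(A, 1)
        T[i, j, k] += A[i, r] * B[j, r] * C[k, r]
    end
    return T
end

A = rand(2, 3)
B = rand(5, 3)
C = rand(10, 3)
T = cp_tensor(A, B, C)  # a 2 x 5 x 10 array
```

Each tensor dimension matches the number of rows of the corresponding factor, and the shared number of columns is the number of features.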

Then generate random initial guesses for the tensor factors:

```julia
Af = rand(size(A)...);
Bf = rand(size(B)...);
Cf = rand(size(C)...);
```

Execute **CanDecomp** to estimate the tensor factors based only on the tensor `T_orig`:

```julia
import StaticArrays
CanDecomp.candecomp!(StaticArrays.SVector(Af, Bf, Cf), T_orig, Val{:nnoptim}; regularization=1e-3, print_level=0, max_cd_iters=1000)
```

Construct the estimated tensor:

```julia
T_est = CanDecomp.totensor(Af, Bf, Cf);
```

Compare the estimated and the original tensors:

```julia
import NTFk
import LinearAlgebra
@info("Norm $(LinearAlgebra.norm(T_est .- T_orig))")
NTFk.plot2matrices(A, Af; progressbar=nothing)
NTFk.plot2matrices(B, Bf; progressbar=nothing)
NTFk.plot2matrices(C, Cf; progressbar=nothing)
NTFk.plotlefttensor(T_orig, T_est; progressbar=nothing)
```
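
The absolute norm reported above depends on the magnitude of the tensor entries, so a relative misfit can be easier to interpret. The following is a minimal sketch using only `LinearAlgebra`; the helper `relative_error` and the synthetic arrays are hypothetical stand-ins for `T_est` and `T_orig`.

```julia
import LinearAlgebra

# Hypothetical helper: Frobenius-norm misfit scaled by the norm of the reference.
relative_error(T_est, T_ref) = LinearAlgebra.norm(T_est .- T_ref) / LinearAlgebra.norm(T_ref)

# Synthetic stand-ins for the original and the estimated tensor
T_ref = rand(2, 5, 10)
T_noisy = T_ref .+ 1e-3 .* randn(size(T_ref)...)

println("Relative error: ", relative_error(T_noisy, T_ref))
```

A value close to zero indicates that the estimated factors reproduce the original tensor well.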

### Tensor Decomposition

**NTFk** performs a novel unsupervised Machine Learning (ML) method based on Tensor Decomposition coupled with sparsity and nonnegativity constraints.

**NTFk** has been applied to extract the temporal and spatial footprints of the features in multi-dimensional datasets in the form of multi-way arrays or tensors.

**NTFk** executes the decomposition (factorization) of a given tensor $X$ by minimization of the Frobenius norm:

$$ \frac{1}{2} \| X - G \otimes_1 A_1 \otimes_2 A_2 \dots \otimes_N A_N \|_F^2 $$

where:

- $N$ is the dimensionality of the tensor $X$
- $G$ is a "mixing" core tensor
- $A_1, A_2, \dots, A_N$ are "feature" factors (in the form of vectors or matrices)
- $\otimes$ is a tensor product applied to fold-in factors $A_1, A_2, \dots, A_N$ in each of the tensor dimensions

The product $G \otimes_1 A_1 \otimes_2 A_2 \dots \otimes_N A_N$ is an estimate of $X$ ($X_{est}$).

The reconstruction error $X - X_{est}$ is expected to be random uncorrelated noise.

$G$ is an $N$-dimensional tensor with a size and a rank lower than the size and the rank of $X$. The size of tensor $G$ defines the number of extracted features (signals) in each of the tensor dimensions.

The factor matrices $A_1, A_2, \dots, A_N$ represent the extracted features (signals) in each of the tensor dimensions. The number of matrix columns equals the number of features in the respective tensor dimensions (if there is only 1 column, the particular factor is a vector). The number of matrix rows in each factor (matrix) equals the size of tensor $X$ in the respective dimensions.

The elements of tensor $G$ define how the features along each dimension ($A_1, A_2, \dots, A_N$) are mixed to represent the original tensor $X$.
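
To make the role of the core tensor and the factor matrices concrete, here is a minimal sketch of the three-dimensional case in plain Julia. It assumes the usual Tucker form, in which every entry of the estimate is a sum over all core entries weighted by one row of each factor matrix. The names `tucker_tensor` and `objective` are hypothetical and are not taken from NTFk or CanDecomp, and the factor of 1/2 in the objective is a common convention that does not change the minimizer.

```julia
import LinearAlgebra

# Hypothetical helper: 3-D Tucker reconstruction from a core tensor G and
# factor matrices A1, A2, A3 (columns of each factor index into G).
function tucker_tensor(G::AbstractArray{<:Real,3}, A1, A2, A3)
    X = zeros(size(A1, 1), size(A2, 1), size(A3, 1))
    for r in axes(G, 3), q in axes(G, 2), p in axes(G, 1)
        for k in axes(A3, 1), j in axes(A2, 1), i in axes(A1, 1)
            X[i, j, k] += G[p, q, r] * A1[i, p] * A2[j, q] * A3[k, r]
        end
    end
    return X
end

# Half of the squared Frobenius norm of the reconstruction error
objective(X, G, A1, A2, A3) =
    0.5 * LinearAlgebra.norm(X .- tucker_tensor(G, A1, A2, A3))^2

G = rand(2, 3, 4)                  # core: 2, 3, and 4 features per dimension
A1 = rand(6, 2); A2 = rand(7, 3); A3 = rand(8, 4)
X = tucker_tensor(G, A1, A2, A3)   # a 6 x 7 x 8 tensor
```

Here the core size (2, 3, 4) sets the number of features per dimension, while the row counts of the factors (6, 7, 8) set the size of the reconstructed tensor.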

**NTFk** can perform Tensor Decomposition using Candecomp/Parafac (CP) or Tucker decomposition models.
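
The two models are closely related: a CP decomposition can be viewed as a Tucker decomposition whose core tensor is superdiagonal (ones on the diagonal, zeros elsewhere). The sketch below checks this numerically for three dimensions; all function names are hypothetical and serve only as an illustration.

```julia
# Hypothetical helpers, written as comprehensions for brevity.
cp(A, B, C) = [sum(A[i, r] * B[j, r] * C[k, r] for r in axes(A, 2))
               for i in axes(A, 1), j in axes(B, 1), k in axes(C, 1)]

tucker(G, A, B, C) = [sum(G[p, q, r] * A[i, p] * B[j, q] * C[k, r]
                          for p in axes(G, 1), q in axes(G, 2), r in axes(G, 3))
                      for i in axes(A, 1), j in axes(B, 1), k in axes(C, 1)]

# R x R x R core with ones on the superdiagonal
function superdiagonal_core(R::Integer)
    G = zeros(R, R, R)
    for r in 1:R
        G[r, r, r] = 1.0
    end
    return G
end

A = rand(2, 3); B = rand(5, 3); C = rand(10, 3)
same = cp(A, B, C) ≈ tucker(superdiagonal_core(3), A, B, C)
println("CP equals Tucker with a superdiagonal core: ", same)
```

With a dense core, the Tucker model additionally allows different features to interact across dimensions, which the CP model does not.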

Some of the decomposition models can theoretically lead to unique solutions under specific, albeit rarely satisfied, noiseless conditions. When these conditions are not satisfied, additional minimization constraints can assist the factorization. A popular approach is to add sparsity and nonnegativity constraints. Sparsity constraints on the elements of $G$ reduce the number of features and their mixing (by having as many zero entries as possible). Nonnegativity enforces a parts-based representation of the original data, which also allows the Tensor Decomposition results for the core tensor $G$ and the factor matrices to be easily interrelated (Cichocki et al., 2009).
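
One common way to express such constraints, shown here only as an illustrative sketch, is an L1 penalty on the core entries (to encourage zeros) combined with a projection onto nonnegative values. How NTFk implements its constraints is not stated here, so the helpers `sparsity_penalty` and `project_nonnegative!` and the weight `lambda` are assumptions.

```julia
# Hypothetical helper: L1 penalty that grows with the magnitude of the core
# entries, so minimizing it pushes entries toward zero.
sparsity_penalty(G, lambda) = lambda * sum(abs, G)

# Hypothetical helper: clip negative entries to zero in place.
function project_nonnegative!(M::AbstractArray)
    M .= max.(M, 0)
    return M
end

G = [1.0 -2.0; 0.0 3.0]
println(sparsity_penalty(G, 0.5))   # 0.5 * (1 + 2 + 0 + 3)
println(project_nonnegative!(G))    # the entry -2.0 becomes 0.0
```

In an iterative solver, the penalty would be added to the misfit being minimized and the projection applied after each update of the core or the factors.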

### Publications:

- Vesselinov, V.V., Mudunuru, M., Karra, S., O'Malley, D., Alexandrov, B.S., Unsupervised Machine Learning Based on Non-Negative Tensor Factorization for Analyzing Reactive-Mixing, 10.1016/j.jcp.2019.05.039, Journal of Computational Physics, 2019. PDF
- Vesselinov, V.V., Alexandrov, B.S., O'Malley, D., Nonnegative Tensor Factorization for Contaminant Source Identification, Journal of Contaminant Hydrology, 10.1016/j.jconhyd.2018.11.010, 2018. PDF
- O'Malley, D., Vesselinov, V.V., Alexandrov, B.S., Alexandrov, L.B., Nonnegative/binary matrix factorization with a D-Wave quantum annealer, PLoS ONE, 10.1371/journal.pone.0206653, 2018. PDF
- Stanev, V., Vesselinov, V.V., Kusne, A.G., Antoszewski, G., Takeuchi, I., Alexandrov, B.A., Unsupervised Phase Mapping of X-ray Diffraction Data by Nonnegative Matrix Factorization Integrated with Custom Clustering, Nature Computational Materials, 10.1038/s41524-018-0099-2, 2018. PDF
- Iliev, F.L., Stanev, V.G., Vesselinov, V.V., Alexandrov, B.S., Nonnegative Matrix Factorization for identification of unknown number of sources emitting delayed signals, PLoS ONE, 10.1371/journal.pone.0193974, 2018. PDF
- Stanev, V.G., Iliev, F.L., Hansen, S.K., Vesselinov, V.V., Alexandrov, B.S., Identification of the release sources in advection-diffusion system by machine learning combined with Green function inverse method, Applied Mathematical Modelling, 10.1016/j.apm.2018.03.006, 2018. PDF
- Vesselinov, V.V., O'Malley, D., Alexandrov, B.S., Contaminant source identification using semi-supervised machine learning, Journal of Contaminant Hydrology, 10.1016/j.jconhyd.2017.11.002, 2017. PDF
- Alexandrov, B., Vesselinov, V.V., Blind source separation for groundwater level analysis based on non-negative matrix factorization, Water Resources Research, 10.1002/2013WR015037, 2014. PDF

Research papers are also available at Google Scholar, ResearchGate and Academia.edu

### Presentations:

- Vesselinov, V.V., Physics-Informed Machine Learning Methods for Data Analytics and Model Diagnostics, M3 NASA DRIVE Workshop, Los Alamos, 2019. PDF
- Vesselinov, V.V., Unsupervised Machine Learning Methods for Feature Extraction, New Mexico Big Data & Analytics Summit, Albuquerque, 2019. PDF
- Vesselinov, V.V., Novel Unsupervised Machine Learning Methods for Data Analytics and Model Diagnostics, Machine Learning in Solid Earth Geoscience, Santa Fe, 2019. PDF
- Vesselinov, V.V., Novel Machine Learning Methods for Extraction of Features Characterizing Datasets and Models, AGU Fall meeting, Washington D.C., 2018. PDF
- Vesselinov, V.V., Novel Machine Learning Methods for Extraction of Features Characterizing Complex Datasets and Models, Recent Advances in Machine Learning and Computational Methods for Geoscience, Institute for Mathematics and its Applications, University of Minnesota, 10.13140/RG.2.2.16024.03848, 2018. PDF
- Vesselinov, V.V., Mudunuru, M., Karra, S., O'Malley, D., Alexandrov, Unsupervised Machine Learning Based on Non-negative Tensor Factorization for Analysis of Field Data and Simulation Outputs, Computational Methods in Water Resources (CMWR), Saint-Malo, France, 10.13140/RG.2.2.27777.92005, 2018. PDF
- O'Malley, D., Vesselinov, V.V., Alexandrov, B.S., Alexandrov, L.B., Nonnegative/binary matrix factorization with a D-Wave quantum annealer PDF
- Vesselinov, V.V., Alexandrov, B.A, Model-free Source Identification, AGU Fall Meeting, San Francisco, CA, 2014. PDF

Presentations are also available at slideshare.net, ResearchGate and Academia.edu

### Videos:

- Progress of nonnegative matrix factorization process:

Videos are also available at YouTube

### Patent:

Alexandrov, B.S., Vesselinov, V.V., Alexandrov, L.B., Stanev, V., Iliev, F.L., Source identification by non-negative matrix factorization combined with semi-supervised clustering, US20180060758A1

For more information, visit monty.gitlab.io

## Installation behind a firewall

Julia uses git for package management. Add the following to the `.gitconfig` file in your home directory:

```
[url "https://"]
insteadOf = git://
```

or execute:

```
git config --global url."https://".insteadOf git://
```

Julia uses git and curl to install packages. Set proxies:

```
export ftp_proxy=http://proxyout.<your_site>:8080
export rsync_proxy=http://proxyout.<your_site>:8080
export http_proxy=http://proxyout.<your_site>:8080
export https_proxy=http://proxyout.<your_site>:8080
export no_proxy=.<your_site>
```

For example, if you are doing this at LANL, you will need to execute the following lines in your bash command-line environment:

```
export ftp_proxy=http://proxyout.lanl.gov:8080
export rsync_proxy=http://proxyout.lanl.gov:8080
export http_proxy=http://proxyout.lanl.gov:8080
export https_proxy=http://proxyout.lanl.gov:8080
export no_proxy=.lanl.gov
```
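
The same variables can also be set from within a Julia session before installing packages, for example in a startup script. The sketch below mirrors the shell exports above; the helper `set_proxies!` is hypothetical, and it assumes your proxy host follows the `proxyout.<your_site>` pattern on port 8080.

```julia
# Hypothetical helper: set the proxy variables for the current Julia process.
function set_proxies!(site::AbstractString; port::Integer=8080)
    proxy = "http://proxyout.$(site):$(port)"
    for name in ("ftp_proxy", "rsync_proxy", "http_proxy", "https_proxy")
        ENV[name] = proxy
    end
    ENV["no_proxy"] = "." * site
    return proxy
end

set_proxies!("lanl.gov")  # replace "lanl.gov" with your own site

# Afterwards, install as usual:
# import Pkg; Pkg.add("CanDecomp")
```

Variables set this way apply only to the running Julia process and to the programs it launches.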