GenerativeTopographicMapping
Documentation for GenerativeTopographicMapping.
GenerativeTopographicMapping.GTM
GenerativeTopographicMapping.GenerativeTopographicMap
GenerativeTopographicMapping.Posterior
GenerativeTopographicMapping.Responsabilities
GenerativeTopographicMapping.estimateLogLikelihood
GenerativeTopographicMapping.fit!
GenerativeTopographicMapping.getCoordsMatrix
GenerativeTopographicMapping.getDMatrix
GenerativeTopographicMapping.getGMatrix
GenerativeTopographicMapping.getUMatrix
GenerativeTopographicMapping.getYMatrix
GenerativeTopographicMapping.get_means
GenerativeTopographicMapping.get_modes
GenerativeTopographicMapping.getΦMatrix
GenerativeTopographicMapping.initWMatrix
GenerativeTopographicMapping.initializeVariance
GenerativeTopographicMapping.initβ⁻¹
GenerativeTopographicMapping.updateBeta!
GenerativeTopographicMapping.updateW!
GenerativeTopographicMapping.GTM
— Type
GTM
A model type for constructing a generative topographic map, based on GenerativeTopographicMapping.jl, and implementing the MLJ model interface.
From MLJ, the type can be imported using
GTM = @load GTM pkg=GenerativeTopographicMapping
Do model = GTM() to construct an instance with default hyper-parameters.
GenerativeTopographicMapping implements the Generative Topographic Mapping algorithm described in Bishop, C. (1998), "GTM: The Generative Topographic Mapping", Neural Computation.
Training data
In MLJ or MLJBase, bind an instance model to data with mach = machine(model, X) where
X: an AbstractMatrix or Table of input features whose columns are of scitype Continuous.
Train the machine with fit!(mach, rows=...).
Hyper-parameters
k=16: Number of nodes along one side of the GTM latent grid. There are k² total nodes.
m=4: Square root of the number of RBF functions in the latent transformation. There are m² total RBFs.
σ=0.3: Standard deviation for RBF functions in the latent transformation.
α=0.1: Model weight regularization parameter (0.0 for no regularization).
tol=0.0001: Tolerance used for determining convergence during expectation-maximization fitting.
niter=200: Maximum number of iterations to use.
nrepeats=4: Number of steps at or below tol required before the GTM is considered converged.
representation=:means: Method to apply to the fitted responsibility matrix. One of (:means, :modes).
Operations
transform(mach, X): returns, for each data point, the latent coordinates given by either the mean or the mode of its latent node responsibilities (as selected by representation). This can be used as a two-dimensional representation of the original dataset X.
Fitted parameters
The fields of fitted_params(mach) are:
gtm: The GenerativeTopographicMap object fit by the GTM model. Contains node coordinates, RBF means, RBF variance, weights, etc.
Report
The fields of report(mach) are:
classes: the index of the node with the highest responsibility (the mode) for each data point in X, interpreted as a class label.
Examples
using MLJ
gtm = @load GTM pkg=GenerativeTopographicMapping
model = gtm()
X, y = make_blobs(100, 10; centers=5) # synthetic data: 100 points, 10 features
mach = machine(model, X) |> fit!
X̃ = transform(mach, X)
rpt = report(mach)
classes = rpt.classes
GenerativeTopographicMapping.GenerativeTopographicMap
— Method
GTM(k, m, σ, Dataset; α=0.0, tol=0.0001, verbose=false)
Initialize hyperparameters for a GTM model.
k: square root of the number of latent nodes
m: square root of the number of RBF centers in latent space
σ: standard deviation for latent space RBF functions
Dataset: dataset to fit the GTM model to. Assumed shape is (n_datapoints, n_features)
α: weight regularization parameter (0.0 means no regularization)
tol: absolute tolerance used during fitting
verbose: set to true for extra print statements
GenerativeTopographicMapping.Posterior
— Method
Posterior(gtm::GenerativeTopographicMap)
Compute a matrix of contributions to posterior probabilities. This is an intermediate result that facilitates computation of the true posterior probabilities given by the responsibility matrix R. The returned size is (n_nodes, n_datapoints). The exp-normalize trick is used for numerical stability.
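The exp-normalize trick mentioned above can be sketched as follows. This is a generic illustration, not the package's code: `exp_normalize` and `logp` are hypothetical names, and `logp` is assumed to hold unnormalized log-probabilities with shape (n_nodes, n_datapoints).

```julia
# Column-wise exp-normalize: a numerically stable way to turn
# log-probabilities into probabilities that sum to 1 per data point.
function exp_normalize(logp::AbstractMatrix{<:Real})
    shifted = logp .- maximum(logp; dims=1)  # largest entry in each column becomes 0
    P = exp.(shifted)                        # all exponents are <= 0, so no overflow
    return P ./ sum(P; dims=1)               # normalize each column
end

# Very negative log-probabilities would underflow to 0 with a naive exp.
logp = [-1000.0 -2.0; -1001.0 -1.0; -1002.0 -3.0]
R = exp_normalize(logp)
```

Subtracting the column maximum leaves the normalized result unchanged because the common factor cancels in the ratio.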
GenerativeTopographicMapping.Responsabilities
— Method
Responsabilities(gtm::GenerativeTopographicMapping, Dataset)
Compute the matrix of responsibilities of each node in X for the data points in Dataset. The returned matrix is of size (n_nodes, n_datapoints).
GenerativeTopographicMapping.estimateLogLikelihood
— Method
estimateLogLikelihood(gtm::GenerativeTopographicMap, P, Dataset)
Compute the log-likelihood of the data given the parameters W and β⁻¹.
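For orientation, the GTM paper defines the log-likelihood as that of an equally weighted mixture of isotropic Gaussians centered on the projected nodes. The sketch below follows that definition under stated assumptions: `gtm_loglikelihood` is a hypothetical helper, `D2` is assumed to hold squared distances with shape (n_nodes, n_datapoints), and `β` is the inverse variance. How the package uses its `P` argument is not shown in this documentation.

```julia
# Log-likelihood of an equally weighted isotropic Gaussian mixture,
# evaluated with a log-sum-exp over nodes for stability.
function gtm_loglikelihood(D2::AbstractMatrix{<:Real}, β::Real, n_features::Integer)
    K, N = size(D2)
    # log N(t_n | y_i, β⁻¹ I) for every node i and data point n
    logp = (n_features / 2) * log(β / (2π)) .- (β / 2) .* D2
    m = maximum(logp; dims=1)
    lse = m .+ log.(sum(exp.(logp .- m); dims=1))  # log Σ_i p(t_n | i)
    return sum(lse) - N * log(K)                   # mixing weight 1/K per node
end

D2 = [0.0 1.0; 1.0 0.0]
ll = gtm_loglikelihood(D2, 1.0, 2)
```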
GenerativeTopographicMapping.fit!
— Method
fit!(gtm::GenerativeTopographicMap, Dataset)
Fit an initialized generative topographic map gtm to a dataset Dataset.
GenerativeTopographicMapping.getCoordsMatrix
— Method
getCoordsMatrix(k::Int)
Generate a matrix of k² node coordinates on a regular grid with $x ∈ [-1, 1]$ and $y ∈ [-1, 1]$.
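A minimal sketch of such a grid, assuming k ≥ 2 and a (k², 2) layout with x varying fastest. The helper name `grid_coords` and the row ordering are assumptions for illustration and may differ from the package.

```julia
# Build a (k^2, 2) matrix of points on a regular grid over [-1, 1] × [-1, 1].
function grid_coords(k::Int)
    ticks = range(-1.0, 1.0; length=k)
    X = zeros(k^2, 2)
    row = 1
    for y in ticks, x in ticks   # x is the inner (fastest) loop
        X[row, 1] = x
        X[row, 2] = y
        row += 1
    end
    return X
end

coords = grid_coords(4)   # 16 nodes
```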
GenerativeTopographicMapping.getDMatrix
— Method
getDMatrix(gtm::GenerativeTopographicMap, Dataset)
Compute pairwise distances between the projected Gaussian centers Y and the data points in Dataset. The resulting size is (n_nodes, n_datapoints).
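One common way to compute such a distance matrix is sketched below. It assumes squared Euclidean distances, with Y of shape (n_features, n_nodes) and the data of shape (n_datapoints, n_features). Whether the package stores squared or plain distances is not stated here, and `pairwise_sqdist` is a hypothetical name.

```julia
# Squared Euclidean distances between columns of Y and rows of data,
# using ‖y - t‖² = ‖y‖² + ‖t‖² - 2 y·t to avoid explicit loops.
function pairwise_sqdist(Y::AbstractMatrix, data::AbstractMatrix)
    ynorm = vec(sum(abs2, Y; dims=1))      # (n_nodes,)
    tnorm = vec(sum(abs2, data; dims=2))   # (n_datapoints,)
    D = ynorm .+ tnorm' .- 2 .* (Y' * data')
    return max.(D, 0.0)                    # clamp tiny negatives from round-off
end

Y = [0.0 1.0; 0.0 1.0]                 # two nodes at (0, 0) and (1, 1)
data = [0.0 0.0; 3.0 4.0; 1.0 1.0]     # three data points
D = pairwise_sqdist(Y, data)
```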
GenerativeTopographicMapping.getGMatrix
— Method
getGMatrix(R)
Create the diagonal matrix G from the responsibility matrix R. The returned size is (n_nodes, n_nodes).
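In the GTM paper, each diagonal entry of G is the total responsibility a node takes over all data points. A short sketch under that definition (the package's implementation may differ in storage type):

```julia
using LinearAlgebra

# R has shape (n_nodes, n_datapoints); G_ii = Σ_n R_in.
R = [0.2 0.5; 0.8 0.5]
G = Diagonal(vec(sum(R; dims=2)))
```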
GenerativeTopographicMapping.getUMatrix
— Method
getUMatrix(Dataset)
Perform PCA on the Dataset and return a matrix U containing the first two principal components (the two leading eigenvectors of the data covariance matrix), together with the variance of the third principal component. The returned matrix U is of size (n_features, 2).
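A sketch of this PCA step using only the standard library, assuming at least three features. `pca_init` is a hypothetical name, and whether the package scales the columns of U (for example by the square roots of the eigenvalues) is not stated here.

```julia
using LinearAlgebra, Statistics

# Return the two leading eigenvectors of the covariance matrix and
# the variance (eigenvalue) of the third principal component.
function pca_init(data::AbstractMatrix)
    C = cov(data)                          # rows are observations
    F = eigen(Symmetric(C))
    order = sortperm(F.values; rev=true)   # largest variance first
    U = F.vectors[:, order[1:2]]           # (n_features, 2)
    third_var = F.values[order[3]]
    return U, third_var
end

data = [sin(i * j) + 0.1 * j for i in 1:50, j in 1:4]   # toy data, 50 × 4
U, third_var = pca_init(data)
```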
GenerativeTopographicMapping.getYMatrix
— Method
getYMatrix(gtm::GenerativeTopographicMap)
Compute the Gaussian centers in data space via Y = W*Φ'. The returned size is (n_features, n_nodes).
GenerativeTopographicMapping.get_means
— Method
get_means(R, X)
Compute the responsibility-weighted mean node position for each data point: $⟨x|tⱼ, W, β⟩ = Σᵢ Rᵢⱼ xᵢ$
GenerativeTopographicMapping.get_modes
— Method
get_modes(R, X)
Compute the node with the highest responsibility (the mode) for each data point.
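The two representations can be sketched as follows, assuming R has shape (n_nodes, n_datapoints) with columns summing to 1 and X holds node coordinates with shape (n_nodes, 2). `weighted_means` and `mode_nodes` are hypothetical names, and the return shapes are assumptions.

```julia
# Mean representation: responsibility-weighted average of node coordinates.
weighted_means(R, X) = R' * X          # (n_datapoints, 2)

# Mode representation: index of the most responsible node per data point.
mode_nodes(R) = [argmax(col) for col in eachcol(R)]

R = [0.9 0.1; 0.1 0.9]
X = [-1.0 -1.0; 1.0 1.0]
means = weighted_means(R, X)
modes = mode_nodes(R)
mode_coords = X[modes, :]              # coordinates of the mode nodes
```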
GenerativeTopographicMapping.getΦMatrix
— Method
getΦMatrix(X, M, σ²)
Given a matrix of latent node coordinates X, RBF mean coordinates M, and variance σ², return a matrix Φ of dimension (n_nodes, n_rbf_centers+1). The final column is set to 1.0 to include a bias offset in addition to the RBFs.
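A sketch of such a design matrix, assuming Gaussian basis functions of the form exp(-d²/(2σ²)). `rbf_design` is a hypothetical name, and the exact basis normalization used by the package is not stated here.

```julia
# Φ[i, j] = exp(-‖X[i, :] - M[j, :]‖² / (2σ²)), plus a final bias column of ones.
function rbf_design(X::AbstractMatrix, M::AbstractMatrix, σ²::Real)
    n_nodes, n_rbf = size(X, 1), size(M, 1)
    Φ = ones(n_nodes, n_rbf + 1)       # last column stays 1.0 (bias)
    for j in 1:n_rbf, i in 1:n_nodes
        d2 = sum(abs2, X[i, :] .- M[j, :])
        Φ[i, j] = exp(-d2 / (2 * σ²))
    end
    return Φ
end

X = [0.0 0.0; 1.0 0.0]     # two latent nodes
M = [0.0 0.0]              # one RBF center (1 × 2 matrix)
Φ = rbf_design(X, M, 0.5)
```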
GenerativeTopographicMapping.initWMatrix
— Method
initWMatrix(X, Φ, U)
Initialize the parameter matrix W. Initial weights are chosen so that WΦ' reproduces the PCA projections, that is, WΦ' ≈ UX'.
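One way to obtain such weights is a least-squares solve, sketched here under the shapes given in this documentation: X is (n_nodes, 2), Φ is (n_nodes, n_rbf_centers+1), and U is (n_features, 2). `init_weights` is a hypothetical name, and the package may solve the system differently.

```julia
using LinearAlgebra

# W Φ' ≈ U X' is equivalent to Φ W' ≈ X U', a standard least-squares problem.
function init_weights(X::AbstractMatrix, Φ::AbstractMatrix, U::AbstractMatrix)
    Wt = Φ \ (X * U')          # (n_rbf_centers+1, n_features)
    return permutedims(Wt)     # (n_features, n_rbf_centers+1)
end

X = [-1.0 -1.0; 1.0 -1.0; -1.0 1.0; 1.0 1.0]
Φ = hcat(X, ones(4))           # toy linear basis plus bias, for illustration only
U = [1.0 0.0; 0.0 1.0; 0.5 0.5]
W = init_weights(X, Φ, U)
Y = W * Φ'                     # projected centers, (n_features, n_nodes)
```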
GenerativeTopographicMapping.initializeVariance
— Method
initializeVariance(σ::Float64, M::)
Initialize the RBF variance by combining the supplied standard deviation with the minimum distance between RBF means.
GenerativeTopographicMapping.initβ⁻¹
— Method
initβ⁻¹(β⁻¹, Y)
Initialize β⁻¹ using the first guess for β⁻¹ (from the variance of the third principal component) and the mean distance between projected RBF centers in data space.
GenerativeTopographicMapping.updateBeta!
— Method
updateBeta(R, D)
GenerativeTopographicMapping.updateW!
— Method
updateW!(gtm::GenerativeTopographicMap, Dataset)
Update the model weights using the responsibility matrix.
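For reference, the weight update in the GTM paper solves the linear system (Φ'GΦ + (α/β)I) W' = Φ'RT, where T is the data matrix. The sketch below follows that equation. `m_step_weights` is a hypothetical helper that returns a new matrix rather than mutating a model, and the way the package combines α and β may differ.

```julia
using LinearAlgebra

# Φ: (n_nodes, n_basis), R: (n_nodes, n_datapoints), T: (n_datapoints, n_features)
function m_step_weights(Φ, R, T; α=0.0, β=1.0)
    G = Diagonal(vec(sum(R; dims=2)))    # total responsibility per node
    A = Φ' * G * Φ + (α / β) * I         # regularized normal matrix
    B = Φ' * (R * T)
    return permutedims(A \ B)            # W with shape (n_features, n_basis)
end

# With hard assignments and an identity basis, each column of W is
# the mean of the data points assigned to that node.
Φ = Matrix{Float64}(I, 2, 2)
R = [1.0 1.0 0.0; 0.0 0.0 1.0]
T = [0.0 0.0; 2.0 2.0; 5.0 5.0]
W = m_step_weights(Φ, R, T)
```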