Base type for all RBMs. Takes type parameters:

  • T - type of RBM parameters (weights, biases, etc.), input and output. By default, Float64 is used
  • V - type of visible units
  • H - type of hidden units

Distribution with a single possible value. Used e.g. during sampling to provide a stable result equal to the provided means:

sample(Degenerate, means) = means


Restricted Boltzmann Machine, parametrized by element type T, visible unit type V and hidden unit type H.


Construct RBM. Parameters:

  • T - type of RBM parameters (e.g. weights and biases; by default, Float64)
  • V - type of visible units
  • H - type of hidden units
  • n_vis - number of visible units
  • n_hid - number of hidden units

Optional parameters:

  • sigma - variance to use during parameter initialization

Given a trained RBM and a sample of visible data, generate similar items.


Pass data X through a trained RBM to obtain a compressed representation.


Get weight matrix of a trained RBM. Options:

  • transpose - boolean, whether to transpose weight matrix (convenient) or not (efficient). Default: true

Fit RBM to data X. Options that can be provided in the opts dictionary:

  • :n_epochs - number of full loops over data (default: 10)
  • :batch_size - size of mini-batches to use (default: 100)
  • :randomize - boolean, whether to shuffle batches or not (default: false)
  • :gradient - function to use for calculating parameter gradients (default: gradient_classic)
  • :update - function to use to update weights using calculated gradient (default: update_classic!)
  • :scorer - function to calculate how good the model is at the moment (default: pseudo_likelihood)
  • :reporter - type for reporting intermediate results using report() function (default: TextReporter)

Each function can additionally take other options, see their docstrings/code for details.

NOTE: this function is incremental, so one can, for example, run it for 10 epochs, then inspect the model, then run it for 10 more epochs and check the difference.
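The training loop behind fit can be sketched as follows. This is an illustrative Python/numpy sketch, not the library's Julia implementation; the gradient, update, and scorer callables mirror the :gradient, :update, and :scorer options, and all names here are hypothetical:

```python
import numpy as np

def fit_sketch(X, theta, n_epochs=10, batch_size=100, randomize=False,
               gradient=None, update=None, scorer=None):
    """Illustrative mini-batch loop mirroring fit's options.

    X is (n_features, n_obs): observations are columns.
    theta is a mutable parameter container updated in place by `update`.
    """
    n_obs = X.shape[1]
    scores = []
    for epoch in range(n_epochs):
        idx = np.arange(n_obs)
        if randomize:
            np.random.shuffle(idx)                   # :randomize option
        for start in range(0, n_obs, batch_size):    # :batch_size option
            batch = X[:, idx[start:start + batch_size]]
            dtheta = gradient(theta, batch)          # :gradient option
            update(theta, dtheta)                    # :update option
        scores.append(scorer(theta, X))              # :scorer option
    return scores
```

Because the loop only reads and updates `theta`, calling it again continues training from the current parameters, which is the incremental behavior described above.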


The TestReporter uses the ratio test (D'Alembert's criterion) to monitor convergence. This is helpful for automating tests that confirm that a given RBM is learning something.

NOTE: currently we just store the calculated ratios from each epoch, but we could probably do an online mean calculation instead.


Contrastive divergence sampler. Options:

  • n_gibbs - number of Gibbs sampling loops (default: 1)
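For a Bernoulli-Bernoulli RBM, each Gibbs loop alternates sampling the hidden units given the visibles and then the visibles given the hiddens; n_gibbs controls how many such loops are run starting from the data. A minimal numpy sketch (W is the (n_hid, n_vis) weight matrix, b and c the visible and hidden biases; all names are illustrative, not the library's API):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(vis, W, b, c, rng):
    """One visible -> hidden -> visible Gibbs transition."""
    h_prob = sigmoid(W @ vis + c)                  # P(h = 1 | v)
    h = (rng.random(h_prob.shape) < h_prob) * 1.0  # sample hidden units
    v_prob = sigmoid(W.T @ h + b)                  # P(v = 1 | h)
    v = (rng.random(v_prob.shape) < v_prob) * 1.0  # sample visible units
    return v, h

def contdiv_sketch(vis, W, b, c, n_gibbs=1, rng=None):
    """Contrastive divergence: start from the data and run n_gibbs
    Gibbs loops to obtain the 'negative' sample."""
    if rng is None:
        rng = np.random.default_rng(0)
    v = vis
    for _ in range(n_gibbs):
        v, h = gibbs_step(v, W, b, c, rng)
    return v, h
```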

Return true or false according to whether the mean ratio is less than 1.
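The check itself is simple: collect the ratio of each epoch's score to the previous one, and report convergence when the mean ratio is below 1, i.e. the scores are shrinking in magnitude (D'Alembert's criterion). A hypothetical Python sketch of the idea:

```python
def mean_ratio(scores):
    """Mean of successive score ratios |s[k+1] / s[k]|."""
    ratios = [abs(b / a) for a, b in zip(scores, scores[1:]) if a != 0]
    return sum(ratios) / len(ratios)

def is_converging(scores):
    """True when the mean ratio is below 1 (scores shrinking in magnitude)."""
    return mean_ratio(scores) < 1.0
```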


Generates synthetic random datasets with several modifiable properties.


  • T::Type - the type (or precision) of the resulting matrix
  • n_features::Int - the number of features in each observation


  • n_classes::Int - the number of unique classes or categories in the resulting dataset. (default=10)
  • n_obs::Int - total number of observations in resulting dataset. (default=1000)
  • sparsity::Float64 - specifies the density if a sparse matrix is desired; if less than 0.0, a dense matrix is created. (default=-1.0)
  • binary::Bool - whether or not to round the result to 0.0 or 1.0


  • Mat{T}(nfeatures, nobs) - with the various properties specified.
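The generator described above can be sketched in Python/numpy as follows. This is an illustrative stand-in, not the library's Julia code: the class structure is faked with per-class templates plus noise, and the sparse case is approximated by zeroing entries of a dense matrix (the library would build an actual sparse matrix):

```python
import numpy as np

def generate_dataset(n_features, n_classes=10, n_obs=1000,
                     sparsity=-1.0, binary=False, seed=0):
    """Random (n_features, n_obs) matrix mirroring the options above."""
    rng = np.random.default_rng(seed)
    # one random template per class, plus per-observation noise
    templates = rng.random((n_features, n_classes))
    labels = rng.integers(0, n_classes, n_obs)
    X = templates[:, labels] + 0.1 * rng.standard_normal((n_features, n_obs))
    X = np.clip(X, 0.0, 1.0)
    if 0.0 <= sparsity <= 1.0:
        # keep roughly a `sparsity` fraction of entries nonzero
        X = X * (rng.random(X.shape) < sparsity)
    if binary:
        X = np.round(X)  # binary option: round to 0.0 / 1.0
    return X
```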

Function for calculating gradient of negative log-likelihood of the data. Options:

  • :sampler - sampler to use (default: persistent_contdiv)


  • (dW, db, dc) - tuple of gradients for weights, visible and hidden biases, respectively
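The classic contrastive-divergence approximation of this gradient contrasts hidden-unit statistics on the data with statistics on a model sample drawn by the sampler. A numpy sketch for a Bernoulli RBM (W is (n_hid, n_vis), c the hidden biases; names and layout are illustrative, not the library's API):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def grad_cd(vis, W, c, sampler):
    """CD approximation to the negative log-likelihood gradient.

    vis: (n_vis, n_obs) batch; sampler maps a visible batch to a
    'negative' sample (e.g. the persistent_contdiv option above).
    Returns (dW, db, dc)."""
    h_pos = sigmoid(W @ vis + c[:, None])        # hidden probs on data
    v_neg = sampler(vis)                         # model ('negative') sample
    h_neg = sigmoid(W @ v_neg + c[:, None])      # hidden probs on sample
    n = vis.shape[1]
    dW = (h_pos @ vis.T - h_neg @ v_neg.T) / n   # weights gradient
    db = (vis - v_neg).mean(axis=1)              # visible bias gradient
    dc = (h_pos - h_neg).mean(axis=1)            # hidden bias gradient
    return dW, db, dc
```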

Computes and adds a new ratio to the reporter. If log is set to true in the reporter, a message containing the score, time and mean ratio will be printed.


Runs simple smoke tests on the provided RBM.


  • rbm::AbstractRBM - the RBM to test


  • opts::Dict - an alternate context to use when calling fit. (default=DEFAULT_CONTEXT)
  • n_obs::Int - the number of observations to generate for the synthetic datasets. (default=1000)
  • debug::Bool - whether or not to print each epoch. (default=false) Only applies if the reporter in ctx is a TestReporter.

NOTE: only dense arrays are used for the dataset because the conditional RBM doesn't support sparse ones yet.


tofinite! takes an array and:

  1. turns all NaNs into zeros
  2. turns all Infs and -Infs into the largest and smallest representable values, respectively
  3. turns all zeros into the smallest representable non-zero value, if nozeros is true
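The same three steps can be sketched in Python/numpy (this version returns a copy rather than mutating in place like the Julia tofinite!):

```python
import numpy as np

def tofinite(a, nozeros=False):
    """Replace non-finite entries with finite stand-ins."""
    a = np.array(a, dtype=float)
    info = np.finfo(a.dtype)
    a[np.isnan(a)] = 0.0          # 1. NaN -> 0
    a[a == np.inf] = info.max     # 2. Inf  -> largest finite value
    a[a == -np.inf] = info.min    #    -Inf -> smallest finite value
    if nozeros:
        a[a == 0.0] = info.tiny   # 3. 0 -> smallest positive normal value
    return a
```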

Update RBM parameters using the provided tuple dtheta = (dW, db, dc) of parameter gradients. Before updating the weights, the following transformations are applied to the gradients:

  • learning rate (see grad_apply_learning_rate! for details)
  • momentum (see grad_apply_momentum! for details)
  • weight decay (see grad_apply_weight_decay! for details)
  • sparsity (see grad_apply_sparsity! for details)
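For the weight matrix, the first three transformations can be sketched as below. This is an illustrative Python version under common conventions (scaled gradient, momentum over the previous update, L2 weight decay); the sparsity term is omitted for brevity, and the names do not match the library's grad_apply_*! functions exactly:

```python
import numpy as np

def update_sketch(W, dW, prev_delta, lr=0.1, momentum=0.9, decay=1e-4):
    """Apply learning rate, momentum, and L2 weight decay to dW,
    then update W. Returns (new_W, delta) where delta is carried
    into the next step as prev_delta."""
    delta = lr * dW                  # learning rate scaling
    delta = delta + momentum * prev_delta   # momentum term
    delta = delta - lr * decay * W   # L2 weight-decay penalty
    return W + delta, delta
```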

Same as the get function, but evaluates default_expr only if needed.


Get an array of size sz from a dict by key. If the element doesn't exist or its size is not equal to sz, create and return a new array using default_expr. If the element exists but is not an array, throw an ArgumentError.


Same as @get, but immediately exits the function and returns default_expr if the key doesn't exist.
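The lazy-default semantics these macros provide can be illustrated in Python terms (a hypothetical helper, shown only to clarify why a macro is used instead of a plain function: dict.get would evaluate its default eagerly):

```python
def get_lazy(d, key, default_fn):
    """Like dict.get, but default_fn is called only when key is missing,
    so an expensive default expression is never evaluated needlessly."""
    if key in d:
        return d[key]
    return default_fn()
```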


If loaded twice without changes, evaluate the expression only the first time. This is useful for reloading code in the REPL. For example, the following code will produce an invalid redefinition error if loaded twice:

type Point{T}
    x::T
    y::T
end

Wrapped in @runonce, however, the code reloads fine:

@runonce type Point{T}
    x::T
    y::T
end

@runonce doesn't have any effect on the expression itself.