BoltzmannMachines.AbstractOptimizer
— TypeThe AbstractOptimizer interface makes it possible to specify optimization procedures. It consists of three methods:
initialized(optimizer, bm): May be used for creating an optimizer that is specifically initialized for the Boltzmann machine bm. In particular, it may be used to allocate reusable space for the gradient. The default implementation simply returns the unmodified optimizer.
computegradient!(optimizer, v, vmodel, h, hmodel, rbm) or computegradient!(optimizer, meanfieldparticles, gibbsparticles, dbm) needs to be implemented for computing the gradient given the samples from the positive and negative phase.
updateparameters!(bm, optimizer) needs to be specified for taking the gradient step. The default implementation for RBMs expects the fields learningrate and gradient and adds learningrate * gradient to the given RBM.
BoltzmannMachines.AbstractRBM
— TypeAbstract supertype for all RBMs
BoltzmannMachines.AbstractTrainLayer
— TypeAbstract supertype for layerwise training specification. May be specifications for a normal RBM layer (see TrainLayer
) or multiple combined specifications for a partitioned layer (see TrainPartitionedLayer
).
BoltzmannMachines.AbstractXBernoulliRBM
— TypeAbstract super type for RBMs with binary and Bernoulli distributed hidden nodes.
BoltzmannMachines.BernoulliGaussianRBM
— TypeBernoulliGaussianRBM(weights, visbias, hidbias)
Encapsulates the parameters of an RBM with Bernoulli distributed visible nodes and Gaussian distributed hidden nodes. The standard deviation of the Gaussian distribution is 1.
BoltzmannMachines.BernoulliRBM
— TypeBernoulliRBM(weights, visbias, hidbias)
Encapsulates the parameters of an RBM with Bernoulli distributed nodes.
weights: matrix of weights with size (number of visible nodes, number of hidden nodes)
visbias: bias vector for visible nodes
hidbias: bias vector for hidden nodes
BoltzmannMachines.Binomial2BernoulliRBM
— TypeBinomial2BernoulliRBM(weights, visbias, hidbias)
Encapsulates the parameters of an RBM with 0/1/2-valued, Binomial (n=2) distributed visible nodes, and Bernoulli distributed hidden nodes. This model is equivalent to a BernoulliRBM in which every two visible nodes are connected with the same weights to each hidden node. The states (0,0) / (1,0) / (0,1) / (1,1) of the visible nodes connected with the same weights translate as states 0 / 1 / 1 / 2 in the Binomial2BernoulliRBM.
BoltzmannMachines.DataDict
— TypeA dictionary containing names of data sets as keys and the data sets (matrices with samples in rows) as values.
BoltzmannMachines.GaussianBernoulliRBM
— TypeGaussianBernoulliRBM(weights, visbias, hidbias, sd)
Encapsulates the parameters of an RBM with Gaussian distributed visible nodes and Bernoulli distributed hidden nodes.
BoltzmannMachines.GaussianBernoulliRBM2
— TypeGaussianBernoulliRBM2(weights, visbias, hidbias, sd)
Encapsulates the parameters of an RBM with Gaussian distributed visible nodes and Bernoulli distributed hidden nodes with the alternative energy formula proposed by KyungHyun Cho.
BoltzmannMachines.IntensityTransformation
— TypeEncapsulates all data needed to transform a vector of values into the interval [0.0, 1.0] by a linear, monotone transformation.
See intensities_encode
, intensities_decode
for the usage.
BoltzmannMachines.LoglikelihoodOptimizer
— TypeImplements the AbstractOptimizer
interface for optimizing the loglikelihood with stochastic gradient descent.
BoltzmannMachines.Monitor
— TypeA vector for collecting MonitoringItem
s during training.
BoltzmannMachines.MonitoringItem
— TypeEncapsulates the value of an evaluation calculated in one training epoch. If the evaluation depends on a dataset, the dataset's name can be specified also.
BoltzmannMachines.MultivariateBernoulliDistribution
— TypeMultivariateBernoulliDistribution(bm)
Calculates and stores the probabilities for all possible combinations of a multivariate Bernoulli distribution defined by a Boltzmann machine model with Bernoulli distributed visible nodes. Can be used for sampling from this distribution, see samples
.
BoltzmannMachines.NoRBM
— TypeSingleton placeholder for AbstractRBMs.
BoltzmannMachines.Particles
— TypeParticles
are an array of matrices. The i'th matrix contains in each row the vector of states of the nodes of the i'th layer of an RBM or a DBM. The set of rows with the same index defines an activation state in a Boltzmann Machine. Therefore, the size of the i'th matrix is (number of samples/particles, number of nodes in layer i).
BoltzmannMachines.PartitionedBernoulliDBM
— TypeA DBM with only Bernoulli distributed nodes which may contain partitioned layers.
BoltzmannMachines.PartitionedRBM
— TypePartitionedRBM(rbms)
Encapsulates several (parallel) AbstractRBMs that form one partitioned RBM. The nodes of the parallel RBMs are not connected between the RBMs.
BoltzmannMachines.StackedOptimizer
— TypeStackedOptimizer(optimizers)
Can be used for optimizing a stack of RBMs / a DBM by using the given vector of optimizers
(one for each RBM). For more information about the concept of optimizers, see AbstractOptimizer
.
BoltzmannMachines.TrainLayer
— MethodSpecify parameters for training one RBM-layer in a DBM.
Optional keyword arguments:
- The optional keyword arguments rbmtype, nhidden, epochs, learningrate/learningrates, sdlearningrate/sdlearningrates, categories, batchsize, pcd, cdsteps, startrbm and optimizer/optimizers are passed to fitrbm. For a detailed description, see there. If a negative value is specified for learningrate or epochs, this indicates that a corresponding default value should be used (a parameter defined by the call to stackrbms).
- monitoring: also like in fitrbm, but may take a DataDict as third argument (see function stackrbms and its argument monitoringdata).
- nvisible: Number of visible units in the RBM. Only relevant for partitioning. This parameter is derived as far as possible by stackrbms. For MultimodalDBMs with a partitioned first layer, it is necessary to specify the number of visible nodes for all but at most one partition in the input layer.
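Example (a hedged sketch of specifying per-layer pre-training parameters via the pretraining argument of fitdbm; all values are illustrative):
using BoltzmannMachines
x = barsandstripes(100, 16)
# Layerwise pre-training parameters, one TrainLayer per RBM layer.
# Values not set here fall back to the defaults passed to `fitdbm`.
dbm = fitdbm(x;
    pretraining = [TrainLayer(nhidden = 8, epochs = 30),
                   TrainLayer(nhidden = 4)],
    epochs = 15)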
BoltzmannMachines.TrainPartitionedLayer
— TypeEncapsulates a vector of TrainLayer
objects for training a partitioned layer.
BoltzmannMachines.aislogimpweights
— Methodaislogimpweights(rbm; ...)
Computes the logarithmised importance weights for estimating the ratio of the partition functions of the given rbm
to the RBM with zero weights, but same visible and hidden bias as the rbm
. This function implements the Annealed Importance Sampling algorithm (AIS) as described in section 4.1.3 of [Salakhutdinov, 2008].
Optional keyword arguments (for all types of Boltzmann Machines):
ntemperatures: Number of temperatures for annealing from the starting model to the target model, defaults to 100
temperatures: Vector of temperatures. By default ntemperatures ascending numbers, equally spaced from 0.0 to 1.0
nparticles: Number of parallel chains and calculated weights, defaults to 100
burnin: Number of steps to sample for the Gibbs transition between models
BoltzmannMachines.aislogimpweights
— Methodaislogimpweights(dbm; ...)
Computes the logarithmised importance weights in the Annealed Importance Sampling algorithm (AIS) for estimating the ratio of the partition functions of the given DBM dbm
to the base-rate DBM with all weights being zero and all biases equal to the biases of the dbm
.
Implements algorithm 4 in [Salakhutdinov+Hinton, 2012]. For DBMs with Bernoulli-distributed nodes only (i. e. here DBMs of type PartitionedBernoulliDBM
), it is possible to calculate the importance weights by summing out either the even layers (h1, h3, ...) or the odd layers (v, h2, h4, ...). In the first case, the nodes' activations in the odd layers are used to calculate the probability ratios, in the second case the even layers are used. If dbm
is of type PartitionedBernoulliDBM
, the optional keyword argument sumout
can be used to choose by specifying the values :odd
(default) or :even
. In the case of MultimodalDBM
s, it is not possible to choose and the second case applies there.
BoltzmannMachines.aislogimpweights
— Methodaislogimpweights(rbm1, rbm2; ...)
Computes the logarithmised importance weights for estimating the log-ratio log(Z2/Z1) for the partition functions Z1 and Z2 of rbm1
and rbm2
, respectively. Implements the procedure described in section 4.1.2 of [Salakhutdinov, 2008]. This requires that rbm1
and rbm2
are of the same type and have the same number of visible units.
BoltzmannMachines.aisprecision
— Functionaisprecision(logr, aissd, sdrange)
Returns the differences of the estimated logratio r
to the lower and upper bound of the range defined by the multiple sdrange
of the standard deviation of the ratio's estimator aissd
.
BoltzmannMachines.aisprecision
— Functionaisprecision(logimpweights, sdrange)
BoltzmannMachines.aisstandarddeviation
— MethodComputes the standard deviation of the AIS estimator (not logarithmised) (eq 4.10 in [Salakhutdinov+Hinton, 2012]) given the logarithmised importance weights.
BoltzmannMachines.aisupdatelogimpweights!
— MethodUpdates the logarithmized importance weights logimpweights
in AIS by adding the log ratio of unnormalized probabilities of the states of the odd layers in the PartitionedBernoulliDBM dbm
. The activation states of the DBM's nodes are given by the particles
. For performance reasons, the biases are specified separately.
BoltzmannMachines.akaikeinformationcriterion
— Methodakaikeinformationcriterion(bm, loglikelihood)
Calculates the Akaike information criterion for a Boltzmann Machine, given its loglikelihood
.
BoltzmannMachines.barsandstripes
— Methodbarsandstripes(nsamples, nvariables)
Generates a test data set. To see the structure in the data set, run e. g. reshape(barsandstripes(1, 16), 4, 4)
a few times.
Example from: MacKay, D. (2003). Information Theory, Inference, and Learning Algorithms
BoltzmannMachines.batchparallelized
— Methodbatchparallelized(f, n, op)
Distributes the work for executing the function f
n
times on all the available workers and reduces the results with the operator op
. f
is a function that receives a number (of tasks) and executes that many tasks.
Example:
batchparallelized(n -> aislogimpweights(dbm; nparticles = n), 100, vcat)
BoltzmannMachines.bayesianinformationcriterion
— Methodbayesianinformationcriterion(bm, nvariables, loglikelihood)
Calculates the Bayesian information criterion for a Boltzmann machine, given its loglikelihood and the number of samples.
BoltzmannMachines.bernoulliloglikelihoodbaserate
— Methodbernoulliloglikelihoodbaserate(nvariables)
Calculates the log-likelihood for a random sample in the "base-rate" BM with all parameters being zero and thus all visible units being independent and Bernoulli distributed.
BoltzmannMachines.bernoulliloglikelihoodbaserate
— Methodbernoulliloglikelihoodbaserate(x)
Calculates the log-likelihood for the data set x
in the "base-rate" BM with all weights being zero and visible bias set to the empirical probability of the samples' components in x
being 1.
BoltzmannMachines.blocksinnoise
— Methodblocksinnoise(nsamples, nvariables; ...)
Produces an artificial data set where there are sequences of consecutive 1s in half of the binary samples. The samples are labeled according to whether they belong to one of the nblocks blocks. The samples are otherwise randomly generated with Bernoulli distributed noise. The first return value is a matrix that has nsamples
rows and nvariables
columns containing zeros and ones as values. The second return value is a vector of labels.
Optional named arguments:
noise: Bernoulli distributed noise (probability for 1s)
blocklen: length of sequences of consecutive variables set to 1 in subgroups of samples
nblocks: number of different locations for sequences, i.e. the number of different subgroups with sequences of 1s
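Example (a brief usage sketch; the argument values are only illustrative):
using BoltzmannMachines
# 500 binary samples with 20 variables; half of the samples contain a block
# of 5 consecutive 1s at one of 2 possible locations, the rest is noise.
x, labels = blocksinnoise(500, 20; blocklen = 5, nblocks = 2, noise = 0.3)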
BoltzmannMachines.combined_monitoring
— MethodReturn a new function that does all the monitoring using the monitoring
function (or functions) and the monitoringdata
and stores the result in the given monitor
.
BoltzmannMachines.combinedbiases
— Methodcombinedbiases(dbm)
Returns a vector containing in the i'th element the bias vector for the i'th layer of the dbm
. For intermediate layers, visible and hidden biases are combined to a single bias vector.
BoltzmannMachines.computegradient!
— Methodcomputegradient!(optimizer, v, vmodel, h, hmodel, rbm)
Computes the gradient of the RBM rbm
given the hidden activation h
induced by the sample v
and the vectors vmodel
and hmodel
generated by sampling from the model. The result is stored in the optimizer
in such a way that it can be applied by a call to updateparameters!
. There is no return value.
For RBMs (excluding PartitionedRBMs), this means saving the gradient in an RBM of the same type in the field optimizer.gradient
.
BoltzmannMachines.converttomostspecifictype
— MethodConverts a vector to a vector of the most specific type that all elements share as common supertype.
BoltzmannMachines.copyannealed!
— Methodcopyannealed!(annealedrbm, rbm, temperature)
Copies all parameters that are to be annealed from the RBM rbm
to the RBM annealedrbm
and anneals them with the given temperature
.
BoltzmannMachines.correlations
— Methodcorrelations(datadict)
Creates and returns a dictionary with the same keys as the given datadict
. The values of the returned dictionary are the correlations of the samples in the datasets given as values in the datadict
.
BoltzmannMachines.crossvalidation
— Methodcrossvalidation(x, monitoredfit; ...)
Performs k-fold cross-validation, given
- the data set x and
- monitoredfit: a function that fits and evaluates a model. As arguments it must accept:
  - a training data set
  - a DataDict containing the evaluation data.
The return values of the calls to the monitoredfit
function are concatenated with vcat
. If the monitoredfit function returns Monitor
objects, crossvalidation
returns a combined Monitor
object that can be displayed by creating a cross-validation plot via BoltzmannMachinesPlots.crossvalidationplot
.
Optional named argument:
kfold: specifies the k in "k-fold" (defaults to 10).

crossvalidation(x, monitoredfit, pars; ...)
If additionally a vector of parameters pars is given, monitoredfit also expects an additional parameter from the parameter set.
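Example (a hedged sketch of a possible monitoredfit function built on monitored_fitrbm; the hyperparameter values are illustrative):
using BoltzmannMachines
x = barsandstripes(100, 16)
# `monitoredfit` receives a training fold and a DataDict with evaluation data
# and must return something that can be concatenated with `vcat`, here a Monitor.
monitoredfit = (xtrain, evaluationdata) -> begin
    monitor, rbm = monitored_fitrbm(xtrain;
        monitoring = monitorreconstructionerror!,
        monitoringdata = evaluationdata,
        nhidden = 8, epochs = 20)
    monitor
end
monitor = crossvalidation(x, monitoredfit; kfold = 5)
# BoltzmannMachinesPlots.crossvalidationplot(monitor) # for inspecting the result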
BoltzmannMachines.crossvalidationargs
— Methodcrossvalidationargs(x, pars...; )
Returns a tuple of argument vectors containing the parameters for a function such as the monitoredfit
argument in crossvalidation
.
Usage example: map(monitoredfit, crossvalidationargs(x)...)
Optional named argument:
kfold
: seecrossvalidation
.
BoltzmannMachines.curvebundles
— Methodcurvebundles(...)
Generates an example dataset that can be visualized as bundles of trend curves with added noise. Additional binary columns with labels may be added.
Optional named arguments:
nbundles: number of bundles
nperbundle: number of sequences per bundle
nvariables: number of variables in the sequences
noisesd: standard deviation of the noise added on all sequences
addlabels: add leading columns to the resulting dataset, specifying the membership to a bundle
pbreak: probability that an intermediate point in a sequence is a breakpoint, defaults to 0.2
breakval: a function that expects no input and generates a single (random) value for a defining point of a piecewise linear sequence. Defaults to rand.
Example:
To quickly grasp the idea, plot generated samples against the variable index:
x = BMs.curvebundles(nvariables = 10, nbundles = 3,
nperbundle = 4, noisesd = 0.03,
addlabels = true)
BoltzmannMachinesPlots.plotcurvebundles(x)
BoltzmannMachines.empiricalloglikelihood
— Methodempiricalloglikelihood(x, xgen)
empiricalloglikelihood(bm, x, nparticles)
empiricalloglikelihood(bm, x, nparticles, burnin)
Computes the mean empirical loglikelihood for the data set x
. The probability of a sample is estimated to be the empirical probability of the sample in a dataset generated by the model. This data set can be given as xgen
or it is generated by running a Gibbs sampler with nparticles
for burnin
steps (default 5) in the Boltzmann Machine bm
. Throws an error if a sample in x
is not contained in the generated data set.
BoltzmannMachines.emptyfunc
— MethodA function accepting everything, doing nothing. Usable as default argument for functions as arguments.
BoltzmannMachines.energy
— Methodenergy(rbm, v, h)
Computes the energy of the configuration of the visible nodes v
and the hidden nodes h
, specified as vectors, in the rbm
.
energyzerohiddens(rbm, v)
Computes the energy for the visible activations v
in the RBM rbm
, if all hidden nodes have zero activation, i. e. yields the same as energy(rbm, v, zeros(rbm.hidbias))
.
BoltzmannMachines.exactloglikelihood
— Functionexactloglikelihood(rbm, x)
Computes the mean log-likelihood for the given dataset x
and the RBM rbm
exactly. The log of the partition function is computed exactly by exactlogpartitionfunction(rbm)
. Besides that, the function simply calls loglikelihood(rbm, x)
.
BoltzmannMachines.exactloglikelihood
— Functionexactloglikelihood(dbm, x)
exactloglikelihood(dbm, x, logz)
Computes the mean log-likelihood for the given dataset x
and the DBM dbm
exactly. If the value of the log of the partition function of the dbm
is not supplied as argument logz
, it will be computed by exactlogpartitionfunction(dbm)
.
BoltzmannMachines.exactlogpartitionfunction
— Methodexactlogpartitionfunction(bgrbm)
Calculates the log of the partition function of the BernoulliGaussianRBM bgrbm
exactly. The execution time grows exponentially with the number of visible nodes.
BoltzmannMachines.exactlogpartitionfunction
— Methodexactlogpartitionfunction(rbm)
Calculates the log of the partition function of the BernoulliRBM rbm
exactly. The execution time grows exponentially with the minimum of (number of visible nodes, number of hidden nodes).
BoltzmannMachines.exactlogpartitionfunction
— Methodexactlogpartitionfunction(gbrbm)
Calculates the log of the partition function of the GaussianBernoulliRBM gbrbm
exactly. The execution time grows exponentially with the number of hidden nodes.
BoltzmannMachines.exactlogpartitionfunction
— Methodexactlogpartitionfunction(mdbm)
Calculates the log of the partition function of the MultimodalDBM mdbm
exactly. The execution time grows exponentially with the total number of nodes in hidden layers with odd indexes (i. e. h1, h3, ...).
BoltzmannMachines.exactlogpartitionfunction
— Methodexactlogpartitionfunction(dbm)
Calculates the log of the partition function of the DBM dbm
exactly. If the number of hidden layers is even, the execution time grows exponentially with the total number of nodes in hidden layers with odd indexes (i. e. h1, h3, ...). If the number of hidden layers is odd, the execution time grows exponentially with the minimum of (number of nodes in layers with even index, number of nodes in layers with odd index).
BoltzmannMachines.fitdbm
— Methodfitdbm(x; ...)
Fits a (multimodal) DBM to the data set x
. The procedure consists of two parts: First a stack of RBMs is pretrained in a greedy layerwise manner (see stackrbms(x)
). Then the weights of all layers are jointly trained using the general Boltzmann Machine learning procedure (fine tuning, see traindbm!(dbm,x)
).
Optional keyword arguments (ordered by importance):
nhiddens: vector that defines the number of nodes in the hidden layers of the DBM. The default value specifies two hidden layers with the same size as the visible layer.
epochs: number of training epochs for joint training, defaults to 10
epochspretraining: number of training epochs for pretraining, defaults to epochs
learningrate: learning rate for pretraining. Also used as initial value for the decaying fine tuning learning rate.
learningratepretraining: learning rate for pretraining, defaults to learningrate
learningratefinetuning: initial learning rate for fine tuning. The learning rate for fine tuning is decaying with the number of epochs, starting with the given value for learningratefinetuning or the learningrate. (For more details see traindbm!.)
learningratesfinetuning: The learning rate for fine tuning is by default decaying with the number of epochs, starting with the value of the learningrate. (For more details see traindbm!.) The value of the learning rate for each epoch of fine tuning can be specified via the argument learningratesfinetuning as a vector with an entry for each of the epochs.
learningrates: deprecated, otherwise equivalent to learningratesfinetuning
batchsize: number of samples in mini-batches for pretraining and fine tuning. By default, a batchsize of 1 is used for pretraining. For fine tuning, no mini-batches are used by default, which means that the complete data set is used for calculating the gradient in each epoch.
batchsizepretraining: batchsize for pretraining, defaults to 1
batchsizefinetuning: batchsize for fine tuning. Defaults to the number of samples in the data set, i.e., no mini-batches are used.
nparticles: number of particles used for sampling during joint training of the DBM, default 100
pretraining: The arguments for layerwise pretraining can be specified for each layer individually. This is done via a vector of TrainLayer objects. (For a detailed description of the possible parameters, see help for TrainLayer.) If the number of training epochs and the learning rate are not specified explicitly for a layer, the values of epochspretraining, learningratepretraining and batchsizepretraining are used.
monitoring: Monitoring function accepting a dbm and the number of epochs, returning nothing. Used for monitoring the fine-tuning. See also monitored_fitdbm for a more convenient way of monitoring.
monitoringdatapretraining: a DataDict that contains data used for monitoring the pretraining (see argument monitoringdata of stackrbms).
optimizer/optimizers: an optimizer or a vector of optimizers for each epoch (see AbstractOptimizer) used for fine-tuning.
optimizerpretraining: an optimizer used for pre-training. Defaults to the optimizer.
BoltzmannMachines.fitrbm
— Methodfitrbm(x; ...)
Fits an RBM model to the data set x
, using Stochastic Gradient Descent (SGD) with Contrastive Divergence (CD), and returns it.
Optional keyword arguments (ordered by importance):
rbmtype: the type of the RBM that is to be trained. This must be a subtype of AbstractRBM and defaults to BernoulliRBM.
nhidden: number of hidden units for the returned RBM
epochs: number of training epochs
learningrate/learningrates: The learning rate for the weights and biases can be specified as a single value, used throughout all epochs, or as a vector of learningrates that contains a value for each epoch. Defaults to 0.005.
batchsize: number of samples that are used for making one step in the stochastic gradient descent optimizer algorithm. Default is 1.
pcd: indicating whether Persistent Contrastive Divergence (PCD) is to be used (true, default) or simple CD that initializes the Gibbs chain with the training sample (false)
cdsteps: number of Gibbs sampling steps for (persistent) contrastive divergence, defaults to 1
monitoring: a function that is executed after each training epoch. It takes an RBM and the epoch as arguments. See also monitored_fitrbm for another way of monitoring.
categories: only relevant if rbmtype = Softmax0BernoulliRBM. The number of categories as Int, if all variables have the same number of categories, or as Vector{Int} that contains the number of categories of the i'th categorical variable in the i'th entry.
upfactor, downfactor: If this function is used for pretraining a part of a DBM, it is necessary to multiply the weights of the RBM with factors.
sdlearningrate/sdlearningrates: learning rate(s) for the standard deviation if training a GaussianBernoulliRBM or GaussianBernoulliRBM2. Ignored for other types of RBMs. It usually must be much smaller than the learning rates for the weights. By default it is 0.0, which means that the standard deviation is not learned.
startrbm: start training with the parameters of the given RBM. If this argument is specified, nhidden and rbmtype are ignored.
optimizer/optimizers: an object of type AbstractOptimizer or a vector of them for each epoch. If specified, the optimization is performed as implemented by the given optimizer type. By default, the LoglikelihoodOptimizer with the learningrate/learningrates and sdlearningrate/sdlearningrates is used. For other types of optimizers, the learning rates must be specified in the optimizer. For more information on how to write your own optimizer, see AbstractOptimizer.
See also: monitored_fitrbm
for a convenient monitoring of the training.
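Example (a minimal usage sketch; the hyperparameter values are illustrative):
using BoltzmannMachines
using Random; Random.seed!(0)
x = barsandstripes(200, 16)
# Train a BernoulliRBM with persistent contrastive divergence (the default).
rbm = fitrbm(x; nhidden = 8, epochs = 30, learningrate = 0.005)
# Generate new samples from the fitted model.
xgen = samples(rbm, 10)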
BoltzmannMachines.freeenergy
— Methodfreeenergy(rbm, x)
Computes the average free energy of the samples in the dataset x
for the AbstractRBM rbm
.
BoltzmannMachines.freeenergy
— Methodfreeenergy(rbm, v)
Computes the free energy of the sample v
(a vector) for the rbm
.
BoltzmannMachines.freeenergydiffs
— Methodfreeenergydiffs(rbm1, rbm2, x)
Computes the differences of the free energy for the samples in the dataset x
regarding the RBM models rbm1
and rbm2
. Returns a vector of differences.
BoltzmannMachines.gaussianloglikelihoodbaserate
— Methodgaussianloglikelihoodbaserate(x)
Calculates the mean log-likelihood for the data set x
with all variables and components of the variables being independent and Gaussian distributed. The mean and standard deviation of the i'th variable are estimated by the mean and standard deviation of the values of the i'th component of the sample vectors.
BoltzmannMachines.gibbssample!
— Functiongibbssample!(particles, bm, nsteps)
Performs Gibbs sampling on the particles
in the Boltzmann machine model bm
for nsteps
steps. (See also: Particles
.) When sampling in multimodal deep Boltzmann machines, in-between layers are assumed to contain only Bernoulli-distributed nodes.
BoltzmannMachines.gibbssamplecond!
— Functiongibbssamplecond!(particles, bm, cond, nsteps)
Conditional Gibbs sampling on the particles
in the bm
for nsteps
Gibbs sampling steps.
The variables that are marked in the indexing vector cond
are fixed to the initial values in particles
during sampling. This way, conditional sampling is performed on these variables.
See also: Particles, initparticles

hiddeninput!(h, rbm, v)
Like hiddeninput
, but stores the returned result in h
.
hiddeninput(rbm, v)
Computes the total input of the hidden units in the AbstractRBM rbm
, given the activations of the visible units v
. v
may be a vector or a matrix that contains the samples in its rows.
hiddenpotential!(hh, rbm, vv)
hiddenpotential!(hh, rbm, vv, factor)
Like hiddenpotential
, but stores the returned result in hh
.
hiddenpotential(rbm, v)
hiddenpotential(rbm, v, factor)
Returns the potential for activations of the hidden nodes in the AbstractRBM rbm
, given the activations v
of the visible nodes. v
may be a vector or a matrix that contains the samples in its rows. The potential is a deterministic value to which sampling can be applied to get the activations. In RBMs with Bernoulli distributed hidden units, the potential of the hidden nodes is the vector of probabilities for them to be turned on.
The total input can be scaled with the factor
. This is needed when pretraining the rbm
as part of a DBM.
BoltzmannMachines.initcombination
— MethodReturns a particle for the DBM, initialized with zeros.
BoltzmannMachines.initcombinationoddlayersonly
— Methodinitcombinationoddlayersonly(dbm)
Creates and zero-initializes a particle for layers with odd indexes in the dbm
.
BoltzmannMachines.initialized
— Methodinitialized(optimizer, rbm)
Returns an AbstractOptimizer
similar to the given optimizer
that can be used to optimize the AbstractRBM
rbm
.
BoltzmannMachines.initparticles
— Methodinitparticles(bm, nparticles; biased = false)
Creates particles for Gibbs sampling in a Boltzmann machine bm
. (See also: Particles
)
For Bernoulli distributed nodes, the particles are initialized with Bernoulli(p) distributed values. If biased == false
, p is 0.5, otherwise the results of applying the sigmoid function to the bias values are used as values for the nodes' individual p's.
Gaussian nodes are sampled from a normal distribution if biased == false
. If biased == true
the mean of the Gaussian distribution is shifted by the bias vector and the standard deviation of the nodes is used for sampling.
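Example (a hedged sketch of initializing particles and running Gibbs sampling on them; values are illustrative):
using BoltzmannMachines
x = barsandstripes(100, 16)
dbm = fitdbm(x; nhiddens = [8; 4], epochs = 10)
# 100 randomly initialized particles, then 50 steps of Gibbs sampling.
particles = initparticles(dbm, 100)
gibbssample!(particles, dbm, 50)
# particles[1] now holds 100 sampled visible states, one per row.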
BoltzmannMachines.initrbm
— Functioninitrbm(x, nhidden)
initrbm(x, nhidden, rbmtype)
Creates an RBM with nhidden hidden units and initializes its weights for training on dataset x
. rbmtype
can be a subtype of AbstractRBM
, default is BernoulliRBM
.
BoltzmannMachines.initvisiblebias
— Methodinitvisiblebias(x)
Returns sensible initial values for the visible bias for training an RBM on the data set x
.
BoltzmannMachines.intensities
— Functionintensities(x)
intensities(x, q1)
intensities(x, q1, q2)
Performs a linear, monotone transformation on the data set x
to fit the values into the interval [0.0, 1.0]. For more information see intensities_encode
, intensities_decode
.
BoltzmannMachines.intensities_decode
— Methodintensities_decode(x, its)
Backtransforms the intensity values in the data set x
(values in the interval [0.0, 1.0]) to the range of the original values and returns the new data set or vector. The its argument contains the information about the transformation, as it is returned by intensities_encode.
Note that the range is truncated if the original transformation used other quantiles than 0.0 or 1.0 (minimum and maximum).
Example:
x = randn(5, 4)
xint, its = intensities_encode(x, 0.05)
dbm = fitdbm(xint)
xgen = samples(dbm, 5)
intensities_decode(xgen, its)
BoltzmannMachines.intensities_encode
— Functionintensities_encode(x)
intensities_encode(x, q1)
intensities_encode(x, q1, q2)
Performs a linear, monotone transformation on the data set x
to fit it into the interval [0.0, 1.0]. It returns the transformed data set as a first result and the information to reverse the transformation as a second result. If you are only interested in the transformed values, you can use the function intensities
.
If q1
is specified, all values below or equal to the quantile specified by q1
are mapped to 0.0. All values above or equal to the quantile specified by q2
are mapped to 1.0. q2
defaults to 1 - q1
.
The quantiles are calculated per column/variable.
See also intensities_decode
for the reverse transformation.
BoltzmannMachines.joindbms
— Functionjoindbms(dbms)
joindbms(dbms, visibleindexes)
Joins the DBMs given by the vector dbms
by joining each layer of RBMs. The weights cross-linking the models are initialized with zeros.
If the vector visibleindexes
is specified, it is supposed to contain in the i'th entry an indexing vector that determines the positions in the combined DBM for the visible nodes of the i'th of the dbms
. By default the indexes of the visible nodes are assumed to be consecutive.
BoltzmannMachines.joinrbms
— Methodjoinrbms(rbms)
joinrbms(rbms, visibleindexes)
Joins the given vector of rbms
of the same type to form one RBM of this type and returns the joined RBM. The weights cross-linking the models are initialized with zeros.
BoltzmannMachines.joinvecs
— Functionjoinvecs(vecs, indexes)
Combines the Float-vectors in vecs
into one vector. The indexes
vector must contain in the i'th entry the indexes that the elements of the i'th vector in
vecs are supposed to have in the resulting combined vector.
BoltzmannMachines.joinweights
— Methodjoinweights(rbms)
joinweights(rbms, visibleindexes)
Combines the weight matrices of the RBMs in the vector rbms
into one weight matrix and returns it.
If the vector visibleindexes
is specified, it is supposed to contain in the i'th entry an indexing vector that determines the positions in the combined weight matrix for the visible nodes of the i'th of the rbms
. By default the indexes of the visible nodes are assumed to be consecutive.
BoltzmannMachines.log1pexp
— Methodlog1pexp(x)
Calculates log(1+exp(x)). For sufficiently large values of x, the approximation log(1+exp(x)) ≈ x is used. This is useful to prevent overflow.
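Example (a small illustration of the overflow protection; the results in the comments are approximate):
using BoltzmannMachines
log(1 + exp(800.0))               # Inf, because exp(800.0) overflows
BoltzmannMachines.log1pexp(800.0) # ≈ 800.0, using the approximation for large x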
BoltzmannMachines.loglikelihood
— Functionloglikelihood(rbm, x)
loglikelihood(rbm, x, logz)
Computes the average log-likelihood of an RBM on a given dataset x
. Uses logz
as value for the log of the partition function or estimates the partition function with Annealed Importance Sampling.
BoltzmannMachines.loglikelihood
— Functionloglikelihood(dbm, x; ...)
Estimates the mean log-likelihood of the DBM on the data set x
with Annealed Importance Sampling. This requires a separate run of AIS for each sample.
BoltzmannMachines.loglikelihooddiff
— Methodloglikelihooddiff(rbm1, rbm2, x)
loglikelihooddiff(rbm1, rbm2, x, logzdiff)
loglikelihooddiff(rbm1, rbm2, x, logimpweights)
Computes the difference of the log-likelihood functions of the two RBMs on the data matrix x
, averaged over the samples. For this purpose, the partition function ratio Z2/Z1 is estimated by AIS unless the importance weights are specified by the parameter logimpweights
or the difference in the log partition functions is given by logzdiff
.
The first model is better than the second if the returned value is positive.
BoltzmannMachines.logmeanexp
— MethodPerforms numerically stable computation of the mean on log-scale.
BoltzmannMachines.logpartitionfunction
— Methodlogpartitionfunction(bm; ...)
logpartitionfunction(bm, logr)
Calculates or estimates the log of the partition function of the Boltzmann Machine bm
.
r
is an estimator of the ratio of the bm
's partition function Z to the partition function Z0 of the reference BM with zero weights but same biases as the given bm
. In case of a GaussianBernoulliRBM, the reference model also has the same standard deviation parameter. The estimated partition function of the Boltzmann Machine is Z = r * Z0 with r
being the mean of the importance weights. Therefore, the log of the estimated partition function is log(Z) = log(r) + log(Z0).
If the log of r
is not given as argument logr
, Annealed Importance Sampling (AIS) is performed to get a value for it. In this case, the optional arguments for AIS can be specified (see aislogimpweights
), and the optional boolean argument parallelized
can be used to turn on batch-parallelized computing of the importance weights.
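Example (a hedged sketch of the typical AIS workflow for estimating the partition function and then the log-likelihood; values are illustrative):
using BoltzmannMachines
x = barsandstripes(100, 16)
rbm = fitrbm(x; nhidden = 8, epochs = 20)
# Estimate log(Z) directly via AIS with default parameters ...
logz = logpartitionfunction(rbm)
# ... or reuse explicitly computed importance weights.
logimpweights = aislogimpweights(rbm; ntemperatures = 100, nparticles = 100)
logz2 = logpartitionfunction(rbm, logmeanexp(logimpweights))
# Use the estimate for computing the log-likelihood.
loglik = loglikelihood(rbm, x, logz)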
BoltzmannMachines.logpartitionfunctionzeroweights
— Methodlogpartitionfunctionzeroweights(bm)
Returns the value of the log of the partition function of the Boltzmann Machine that results when one sets the weights of bm
to zero, and leaves the other parameters (biases) unchanged.
BoltzmannMachines.logproblowerbound
— Functionlogproblowerbound(dbm, x; ...)
logproblowerbound(dbm, x, logimpweights; ...)
logproblowerbound(dbm, x, logz; ...)
Estimates the mean of the variational lower bound for the log probability of the DBM on a given dataset x
as described in Equation 38 in [Salakhutdinov, 2015]. The logarithmized partition function can be specified directly as logz
or by giving the logimpweights
from estimating the partition function with the Annealed Importance Sampling algorithm (AIS). (See aislogimpweights
.) If neither logimpweights
nor logz
is given, the partition function will be estimated by AIS with default parameters.
Optional keyword argument:
- The approximate posterior distribution may be given as argument
mu
or is calculated by the mean-field method.
BoltzmannMachines.logsumexp
— MethodPerforms numerically stable summation on log-scale.
BoltzmannMachines.meanfield
— Functionmeanfield(dbm, x)
meanfield(dbm, x, eps)
Computes the mean-field approximation for the data set x
and returns a matrix of particles for the DBM. The number of particles is equal to the number of samples in x
. eps
is the convergence criterion for the fix-point iteration, default 0.001. It is assumed that all nodes in in-between-layers are Bernoulli distributed.
BoltzmannMachines.means
— Methodmeans(datadict)
Creates and returns a dictionary with the same keys as the given datadict
. The values of the returned dictionary are the samples' means in the datadict
.
BoltzmannMachines.monitorcordiff!
— Methodmonitorcordiff!(monitor, rbm, epoch, cordict)
Generates samples and records the distance of their correlation matrix to the correlation matrices for (original) datasets contained in the cordict
.
BoltzmannMachines.monitored_fitdbm
— Methodmonitored_fitdbm(x; ...)
This function performs the same training procedure as fitdbm
, but facilitates monitoring: It fits an DBM model on the data set x
using greedy layerwise pre-training and subsequent fine-tuning and collects all the monitoring results during the training. The monitoring results are stored in a vector of Monitor
s, containing one element for each RBM layer and as last element the monitoring results for fine-tuning. (Monitoring elements from the pre-training of partitioned layers are again vectors, containing one element for each partition.) Both the collected monitoring results and the trained DBM are returned.
See also: monitored_stackrbms
, monitored_traindbm!
Optional keyword arguments:
monitoring: Used for fine-tuning. A monitoring function or a vector of monitoring functions that accept four arguments:
  - a Monitor object, which is used to collect the result of the monitoring function(s)
  - the DBM
  - the epoch
  - the data used for monitoring.
monitoringdata: a DataDict, which contains the data that is used for the monitoring. For the pre-training of the first layer and for fine-tuning, the data is passed directly to the monitoring function(s). For monitoring the pre-training of the higher RBM layers, the data is propagated through the layers below first. By default, the training data x is used for monitoring.
monitoringpretraining: Used for pre-training. A four-argument function like monitoring, but accepts as second argument an RBM. By default there is no monitoring of the pre-training.
monitoringdatapretraining: Monitoring data used only for pre-training. Defaults to monitoringdata.
Other specified keyword arguments are simply handed to fitdbm. For more information, please see the documentation there.
Example:
using Random; Random.seed!(1)
xtrain, xtest = splitdata(barsandstripes(100, 4), 0.5)
monitors, dbm = monitored_fitdbm(xtrain;
monitoringpretraining = monitorreconstructionerror!,
monitoring = monitorlogproblowerbound!,
monitoringdata = DataDict("Training data" => xtrain, "Test data" => xtest),
# some arguments for `fitdbm`:
nhiddens = [4; 3], learningratepretraining = 0.01,
learningrate = 0.05, epochspretraining = 100, epochs = 50)
using BoltzmannMachinesPlots
plotevaluation(monitors[1]) # view monitoring of first RBM
plotevaluation(monitors[2]) # view monitoring of second RBM
plotevaluation(monitors[3]) # view monitoring fine-tuning
BoltzmannMachines.monitored_fitrbm
— Methodmonitored_fitrbm(x; ...)
This function performs the same training procedure as fitrbm
, but facilitates monitoring: It fits an RBM model on the data set x
and collects monitoring results during the training in one Monitor
object. Both the collected monitoring results and the trained RBM are returned.
Optional keyword arguments:
monitoring: A monitoring function or a vector of monitoring functions that accept four arguments:
  - a Monitor object, which is used to collect the result of the monitoring function(s)
  - the RBM
  - the epoch
  - the data used for monitoring.
monitoringdata: a DataDict, which contains the data that is used for monitoring and passed to the monitoring function(s). By default, the training data x is used for monitoring.
Other specified keyword arguments are simply handed to fitrbm. For more information, please see the documentation there.
Example:
using Random; Random.seed!(0)
xtrain, xtest = splitdata(barsandstripes(100, 4), 0.3)
monitor, rbm = monitored_fitrbm(xtrain;
monitoring = [monitorreconstructionerror!, monitorexactloglikelihood!],
monitoringdata = DataDict("Training data" => xtrain, "Test data" => xtest),
# some arguments for `fitrbm`:
nhidden = 10, learningrate = 0.002, epochs = 200)
using BoltzmannMachinesPlots
plotevaluation(monitor, monitorreconstructionerror)
plotevaluation(monitor, monitorexactloglikelihood)
BoltzmannMachines.monitored_stackrbms
— Methodmonitored_stackrbms(x; ...)
This function performs the same training procedure as stackrbms
, but facilitates monitoring: It trains a stack of RBMs using the data set x
as input to the first layer and collects all the monitoring results during the training in a vector of Monitor
s, containing one element for each RBM layer. (Elements for partitioned layers are again vectors, containing one element for each partition.) Both the collected monitoring results and the stack of trained RBMs are returned.
Optional keyword arguments:
monitoring: A monitoring function or a vector of monitoring functions that accept four arguments:
  - a Monitor object, which is used to collect the result of the monitoring function(s)
  - the RBM
  - the epoch
  - the data used for monitoring.
monitoringdata: a DataDict, which contains the data that is used for monitoring. For the first layer, the data is passed directly to the monitoring function(s). For monitoring the training of the higher layers, the data is propagated through the layers below first. By default, the training data x is used for monitoring.
Other specified keyword arguments are simply handed to stackrbms. For more information, please see the documentation there.
Example:
using Random; Random.seed!(0)
xtrain, xtest = splitdata(barsandstripes(100, 4), 0.5)
monitors, rbm = monitored_stackrbms(xtrain;
monitoring = monitorreconstructionerror!,
monitoringdata = DataDict("Training data" => xtrain, "Test data" => xtest),
# some arguments for `stackrbms`:
nhiddens = [4; 3], learningrate = 0.005, epochs = 100)
using BoltzmannMachinesPlots
plotevaluation(monitors[1]) # view monitoring of first RBM
plotevaluation(monitors[2]) # view monitoring of second RBM
BoltzmannMachines.monitored_traindbm!
— Methodmonitored_traindbm!(dbm, x; ...)
This function performs the same training procedure as traindbm!
, but facilitates monitoring: It performs fine-tuning of the given dbm
on the data set x
and collects monitoring results during the training in one Monitor
object. Both the collected monitoring results and the trained dbm
are returned.
Optional keyword arguments:
monitoring: A monitoring function or a vector of monitoring functions that accept four arguments:
  - a Monitor object, which is used to collect the result of the monitoring function(s)
  - the DBM
  - the epoch
  - the data used for monitoring.
monitoringdata: a DataDict, which contains the data that is used for monitoring and passed to the monitoring function(s). By default, the training data x is used for monitoring.
Other specified keyword arguments are simply handed to traindbm!. For more information, please see the documentation there.
Example:
using Random; Random.seed!(0)
xtrain, xtest = splitdata(barsandstripes(100, 4), 0.1)
dbm = stackrbms(xtrain; predbm = true, epochs = 20)
monitor, dbm = monitored_traindbm!(dbm, xtrain;
monitoring = monitorlogproblowerbound!,
monitoringdata = DataDict("Training data" => xtrain, "Test data" => xtest),
# some arguments for `traindbm!`:
epochs = 100, learningrate = 0.1)
using BoltzmannMachinesPlots
plotevaluation(monitor)
BoltzmannMachines.monitorexactloglikelihood!
— Methodmonitorexactloglikelihood!(monitor, bm, epoch, datadict)
Computes the mean exact log-likelihood in the Boltzmann Machine model bm
for the data sets in the DataDict datadict
and stores this information in the Monitor monitor
.
BoltzmannMachines.monitorfreeenergy!
— Methodmonitorfreeenergy!(monitor, rbm, epoch, datadict)
Computes the free energy for the datadict
's data sets in the RBM model rbm
and stores the information in the monitor
.
BoltzmannMachines.monitorloglikelihood!
— Methodmonitorloglikelihood!(monitor, rbm, epoch, datadict)
Estimates the log-likelihood of the datadict
's data sets in the RBM model rbm
with AIS and stores the values, together with information about the variance of the estimator, in the monitor
.
If there is more than one worker available, the computation is parallelized by default. Parallelization can be turned on or off with the optional boolean argument parallelized
.
For the other optional keyword arguments, see aislogimpweights
.
See also: loglikelihood
.
BoltzmannMachines.monitorlogproblowerbound!
— Methodmonitorlogproblowerbound!(monitor, dbm, epoch, datadict)
Estimates the lower bound of the log probability of the datadict
's data sets in the DBM dbm
with AIS and stores the values, together with information about the variance of the estimator, in the monitor
.
If there is more than one worker available, the computation is parallelized by default. Parallelization can be turned on or off with the optional boolean argument parallelized
.
For the other optional keyword arguments, see aislogimpweights
.
See also: logproblowerbound
.
BoltzmannMachines.monitorreconstructionerror!
— Methodmonitorreconstructionerror!(monitor, rbm, epoch, datadict)
Computes the reconstruction error for the data sets in the datadict
and the rbm
and stores the values in the monitor
.
BoltzmannMachines.monitorweightsnorm!
— Methodmonitorweightsnorm!(monitor, rbm, epoch)
Computes the L2-norm of the weights matrix and the bias vectors of the rbm
and stores the values in the monitor
. These values can give a hint how much the updates are changing the parameters during learning.
BoltzmannMachines.mostevenbatches
— Functionmostevenbatches(ntasks)
mostevenbatches(ntasks, nbatches)
Splits a number of tasks ntasks
into a number of batches nbatches
. The number of batches is by default min(nworkers(), ntasks)
. The returned result is a vector containing the numbers of tasks for each batch.
BoltzmannMachines.mostspecifictype
— Methodmostspecifictype(v)
Returns the most specific supertype for all elements in the vector v
.
BoltzmannMachines.newparticleslike
— Methodnewparticleslike(particles)
Creates new and uninitialized particles of the same dimensions as the given particles
.
BoltzmannMachines.next!
— Methodnext!(combination)
Sets the vector combination
, containing a sequence of the values 0.0 and 1.0, to the next combination of 0.0s and 1.0s. Returns false if the new combination consists only of zeros; true otherwise.
BoltzmannMachines.next!
— Methodnext!(particle)
Sets particle
to the next combination of nodes' activations. Returns false if the loop went through all combinations; true otherwise.
BoltzmannMachines.nextvisibles!
— Methodnextvisibles!(v, bm)
Sets v
to a new combination of visible nodes' activations for the bm
. Returns false, if there are no new combinations left; returns true otherwise.
nhiddennodes(rbm)
Returns the number of hidden nodes for an RBM.
BoltzmannMachines.nmodelparameters
— Methodnmodelparameters(bm)
Returns the number of parameters in the Boltzmann Machine model bm
.
BoltzmannMachines.nunits
— Methodnunits(bm)
Returns an integer vector that contains in the i'th entry the number of nodes in the i'th layer of the bm
.
BoltzmannMachines.nvisiblecombinations
— Methodnvisiblecombinations(bm)
Returns the number of possible combinations of visible nodes' activations for a given bm
that has a discrete distribution of visible nodes.
BoltzmannMachines.nvisiblenodes
— Methodnvisiblenodes(rbm)
Returns the number of visible nodes for an RBM.
BoltzmannMachines.oneornone_decode
— Methodoneornone_decode(x, categories)
Returns a dataset such that x .== oneornone_decode(oneornone_encode(x, categories), categories)
.
For more, see oneornone_encode
.
BoltzmannMachines.oneornone_encode
— Methodoneornone_encode(x, categories)
Expects a data set x
containing values 0.0, 1.0, 2.0 ... encoding the categories. Returns a data set that encodes the variables/columns in x
in multiple columns with only values 0.0 and 1.0, similar to the one-hot encoding with the difference that a zero is encoded as all-zeros.
The categories
can be specified as
- integer number if all variables have the same number of categories or as
- integer vector, containing for each variable the number of categories encoded.
See also oneornone_decode
for the reverse transformation.
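Example (a round-trip sketch; the category counts are illustrative):
using BoltzmannMachines
# Two categorical variables encoded as values 0.0/1.0/2.0:
# the first with 3 categories, the second with 2.
x = [0.0 1.0;
     2.0 0.0;
     1.0 1.0]
xenc = oneornone_encode(x, [3, 2])        # contains only values 0.0 and 1.0
all(x .== oneornone_decode(xenc, [3, 2])) # true, per the documented round-trip property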
BoltzmannMachines.piecewiselinearsequences
— Methodpiecewiselinearsequences(nsequences, nvariables; ...)
Generates a dataset consisting of samples with values that are piecewise linear functions of the variable index.
Optional named arguments: pbreak
, breakval
, see piecewiselinearsequencebundles
.
BoltzmannMachines.propagateforward
— Functionpropagateforward(rbm, datadict, factor)
Returns a new DataDict
containing the same labels as the given datadict
but as mapped values it contains the hidden potential in the rbm
of the original datasets. The factor is applied for calculating the hidden potential and is 1.0 by default.
BoltzmannMachines.randombatchmasks
— Methodrandombatchmasks(nsamples, batchsize)
Returns BitArray-Sets for the sample indices when training on a dataset with nsamples
samples using minibatches of size batchsize
.
BoltzmannMachines.ranges
— Methodranges(numbers)
Returns a vector of consecutive integer ranges, the first starting with 1. The i'th such range spans over numbers[i]
items.
BoltzmannMachines.reconstructionerror
— Functionreconstructionerror(rbm, x)
Computes the mean reconstruction error of the RBM on the dataset x
.
BoltzmannMachines.reversedrbm
— MethodReturns a GaussianBernoulliRBM (GBRBM) in which the hidden and visible nodes of the given bgrbm are switched, with a visible standard deviation of 1.
BoltzmannMachines.samplefrequencies
— Methodsamplefrequencies(x)
Returns a dictionary containing the rows of the data set x
as keys and their relative frequencies as values.
samplehidden!(h, rbm, v)
samplehidden!(h, rbm, v, factor)
Like samplehidden
, but stores the returned result in h
.
samplehidden(rbm, v)
samplehidden(rbm, v, factor)
Returns activations of the hidden nodes in the AbstractRBM rbm
, sampled from the state v
of the visible nodes. v
may be a vector or a matrix that contains the samples in its rows. For the factor
, see hiddenpotential(rbm, v, factor)
.
samplehiddenpotential!(h, rbm)
Samples the activation of the hidden nodes from the potential h
and stores the returned result in h
.
BoltzmannMachines.sampleparticles
— Functionsampleparticles(bm, nparticles, burnin)
Samples in the Boltzmann Machine model bm
by running nparticles
parallel, randomly initialized Gibbs chains for burnin
steps. Returns particles containing nparticles
generated samples. See also: Particles
.
BoltzmannMachines.samples
— Methodsamples(bm, nsamples; ...)
Generates nsamples
samples from a Boltzmann machine model bm
by running a Gibbs sampler. This can also be used for sampling from a conditional distribution (see argument conditions
below.)
Optional keyword arguments:
burnin: Number of Gibbs sampling steps, defaults to 50.
conditions: Vector{Pair{Int,Float64}}, containing pairs of variables and their values that are to be conditioned on, e.g. [1 => 1.0, 3 => 0.0]
samplelast: boolean to indicate whether to sample in the last step (true, default) or whether to use the activation potential.
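Example (a hedged sketch of unconditional and conditional sampling from a fitted model; values are illustrative):
using BoltzmannMachines
x = barsandstripes(100, 16)
dbm = fitdbm(x; nhiddens = [8; 4], epochs = 10)
# Unconditional samples ...
xgen = samples(dbm, 50)
# ... and samples with the first variable fixed to 1 and the third to 0.
xcond = samples(dbm, 50; conditions = [1 => 1.0, 3 => 0.0], burnin = 100)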
BoltzmannMachines.samplevisible!
— Methodsamplevisible!(v, rbm, h)
samplevisible!(v, rbm, h, factor)
Like samplevisible
, but stores the returned result in v
.
BoltzmannMachines.samplevisible
— Methodsamplevisible(rbm, h)
samplevisible(rbm, h, factor)
Returns activations of the visible nodes in the AbstractRBM rbm
, sampled from the state h
of the hidden nodes. h
may be a vector or a matrix that contains the samples in its rows. For the factor
, see visiblepotential(rbm, h, factor)
.
BoltzmannMachines.samplevisiblepotential!
— Methodsamplevisiblepotential!(v, rbm)
Samples the activation of the visible nodes from the potential v
and stores the returned result in v
.
BoltzmannMachines.setmonitorsup!
— MethodCreates monitors and sets the monitoring function in trainlayer
such that the monitoring is recorded in the newly created monitors. Returns the created monitors.
BoltzmannMachines.softmax0!
— Methodsoftmax0!(x)
softmax0!(x, varranges)
If x
is a vector, softmax0!(x)
will apply the softmax transformation to the vector [x; 0.0]
and store the results for the values of x
in x
. (The value for 0.0 is omitted since it is determined by 1 - sum(softmax!(x))
).
If x
is a matrix, the transformation will be applied to all rows of x
. If an additional vector varranges
with UnitRange
s of column indices is specified, the transformation will be applied to the groups of columns separately.
BoltzmannMachines.splitdata
— Methodsplitdata(x, ratio)
Splits the data set x
randomly in two data sets x1
and x2
, such that the fraction of samples in x2
is equal to (or as close as possible to) the given ratio
.
Example:
trainingdata, testdata = splitdata(data, 0.1) # Use 10 % as test data
BoltzmannMachines.stackrbms
— Methodstackrbms(x; ...)
Performs greedy layerwise training for Deep Belief Networks or greedy layerwise pretraining for Deep Boltzmann Machines and returns the trained model.
Optional keyword arguments (ordered by importance):
predbm: boolean indicating that the greedy layerwise training is pre-training for a DBM. If its value is false (default), a DBN is trained.
nhiddens: vector containing the number of nodes of the i'th hidden layer in the i'th entry
epochs: number of training epochs
learningrate: learning rate, default 0.005
batchsize: size of minibatches, defaults to 1
trainlayers: a vector of TrainLayer objects. With this argument it is possible to specify the training parameters for each layer/RBM individually. If the number of training epochs and the learning rate are not specified explicitly for a layer, the values of epochs and learningrate are used. For more information, see the help of TrainLayer.
monitoringdata: a data dictionary (see type DataDict). The data is propagated forward through the network to monitor higher levels. If a non-empty dictionary is given, the monitoring functions in the trainlayers arguments must accept a DataDict as third argument.
optimizer: an optimizer (of type AbstractOptimizer) that is used for computing the gradients when training the individual RBMs.
samplehidden: boolean indicating that subsequent layers are to be trained with sampled values instead of the deterministic potential. Using the deterministic potential (false) is the default.
See also: monitored_stackrbms
for a more convenient monitoring.
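Example (a hedged sketch contrasting DBN training and DBM pre-training; values are illustrative):
using BoltzmannMachines
x = barsandstripes(100, 16)
# Greedy layerwise training of a Deep Belief Network ...
dbn = stackrbms(x; nhiddens = [8; 4], epochs = 20)
# ... or pre-training for a DBM, followed by fine-tuning with traindbm!.
dbm = stackrbms(x; nhiddens = [8; 4], epochs = 20, predbm = true)
traindbm!(dbm, x; epochs = 30, learningrate = 0.05)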
BoltzmannMachines.stackrbms_preparetrainlayers
— MethodPrepares the layerwise training specifications for stackrbms
BoltzmannMachines.stackrbms_trainlayer
— MethodTrains a layer without partitioning for stackrbms
.
BoltzmannMachines.stackrbms_trainlayer
— MethodTrains a partitioned layer for stackrbms
.
BoltzmannMachines.stackrbms_trainlayers
— MethodThe layerwise training, using the specifications in trainlayers
.
BoltzmannMachines.top2latentdims
— Methodtop2latentdims(dbm, x)
Get a two-dimensional representation for all the samples/rows in the data set x
, employing a given dbm
for the dimension reduction. For achieving the dimension reduction, at first the mean-field activation induced by the samples in the hidden nodes of the last/top hidden layer of the dbm
is calculated. The mean-field activation of the top hidden nodes is logit-transformed to get a better separation. If the number of hidden nodes in the last hidden layer is greater than 2, a principal component analysis (PCA) is used on these logit-transformed mean-field values to obtain a two-dimensional representation. The result is a matrix with 2 columns, each row belonging to a sample/row in x
.
BoltzmannMachines.traindbm!
— Methodtraindbm!(dbm, x, particles, learningrate)
Trains the given dbm
for one epoch.
BoltzmannMachines.traindbm!
— Methodtraindbm!(dbm, x; ...)
Trains the dbm
(a BasicDBM
or a more general MultimodalDBM
) using the learning procedure for a general Boltzmann Machine with the training data set x
. A learning step consists of mean-field inference (positive phase), stochastic approximation by Gibbs Sampling (negative phase) and the parameter updates.
Optional keyword arguments (ordered by importance):
epochs: number of training epochs
learningrate/learningrates: a vector of learning rates for each epoch to update the weights and biases. The learning rates should decrease with the epochs, e.g. with the factor a / (b + epoch). If only one value is given as learningrate, a and b are 11.0 and 10.0, respectively.
batchsize: number of samples in mini-batches. No mini-batches are used by default, which means that the complete data set is used for calculating the gradient in each epoch.
nparticles: number of particles used for sampling, default 100
monitoring: A function that is executed after each training epoch. It has to accept the trained DBM and the current epoch as arguments.
BoltzmannMachines.trainrbm!
— Methodtrainrbm!(rbm, x)
Trains the given rbm
for one epoch using the data set x
. (See also function fitrbm
.)
Optional keyword arguments:
learningrate, cdsteps, sdlearningrate, upfactor, downfactor, optimizer: See documentation of function fitrbm.
chainstate: a matrix for holding the states of the RBM's hidden nodes. If it is specified, PCD is used.
BoltzmannMachines.unnormalizedlogprob
— Methodunnormalizedlogprob(mdbm, x; ...)
Estimates the mean unnormalized log probability of the samples (rows in x
) in the MultimodalDBM mdbm
by running the Annealed Importance Sampling (AIS) in a smaller modified DBM for each sample.
The named optional arguments for AIS can be specified here. (See aislogimpweights
)
unnormalizedprobhidden(rbm, h)
unnormalizedprobhidden(gbrbm, h)
Calculates the unnormalized probability of the rbm
's hidden nodes' activations given by h
.
BoltzmannMachines.unnormalizedproboddlayers
— FunctionComputes the unnormalized probability of the nodes in layers with odd indexes, i. e. p*(v, h2, h4, ...).
BoltzmannMachines.unnormalizedprobs
— Methodunnormalizedprobs(bm, samples)
Calculates the unnormalized probabilities for all samples
(vector of vectors), in the Boltzmann Machine bm
.
The visible nodes of the bm
must be Bernoulli distributed.
BoltzmannMachines.updateparameters!
— Methodupdateparameters!(rbm, optimizer)
Updates the RBM rbm
by walking a step in the direction of the gradient that has been computed by calling computegradient!
on optimizer
.
BoltzmannMachines.visibleinput!
— Methodvisibleinput!(v, rbm, h)
Like visibleinput
but stores the returned result in v
.
BoltzmannMachines.visibleinput
— Methodvisibleinput(rbm, h)
Computes the total input of the visible nodes in the AbstractXBernoulliRBM rbm
, given the activations h
of the hidden nodes. h
may be a vector or a matrix that contains the samples in its rows.
BoltzmannMachines.visiblepotential!
— Methodvisiblepotential!(v, rbm, h)
Like visiblepotential
but stores the returned result in v
.
BoltzmannMachines.visiblepotential
— Methodvisiblepotential(rbm, h)
visiblepotential(rbm, h, factor)
Returns the potential for activations of the visible nodes in the AbstractRBM rbm
, given the activations h
of the hidden nodes. h
may be a vector or a matrix that contains the samples in its rows. The potential is a deterministic value to which sampling can be applied to get the activations.
The total input can be scaled with the factor
. This is needed when pretraining the rbm
as part of a DBM.
In RBMs with Bernoulli distributed visible units, the potential of the visible nodes is the vector of probabilities for them to be turned on.
For a Binomial2BernoulliRBM, the visible units are sampled from a Binomial(2,p) distribution in the Gibbs steps. In this case, the potential is the vector of values for 2p. (The value is doubled to get a value in the same range as the sampled one.)
For GaussianBernoulliRBMs, the potential of the visible nodes is the vector of means of the Gaussian distributions for each node.
BoltzmannMachines.weightsinput!
— Methodweightsinput!(input, input2, dbm, particles)
Computes the input that results only from the weights (without biases) and the previous states in particles
for all nodes in the DBM dbm
and stores it in input
. The state of the particles
and the dbm
is not altered. input2
must have the same size as input
and particles
. For performance reasons, input2
is used as preallocated space for storing intermediate results.