BoltzmannMachines.AbstractOptimizer
— TypeThe AbstractOptimizer interface makes it possible to specify optimization procedures. It consists of three methods:
initialized(optimizer, bm): May be used for creating an optimizer that is specifically initialized for the Boltzmann machine bm. In particular, it may be used to allocate reusable space for the gradient. The default implementation simply returns the unmodified optimizer.
computegradient!(optimizer, v, vmodel, h, hmodel, rbm) or computegradient!(optimizer, meanfieldparticles, gibbsparticles, dbm) needs to be implemented for computing the gradient given the samples from the positive and negative phase.
updateparameters!(bm, optimizer) needs to be specified for taking the gradient step. The default implementation for RBMs expects the fields learningrate and gradient and adds learningrate * gradient to the given RBM.
BoltzmannMachines.AbstractRBM
— TypeAbstract supertype for all RBMs
BoltzmannMachines.AbstractTrainLayer
— TypeAbstract supertype for layerwise training specification. May be specifications for a normal RBM layer (see TrainLayer
) or multiple combined specifications for a partitioned layer (see TrainPartitionedLayer
).
BoltzmannMachines.AbstractXBernoulliRBM
— TypeAbstract super type for RBMs with binary and Bernoulli distributed hidden nodes.
BoltzmannMachines.BernoulliGaussianRBM
— TypeBernoulliGaussianRBM(weights, visbias, hidbias)
Encapsulates the parameters of an RBM with Bernoulli distributed visible nodes and Gaussian distributed hidden nodes. The standard deviation of the Gaussian distribution is 1.
BoltzmannMachines.BernoulliRBM
— TypeBernoulliRBM(weights, visbias, hidbias)
Encapsulates the parameters of an RBM with Bernoulli distributed nodes.
weights: matrix of weights with size (number of visible nodes, number of hidden nodes)
visbias: bias vector for visible nodes
hidbias: bias vector for hidden nodes
BoltzmannMachines.Binomial2BernoulliRBM
— TypeBinomial2BernoulliRBM(weights, visbias, hidbias)
Encapsulates the parameters of an RBM with 0/1/2-valued, Binomial (n=2) distributed visible nodes, and Bernoulli distributed hidden nodes. This model is equivalent to a BernoulliRBM in which every two visible nodes are connected with the same weights to each hidden node. The states (0,0) / (1,0) / (0,1) / (1,1) of the visible nodes connected with the same weights translate as states 0 / 1 / 1 / 2 in the Binomial2BernoulliRBM.
BoltzmannMachines.DataDict
— TypeA dictionary containing names of data sets as keys and the data sets (matrices with samples in rows) as values.
BoltzmannMachines.GaussianBernoulliRBM
— TypeGaussianBernoulliRBM(weights, visbias, hidbias, sd)
Encapsulates the parameters of an RBM with Gaussian distributed visible nodes and Bernoulli distributed hidden nodes.
BoltzmannMachines.GaussianBernoulliRBM2
— TypeGaussianBernoulliRBM2(weights, visbias, hidbias, sd)
Encapsulates the parameters of an RBM with Gaussian distributed visible nodes and Bernoulli distributed hidden nodes with the alternative energy formula proposed by KyungHyun Cho.
BoltzmannMachines.IntensityTransformation
— TypeEncapsulates all data needed to transform a vector of values into the interval [0.0, 1.0] by a linear, monotone transformation.
See intensities_encode
, intensities_decode
for the usage.
BoltzmannMachines.LoglikelihoodOptimizer
— TypeImplements the AbstractOptimizer
interface for optimizing the loglikelihood with stochastic gradient descent.
BoltzmannMachines.Monitor
— TypeA vector for collecting MonitoringItem
s during training.
BoltzmannMachines.MonitoringItem
— TypeEncapsulates the value of an evaluation calculated in one training epoch. If the evaluation depends on a dataset, the dataset's name can be specified also.
BoltzmannMachines.MultivariateBernoulliDistribution
— TypeMultivariateBernoulliDistribution(bm)
Calculates and stores the probabilities for all possible combinations of a multivariate Bernoulli distribution defined by a Boltzmann machine model with Bernoulli distributed visible nodes. Can be used for sampling from this distribution, see samples
.
BoltzmannMachines.NoRBM
— TypeSingleton placeholder for AbstractRBMs.
BoltzmannMachines.Particles
— TypeParticles
are an array of matrices. The i'th matrix contains in each row the vector of states of the nodes of the i'th layer of an RBM or a DBM. The set of rows with the same index defines an activation state in a Boltzmann Machine. Therefore, the size of the i'th matrix is (number of samples/particles, number of nodes in layer i).
BoltzmannMachines.PartitionedBernoulliDBM
— TypeA DBM with only Bernoulli distributed nodes which may contain partitioned layers.
BoltzmannMachines.PartitionedRBM
— TypePartitionedRBM(rbms)
Encapsulates several (parallel) AbstractRBMs that form one partitioned RBM. The nodes of the parallel RBMs are not connected between the RBMs.
BoltzmannMachines.StackedOptimizer
— TypeStackedOptimizer(optimizers)
Can be used for optimizing a stack of RBMs / a DBM by using the given vector of optimizers
(one for each RBM). For more information about the concept of optimizers, see AbstractOptimizer
.
BoltzmannMachines.TrainLayer
— MethodSpecify parameters for training one RBM-layer in a DBM.
Optional keyword arguments:
- The optional keyword arguments rbmtype, nhidden, epochs, learningrate/learningrates, sdlearningrate/sdlearningrates, categories, batchsize, pcd, cdsteps, startrbm and optimizer/optimizers are passed to fitrbm. For a detailed description, see there. If a negative value is specified for learningrate or epochs, this indicates that a corresponding default value should be used (a parameter defined by the call to stackrbms).
- monitoring: also like in fitrbm, but may take a DataDict as third argument (see function stackrbms and its argument monitoringdata).
- nvisible: Number of visible units in the RBM. Only relevant for partitioning. This parameter is derived as far as possible by stackrbms. For MultimodalDBMs with a partitioned first layer, it is necessary to specify the number of visible nodes for all but at most one partition in the input layer.
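Example (a hedged sketch of specifying per-layer pre-training parameters via the pretraining argument of fitdbm; all values are illustrative):
using BoltzmannMachines
x = barsandstripes(100, 16)
# Layerwise pre-training parameters, one TrainLayer per RBM layer.
# Values not set here fall back to the defaults passed to `fitdbm`.
dbm = fitdbm(x;
    pretraining = [TrainLayer(nhidden = 8, epochs = 30),
                   TrainLayer(nhidden = 4)],
    epochs = 15)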
BoltzmannMachines.TrainPartitionedLayer
— TypeEncapsulates a vector of TrainLayer
objects for training a partitioned layer.
BoltzmannMachines.aislogimpweights
— Methodaislogimpweights(rbm; ...)
Computes the logarithmised importance weights for estimating the ratio of the partition functions of the given rbm
to the RBM with zero weights, but same visible and hidden bias as the rbm
. This function implements the Annealed Importance Sampling algorithm (AIS) as described in section 4.1.3 of [Salakhutdinov, 2008].
Optional keyword arguments (for all types of Boltzmann Machines):
ntemperatures: Number of temperatures for annealing from the starting model to the target model, defaults to 100
temperatures: Vector of temperatures. By default ntemperatures ascending numbers, equally spaced from 0.0 to 1.0
nparticles: Number of parallel chains and calculated weights, defaults to 100
burnin: Number of steps to sample for the Gibbs transition between models
BoltzmannMachines.aislogimpweights
— Methodaislogimpweights(dbm; ...)
Computes the logarithmised importance weights in the Annealed Importance Sampling algorithm (AIS) for estimating the ratio of the partition functions of the given DBM dbm
to the base-rate DBM with all weights being zero and all biases equal to the biases of the dbm
.
Implements algorithm 4 in [Salakhutdinov+Hinton, 2012]. For DBMs with Bernoulli-distributed nodes only (i. e. here DBMs of type PartitionedBernoulliDBM
), it is possible to calculate the importance weights by summing out either the even layers (h1, h3, ...) or the odd layers (v, h2, h4, ...). In the first case, the nodes' activations in the odd layers are used to calculate the probability ratios, in the second case the even layers are used. If dbm
is of type PartitionedBernoulliDBM
, the optional keyword argument sumout
can be used to choose by specifying the values :odd
(default) or :even
. In the case of MultimodalDBM
s, it is not possible to choose and the second case applies there.
BoltzmannMachines.aislogimpweights
— Methodaislogimpweights(rbm1, rbm2; ...)
Computes the logarithmised importance weights for estimating the log-ratio log(Z2/Z1) for the partition functions Z1 and Z2 of rbm1
and rbm2
, respectively. Implements the procedure described in section 4.1.2 of [Salakhutdinov, 2008]. This requires that rbm1
and rbm2
are of the same type and have the same number of visible units.
BoltzmannMachines.aisprecision
— Functionaisprecision(logr, aissd, sdrange)
Returns the differences of the estimated logratio r
to the lower and upper bound of the range defined by the multiple sdrange
of the standard deviation of the ratio's estimator aissd
.
BoltzmannMachines.aisprecision
— Functionaisprecision(logimpweights, sdrange)
BoltzmannMachines.aisstandarddeviation
— MethodComputes the standard deviation of the AIS estimator (not logarithmised) (eq 4.10 in [Salakhutdinov+Hinton, 2012]) given the logarithmised importance weights.
BoltzmannMachines.aisupdatelogimpweights!
— MethodUpdates the logarithmized importance weights logimpweights
in AIS by adding the log ratio of unnormalized probabilities of the states of the odd layers in the PartitionedBernoulliDBM dbm
. The activation states of the DBM's nodes are given by the particles
. For performance reasons, the biases are specified separately.
BoltzmannMachines.akaikeinformationcriterion
— Methodakaikeinformationcriterion(bm, loglikelihood)
Calculates the Akaike information criterion for a Boltzmann Machine, given its loglikelihood
.
BoltzmannMachines.barsandstripes
— Methodbarsandstripes(nsamples, nvariables)
Generates a test data set. To see the structure in the data set, run e. g. reshape(barsandstripes(1, 16), 4, 4)
a few times.
Example from: MacKay, D. (2003). Information Theory, Inference, and Learning Algorithms
BoltzmannMachines.batchparallelized
— Methodbatchparallelized(f, n, op)
Distributes the work for executing the function f
n
times on all the available workers and reduces the results with the operator op
. f
is a function that receives a number (of tasks) and executes that many tasks.
Example:
batchparallelized(n -> aislogimpweights(dbm; nparticles = n), 100, vcat)
BoltzmannMachines.bayesianinformationcriterion
— Methodbayesianinformationcriterion(bm, nvariables, loglikelihood)
Calculates the Bayesian information criterion for a Boltzmann machine, given its loglikelihood and the number of samples.
BoltzmannMachines.bernoulliloglikelihoodbaserate
— Methodbernoulliloglikelihoodbaserate(nvariables)
Calculates the log-likelihood for a random sample in the "base-rate" BM with all parameters being zero and thus all visible units being independent and Bernoulli distributed.
BoltzmannMachines.bernoulliloglikelihoodbaserate
— Methodbernoulliloglikelihoodbaserate(x)
Calculates the log-likelihood for the data set x
in the "base-rate" BM with all weights being zero and visible bias set to the empirical probability of the samples' components in x
being 1.
BoltzmannMachines.blocksinnoise
— Methodblocksinnoise(nsamples, nvariables; ...)
Produces an artificial data set where there are sequences of consecutive 1s in half of the binary samples. The samples are labeled according to whether they belong to one of the nblocks blocks. The samples are otherwise randomly generated with Bernoulli distributed noise. The first return value is a matrix that has nsamples
rows and nvariables
columns containing zeros and ones as values. The second return value is a vector of labels.
Optional named arguments:
noise: Bernoulli distributed noise (probability for 1s)
blocklen: length of sequences of consecutive variables set to 1 in subgroups of samples
nblocks: number of different locations for sequences, i.e. the number of different subgroups with sequences of 1s
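Example (a brief usage sketch; the argument values are only illustrative):
using BoltzmannMachines
# 500 binary samples with 20 variables; half of the samples contain a block
# of 5 consecutive 1s at one of 2 possible locations, the rest is noise.
x, labels = blocksinnoise(500, 20; blocklen = 5, nblocks = 2, noise = 0.3)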
BoltzmannMachines.combined_monitoring
— MethodReturn a new function that does all the monitoring using the monitoring
function (or functions) and the monitoringdata
and stores the result in the given monitor
.
BoltzmannMachines.combinedbiases
— Methodcombinedbiases(dbm)
Returns a vector containing in the i'th element the bias vector for the i'th layer of the dbm
. For intermediate layers, visible and hidden biases are combined to a single bias vector.
BoltzmannMachines.computegradient!
— Methodcomputegradient!(optimizer, v, vmodel, h, hmodel, rbm)
Computes the gradient of the RBM rbm
given the hidden activation h
induced by the sample v
and the vectors vmodel
and hmodel
generated by sampling from the model. The result is stored in the optimizer
in such a way that it can be applied by a call to updateparameters!
. There is no return value.
For RBMs (excluding PartitionedRBMs), this means saving the gradient in an RBM of the same type in the field optimizer.gradient
.
BoltzmannMachines.converttomostspecifictype
— MethodConverts a vector to a vector of the most specific type that all elements share as common supertype.
BoltzmannMachines.copyannealed!
— Methodcopyannealed!(annealedrbm, rbm, temperature)
Copies all parameters that are to be annealed from the RBM rbm
to the RBM annealedrbm
and anneals them with the given temperature
.
BoltzmannMachines.correlations
— Methodcorrelations(datadict)
Creates and returns a dictionary with the same keys as the given datadict
. The values of the returned dictionary are the correlations of the samples in the datasets given as values in the datadict
.
BoltzmannMachines.crossvalidation
— Methodcrossvalidation(x, monitoredfit; ...)
Performs k-fold cross-validation, given
- the data set x and
- monitoredfit: a function that fits and evaluates a model. As arguments it must accept:
  - a training data set
  - a DataDict containing the evaluation data.
The return values of the calls to the monitoredfit
function are concatenated with vcat
. If the monitoredfit function returns Monitor
objects, crossvalidation
returns a combined Monitor
object that can be displayed by creating a cross-validation plot via BoltzmannMachinesPlots.crossvalidationplot
.
Optional named argument:
kfold: specifies the k in "k-fold" (defaults to 10).

crossvalidation(x, monitoredfit, pars; ...)
If additionally a vector of parameters pars is given, monitoredfit also expects an additional parameter from the parameter set.
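Example (a hedged sketch of a possible monitoredfit function built on monitored_fitrbm; the hyperparameter values are illustrative):
using BoltzmannMachines
x = barsandstripes(100, 16)
# `monitoredfit` receives a training fold and a DataDict with evaluation data
# and must return something that can be concatenated with `vcat`, here a Monitor.
monitoredfit = (xtrain, evaluationdata) -> begin
    monitor, rbm = monitored_fitrbm(xtrain;
        monitoring = monitorreconstructionerror!,
        monitoringdata = evaluationdata,
        nhidden = 8, epochs = 20)
    monitor
end
monitor = crossvalidation(x, monitoredfit; kfold = 5)
# BoltzmannMachinesPlots.crossvalidationplot(monitor) # for inspecting the result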
BoltzmannMachines.crossvalidationargs
— Methodcrossvalidationargs(x, pars...; )
Returns a tuple of argument vectors containing the parameters for a function such as the monitoredfit
argument in crossvalidation
.
Usage example: map(monitoredfit, crossvalidationargs(x)...)
Optional named argument:
kfold
: seecrossvalidation
.
BoltzmannMachines.curvebundles
— Methodcurvebundles(...)
Generates an example dataset that can be visualized as bundles of trend curves with added noise. Additional binary columns with labels may be added.
Optional named arguments:
nbundles: number of bundles
nperbundle: number of sequences per bundle
nvariables: number of variables in the sequences
noisesd: standard deviation of the noise added on all sequences
addlabels: add leading columns to the resulting dataset, specifying the membership to a bundle
pbreak: probability that an intermediate point in a sequence is a breakpoint, defaults to 0.2
breakval: a function that expects no input and generates a single (random) value for a defining point of a piecewise linear sequence. Defaults to rand.
Example:
To quickly grasp the idea, plot generated samples against the variable index:
x = BMs.curvebundles(nvariables = 10, nbundles = 3,
nperbundle = 4, noisesd = 0.03,
addlabels = true)
BoltzmannMachinesPlots.plotcurvebundles(x)
BoltzmannMachines.empiricalloglikelihood
— Methodempiricalloglikelihood(x, xgen)
empiricalloglikelihood(bm, x, nparticles)
empiricalloglikelihood(bm, x, nparticles, burnin)
Computes the mean empirical loglikelihood for the data set x
. The probability of a sample is estimated to be the empirical probability of the sample in a dataset generated by the model. This data set can be given as xgen
or it is generated by running a Gibbs sampler with nparticles
for burnin
steps (default 5) in the Boltzmann Machine bm
. Throws an error if a sample in x
is not contained in the generated data set.
BoltzmannMachines.emptyfunc
— MethodA function accepting everything, doing nothing. Usable as default argument for functions as arguments.
BoltzmannMachines.energy
— Methodenergy(rbm, v, h)
Computes the energy of the configuration of the visible nodes v
and the hidden nodes h
, specified as vectors, in the rbm
.
energyzerohiddens(rbm, v)
Computes the energy for the visible activations v
in the RBM rbm
, if all hidden nodes have zero activation, i. e. yields the same as energy(rbm, v, zeros(rbm.hidbias))
.
BoltzmannMachines.exactloglikelihood
— Functionexactloglikelihood(rbm, x)
Computes the mean log-likelihood for the given dataset x
and the RBM rbm
exactly. The log of the partition function is computed exactly by exactlogpartitionfunction(rbm)
. Besides that, the function simply calls loglikelihood(rbm, x)
.
BoltzmannMachines.exactloglikelihood
— Functionexactloglikelihood(dbm, x)
exactloglikelihood(dbm, x, logz)
Computes the mean log-likelihood for the given dataset x
and the DBM dbm
exactly. If the value of the log of the partition function of the dbm
is not supplied as argument logz
, it will be computed by exactlogpartitionfunction(dbm)
.
BoltzmannMachines.exactlogpartitionfunction
— Methodexactlogpartitionfunction(bgrbm)
Calculates the log of the partition function of the BernoulliGaussianRBM bgrbm
exactly. The execution time grows exponentially with the number of visible nodes.
BoltzmannMachines.exactlogpartitionfunction
— Methodexactlogpartitionfunction(rbm)
Calculates the log of the partition function of the BernoulliRBM rbm
exactly. The execution time grows exponentially with the minimum of (number of visible nodes, number of hidden nodes).
BoltzmannMachines.exactlogpartitionfunction
— Methodexactlogpartitionfunction(gbrbm)
Calculates the log of the partition function of the GaussianBernoulliRBM gbrbm
exactly. The execution time grows exponentially with the number of hidden nodes.
BoltzmannMachines.exactlogpartitionfunction
— Methodexactlogpartitionfunction(mdbm)
Calculates the log of the partition function of the MultimodalDBM mdbm
exactly. The execution time grows exponentially with the total number of nodes in hidden layers with odd indexes (i. e. h1, h3, ...).
BoltzmannMachines.exactlogpartitionfunction
— Methodexactlogpartitionfunction(dbm)
Calculates the log of the partition function of the DBM dbm
exactly. If the number of hidden layers is even, the execution time grows exponentially with the total number of nodes in hidden layers with odd indexes (i. e. h1, h3, ...). If the number of hidden layers is odd, the execution time grows exponentially with the minimum of (number of nodes in layers with even index, number of nodes in layers with odd index).
BoltzmannMachines.fitdbm
— Methodfitdbm(x; ...)
Fits a (multimodal) DBM to the data set x
. The procedure consists of two parts: First a stack of RBMs is pretrained in a greedy layerwise manner (see stackrbms(x)
). Then the weights of all layers are jointly trained using the general Boltzmann Machine learning procedure (fine tuning, see traindbm!(dbm,x)
).
Optional keyword arguments (ordered by importance):
nhiddens: vector that defines the number of nodes in the hidden layers of the DBM. The default value specifies two hidden layers with the same size as the visible layer.
epochs: number of training epochs for joint training, defaults to 10
epochspretraining: number of training epochs for pretraining, defaults to epochs
learningrate: learning rate for pretraining. Also used as initial value for the decaying fine tuning learning rate.
learningratepretraining: learning rate for pretraining, defaults to learningrate
learningratefinetuning: initial learning rate for fine tuning. The learning rate for fine tuning is decaying with the number of epochs, starting with the given value for learningratefinetuning or the learningrate. (For more details see traindbm!.)
learningratesfinetuning: The learning rate for fine tuning is by default decaying with the number of epochs, starting with the value of the learningrate. (For more details see traindbm!.) The value of the learning rate for each epoch of fine tuning can be specified via the argument learningratesfinetuning as a vector with an entry for each of the epochs.
learningrates: deprecated, otherwise equivalent to learningratesfinetuning
batchsize: number of samples in mini-batches for pretraining and fine tuning. By default, a batchsize of 1 is used for pretraining. For fine tuning, no mini-batches are used by default, which means that the complete data set is used for calculating the gradient in each epoch.
batchsizepretraining: batchsize for pretraining, defaults to 1
batchsizefinetuning: batchsize for fine tuning. Defaults to the number of samples in the data set, i.e., no mini-batches are used.
nparticles: number of particles used for sampling during joint training of the DBM, default 100
pretraining: The arguments for layerwise pretraining can be specified for each layer individually. This is done via a vector of TrainLayer objects. (For a detailed description of the possible parameters, see help for TrainLayer.) If the number of training epochs and the learning rate are not specified explicitly for a layer, the values of epochspretraining, learningratepretraining and batchsizepretraining are used.
monitoring: Monitoring function accepting a dbm and the number of epochs, returning nothing. Used for monitoring the fine-tuning. See also monitored_fitdbm for a more convenient way of monitoring.
monitoringdatapretraining: a DataDict that contains data used for monitoring the pretraining (see argument monitoringdata of stackrbms).
optimizer/optimizers: an optimizer or a vector of optimizers for each epoch (see AbstractOptimizer) used for fine-tuning.
optimizerpretraining: an optimizer used for pre-training. Defaults to the optimizer.
BoltzmannMachines.fitrbm
— Methodfitrbm(x; ...)
Fits an RBM model to the data set x
, using Stochastic Gradient Descent (SGD) with Contrastive Divergence (CD), and returns it.
Optional keyword arguments (ordered by importance):
rbmtype: the type of the RBM that is to be trained. This must be a subtype of AbstractRBM and defaults to BernoulliRBM.
nhidden: number of hidden units for the returned RBM
epochs: number of training epochs
learningrate/learningrates: The learning rate for the weights and biases can be specified as a single value, used throughout all epochs, or as a vector of learningrates that contains a value for each epoch. Defaults to 0.005.
batchsize: number of samples that are used for making one step in the stochastic gradient descent optimizer algorithm. Default is 1.
pcd: indicating whether Persistent Contrastive Divergence (PCD) is to be used (true, default) or simple CD that initializes the Gibbs chain with the training sample (false)
cdsteps: number of Gibbs sampling steps for (persistent) contrastive divergence, defaults to 1
monitoring: a function that is executed after each training epoch. It takes an RBM and the epoch as arguments. See also monitored_fitrbm for another way of monitoring.
categories: only relevant if rbmtype = Softmax0BernoulliRBM. The number of categories as Int, if all variables have the same number of categories, or as Vector{Int} that contains the number of categories of the i'th categorical variable in the i'th entry.
upfactor, downfactor: If this function is used for pretraining a part of a DBM, it is necessary to multiply the weights of the RBM with factors.
sdlearningrate/sdlearningrates: learning rate(s) for the standard deviation if training a GaussianBernoulliRBM or GaussianBernoulliRBM2. Ignored for other types of RBMs. It usually must be much smaller than the learning rates for the weights. By default it is 0.0, which means that the standard deviation is not learned.
startrbm: start training with the parameters of the given RBM. If this argument is specified, nhidden and rbmtype are ignored.
optimizer/optimizers: an object of type AbstractOptimizer or a vector of them for each epoch. If specified, the optimization is performed as implemented by the given optimizer type. By default, the LoglikelihoodOptimizer with the learningrate/learningrates and sdlearningrate/sdlearningrates is used. For other types of optimizers, the learning rates must be specified in the optimizer. For more information on how to write your own optimizer, see AbstractOptimizer.
See also: monitored_fitrbm
for a convenient monitoring of the training.
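Example (a minimal usage sketch; the hyperparameter values are illustrative):
using BoltzmannMachines
using Random; Random.seed!(0)
x = barsandstripes(200, 16)
# Train a BernoulliRBM with persistent contrastive divergence (the default).
rbm = fitrbm(x; nhidden = 8, epochs = 30, learningrate = 0.005)
# Generate new samples from the fitted model.
xgen = samples(rbm, 10)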
BoltzmannMachines.freeenergy
— Methodfreeenergy(rbm, x)
Computes the average free energy of the samples in the dataset x
for the AbstractRBM rbm
.
BoltzmannMachines.freeenergy
— Methodfreeenergy(rbm, v)
Computes the free energy of the sample v
(a vector) for the rbm
.
BoltzmannMachines.freeenergydiffs
— Methodfreeenergydiffs(rbm1, rbm2, x)
Computes the differences of the free energy for the samples in the dataset x
regarding the RBM models rbm1
and rbm2
. Returns a vector of differences.
BoltzmannMachines.gaussianloglikelihoodbaserate
— Methodgaussianloglikelihoodbaserate(x)
Calculates the mean log-likelihood for the data set x
with all variables and components of the variables being independent and Gaussian distributed. The mean and standard deviation of the i'th variable are estimated by the mean and standard deviation of the values of the i'th component of the sample vectors.
BoltzmannMachines.gibbssample!
— Functiongibbssample!(particles, bm, nsteps)
Performs Gibbs sampling on the particles
in the Boltzmann machine model bm
for nsteps
steps. (See also: Particles
.) When sampling in multimodal deep Boltzmann machines, in-between layers are assumed to contain only Bernoulli-distributed nodes.
BoltzmannMachines.gibbssamplecond!
— Functiongibbssamplecond!(particles, bm, cond, nsteps)
Conditional Gibbs sampling on the particles
in the bm
for nsteps
Gibbs sampling steps.
The variables that are marked in the indexing vector cond
are fixed to the initial values in particles
during sampling. This way, conditional sampling is performed on these variables.
See also: Particles, initparticles

hiddeninput!(h, rbm, v)
Like hiddeninput
, but stores the returned result in h
.
hiddeninput(rbm, v)
Computes the total input of the hidden units in the AbstractRBM rbm
, given the activations of the visible units v
. v
may be a vector or a matrix that contains the samples in its rows.
hiddenpotential!(hh, rbm, vv)
hiddenpotential!(hh, rbm, vv, factor)
Like hiddenpotential
, but stores the returned result in hh
.
hiddenpotential(rbm, v)
hiddenpotential(rbm, v, factor)
Returns the potential for activations of the hidden nodes in the AbstractRBM rbm
, given the activations v
of the visible nodes. v
may be a vector or a matrix that contains the samples in its rows. The potential is a deterministic value to which sampling can be applied to get the activations. In RBMs with Bernoulli distributed hidden units, the potential of the hidden nodes is the vector of probabilities for them to be turned on.
The total input can be scaled with the factor
. This is needed when pretraining the rbm
as part of a DBM.
BoltzmannMachines.initcombination
— MethodReturns a particle for the DBM, initialized with zeros.
BoltzmannMachines.initcombinationoddlayersonly
— Methodinitcombinationoddlayersonly(dbm)
Creates and zero-initializes a particle for layers with odd indexes in the dbm
.
BoltzmannMachines.initialized
— Methodinitialized(optimizer, rbm)
Returns an AbstractOptimizer
similar to the given optimizer
that can be used to optimize the AbstractRBM
rbm
.
BoltzmannMachines.initparticles
— Methodinitparticles(bm, nparticles; biased = false)
Creates particles for Gibbs sampling in a Boltzmann machine bm
. (See also: Particles
)
For Bernoulli distributed nodes, the particles are initialized with Bernoulli(p) distributed values. If biased == false
, p is 0.5, otherwise the results of applying the sigmoid function to the bias values are used as values for the nodes' individual p's.
Gaussian nodes are sampled from a normal distribution if biased == false
. If biased == true
the mean of the Gaussian distribution is shifted by the bias vector and the standard deviation of the nodes is used for sampling.
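Example (a hedged sketch of initializing particles and running Gibbs sampling on them; values are illustrative):
using BoltzmannMachines
x = barsandstripes(100, 16)
dbm = fitdbm(x; nhiddens = [8; 4], epochs = 10)
# 100 randomly initialized particles, then 50 steps of Gibbs sampling.
particles = initparticles(dbm, 100)
gibbssample!(particles, dbm, 50)
# particles[1] now holds 100 sampled visible states, one per row.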
BoltzmannMachines.initrbm
— Functioninitrbm(x, nhidden)
initrbm(x, nhidden, rbmtype)
Creates an RBM with nhidden hidden units and initializes its weights for training on dataset x
. rbmtype
can be a subtype of AbstractRBM
, default is BernoulliRBM
.
BoltzmannMachines.initvisiblebias
— Methodinitvisiblebias(x)
Returns sensible initial values for the visible bias for training an RBM on the data set x
.
BoltzmannMachines.intensities
— Functionintensities(x)
intensities(x, q1)
intensities(x, q1, q2)
Performs a linear, monotone transformation on the data set x
to fit the values into the interval [0.0, 1.0]. For more information see intensities_encode
, intensities_decode
.
BoltzmannMachines.intensities_decode
— Methodintensities_decode(x, its)
Backtransforms the intensity values in the data set x
(values in the interval [0.0, 1.0]) to the range of the original values and returns the new data set or vector. The its argument contains the information about the transformation, as it is returned by intensities_encode.
Note that the range is truncated if the original transformation used other quantiles than 0.0 or 1.0 (minimum and maximum).
Example:
x = randn(5, 4)
xint, its = intensities_encode(x, 0.05)
dbm = fitdbm(xint)
xgen = samples(dbm, 5)
intensities_decode(xgen, its)
BoltzmannMachines.intensities_encode
— Functionintensities_encode(x)
intensities_encode(x, q1)
intensities_encode(x, q1, q2)
Performs a linear, monotone transformation on the data set x
to fit it into the interval [0.0, 1.0]. It returns the transformed data set as a first result and the information to reverse the transformation as a second result. If you are only interested in the transformed values, you can use the function intensities
.
If q1
is specified, all values below or equal to the quantile specified by q1
are mapped to 0.0. All values above or equal to the quantile specified by q2
are mapped to 1.0. q2
defaults to 1 - q1
.
The quantiles are calculated per column/variable.
See also intensities_decode
for the reverse transformation.
BoltzmannMachines.joindbms
— Functionjoindbms(dbms)
joindbms(dbms, visibleindexes)
Joins the DBMs given by the vector dbms
by joining each layer of RBMs. The weights cross-linking the models are initialized with zeros.
If the vector visibleindexes
is specified, it is supposed to contain in the i'th entry an indexing vector that determines the positions in the combined DBM for the visible nodes of the i'th of the dbms
. By default the indexes of the visible nodes are assumed to be consecutive.
BoltzmannMachines.joinrbms
— Methodjoinrbms(rbms)
joinrbms(rbms, visibleindexes)
Joins the given vector of rbms
of the same type to form one RBM of this type and returns the joined RBM. The weights cross-linking the models are initialized with zeros.
BoltzmannMachines.joinvecs
— Functionjoinvecs(vecs, indexes)
Combines the Float-vectors in vecs
into one vector. The indexes
vector must contain in the i'th entry the indexes that the elements of the i'th vector in
vecs are supposed to have in the resulting combined vector.
BoltzmannMachines.joinweights
— Methodjoinweights(rbms)
joinweights(rbms, visibleindexes)
Combines the weight matrices of the RBMs in the vector rbms
into one weight matrix and returns it.
If the vector visibleindexes
is specified, it is supposed to contain in the i'th entry an indexing vector that determines the positions in the combined weight matrix for the visible nodes of the i'th of the rbms
. By default the indexes of the visible nodes are assumed to be consecutive.
BoltzmannMachines.log1pexp
— Methodlog1pexp(x)
Calculates log(1+exp(x)). For sufficiently large values of x, the approximation log(1+exp(x)) ≈ x is used. This is useful to prevent overflow.
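Example (a small illustration of the overflow protection; the results in the comments are approximate):
using BoltzmannMachines
log(1 + exp(800.0))               # Inf, because exp(800.0) overflows
BoltzmannMachines.log1pexp(800.0) # ≈ 800.0, using the approximation for large x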
BoltzmannMachines.loglikelihood
— Functionloglikelihood(rbm, x)
loglikelihood(rbm, x, logz)
Computes the average log-likelihood of an RBM on a given dataset x
. Uses logz
as value for the log of the partition function or estimates the partition function with Annealed Importance Sampling.
BoltzmannMachines.loglikelihood
— Functionloglikelihood(dbm, x; ...)
Estimates the mean log-likelihood of the DBM on the data set x
with Annealed Importance Sampling. This requires a separate run of AIS for each sample.
BoltzmannMachines.loglikelihooddiff
— Methodloglikelihooddiff(rbm1, rbm2, x)
loglikelihooddiff(rbm1, rbm2, x, logzdiff)
loglikelihooddiff(rbm1, rbm2, x, logimpweights)
Computes the difference of the log-likelihood functions of the two RBMs on the data matrix x
, averaged over the samples. For this purpose, the partition function ratio Z2/Z1 is estimated by AIS unless the importance weights are specified by the parameter logimpweights
or the difference in the log partition functions is given by logzdiff
.
The first model is better than the second if the returned value is positive.
BoltzmannMachines.logmeanexp
— MethodPerforms numerically stable computation of the mean on log-scale.
BoltzmannMachines.logpartitionfunction
— Methodlogpartitionfunction(bm; ...)
logpartitionfunction(bm, logr)
Calculates or estimates the log of the partition function of the Boltzmann Machine bm
.
r
is an estimator of the ratio of the bm
's partition function Z to the partition function Z0 of the reference BM with zero weights but same biases as the given bm
. In case of a GaussianBernoulliRBM, the reference model also has the same standard deviation parameter. The estimated partition function of the Boltzmann Machine is Z = r * Z0 with r
being the mean of the importance weights. Therefore, the log of the estimated partition function is log(Z) = log(r) + log(Z0).
If the log of r
is not given as argument logr
, Annealed Importance Sampling (AIS) is performed to get a value for it. In this case, the optional arguments for AIS can be specified (see aislogimpweights
), and the optional boolean argument parallelized
can be used to turn on batch-parallelized computing of the importance weights.
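Example (a hedged sketch of the typical AIS workflow for estimating the partition function and then the log-likelihood; values are illustrative):
using BoltzmannMachines
x = barsandstripes(100, 16)
rbm = fitrbm(x; nhidden = 8, epochs = 20)
# Estimate log(Z) directly via AIS with default parameters ...
logz = logpartitionfunction(rbm)
# ... or reuse explicitly computed importance weights.
logimpweights = aislogimpweights(rbm; ntemperatures = 100, nparticles = 100)
logz2 = logpartitionfunction(rbm, logmeanexp(logimpweights))
# Use the estimate for computing the log-likelihood.
loglik = loglikelihood(rbm, x, logz)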
BoltzmannMachines.logpartitionfunctionzeroweights
— Methodlogpartitionfunctionzeroweights(bm)
Returns the value of the log of the partition function of the Boltzmann Machine that results when one sets the weights of bm
to zero, and leaves the other parameters (biases) unchanged.
BoltzmannMachines.logproblowerbound
— Functionlogproblowerbound(dbm, x; ...)
logproblowerbound(dbm, x, logimpweights; ...)
logproblowerbound(dbm, x, logz; ...)
Estimates the mean of the variational lower bound for the log probability of the DBM on a given dataset x
as described in Equation 38 in [Salakhutdinov, 2015]. The logarithmized partition function can be specified directly as logz
or by giving the logimpweights
from estimating the partition function with the Annealed Importance Sampling algorithm (AIS). (See aislogimpweights
.) If neither logimpweights
nor logz
is given, the partition function will be estimated by AIS with default parameters.
Optional keyword argument:
- The approximate posterior distribution may be given as argument
mu
or is calculated by the mean-field method.
BoltzmannMachines.logsumexp
— MethodPerforms numerically stable summation on log-scale.
BoltzmannMachines.meanfield
— Functionmeanfield(dbm, x)
meanfield(dbm, x, eps)
Computes the mean-field approximation for the data set x
and returns a matrix of particles for the DBM. The number of particles is equal to the number of samples in x
. eps
is the convergence criterion for the fix-point iteration, default 0.001. It is assumed that all nodes in in-between-layers are Bernoulli distributed.
BoltzmannMachines.means
— Methodmeans(datadict)
Creates and returns a dictionary with the same keys as the given datadict
. The values of the returned dictionary are the samples' means in the datadict
.
BoltzmannMachines.monitorcordiff!
— Methodmonitorcordiff!(monitor, rbm, epoch, cordict)
Generates samples and records the distance of their correlation matrix to the correlation matrices for (original) datasets contained in the cordict
.
BoltzmannMachines.monitored_fitdbm
— Methodmonitored_fitdbm(x; ...)
This function performs the same training procedure as fitdbm
, but facilitates monitoring: It fits an DBM model on the data set x
using greedy layerwise pre-training and subsequent fine-tuning and collects all the monitoring results during the training. The monitoring results are stored in a vector of Monitor
s, containing one element for each RBM layer and as last element the monitoring results for fine-tuning. (Monitoring elements from the pre-training of partitioned layers are again vectors, containing one element for each partition.) Both the collected monitoring results and the trained DBM are returned.
See also: monitored_stackrbms
, monitored_traindbm!
Optional keyword arguments:
monitoring: Used for fine-tuning. A monitoring function or a vector of monitoring functions that accept four arguments:
  - a Monitor object, which is used to collect the result of the monitoring function(s)
  - the DBM
  - the epoch
  - the data used for monitoring.
monitoringdata: a DataDict, which contains the data that is used for the monitoring. For the pre-training of the first layer and for fine-tuning, the data is passed directly to the monitoring function(s). For monitoring the pre-training of the higher RBM layers, the data is propagated through the layers below first. By default, the training data x is used for monitoring.
monitoringpretraining: Used for pre-training. A four-argument function like monitoring, but accepts as second argument an RBM. By default there is no monitoring of the pre-training.
monitoringdatapretraining: Monitoring data used only for pre-training. Defaults to monitoringdata.
Other specified keyword arguments are simply handed to fitdbm. For more information, please see the documentation there.
Example:
using Random; Random.seed!(1)
xtrain, xtest = splitdata(barsandstripes(100, 4), 0.5)
monitors, dbm = monitored_fitdbm(xtrain;
monitoringpretraining = monitorreconstructionerror!,
monitoring = monitorlogproblowerbound!,
monitoringdata = DataDict("Training data" => xtrain, "Test data" => xtest),
# some arguments for `fitdbm`:
nhiddens = [4; 3], learningratepretraining = 0.01,
learningrate = 0.05, epochspretraining = 100, epochs = 50)
using BoltzmannMachinesPlots
plotevaluation(monitors[1]) # view monitoring of first RBM
plotevaluation(monitors[2]) # view monitoring of second RBM
plotevaluation(monitors[3]) # view monitoring fine-tuning
BoltzmannMachines.monitored_fitrbm
— Methodmonitored_fitrbm(x; ...)
This function performs the same training procedure as fitrbm
, but facilitates monitoring: It fits an RBM model on the data set x
and collects monitoring results during the training in one Monitor
object. Both the collected monitoring results and the trained RBM are returned.
Optional keyword arguments:
monitoring: A monitoring function or a vector of monitoring functions that accept four arguments:
  - a Monitor object, which is used to collect the result of the monitoring function(s)
  - the RBM
  - the epoch
  - the data used for monitoring.
monitoringdata: a DataDict, which contains the data that is used for monitoring and passed to the monitoring function(s). By default, the training data x is used for monitoring.
Other specified keyword arguments are simply handed to fitrbm. For more information, please see the documentation there.
Example:
using Random; Random.seed!(0)
xtrain, xtest = splitdata(barsandstripes(100, 4), 0.3)
monitor, rbm = monitored_fitrbm(xtrain;
monitoring = [monitorreconstructionerror!, monitorexactloglikelihood!],
monitoringdata = DataDict("Training data" => xtrain, "Test data" => xtest),
# some arguments for `fitrbm`:
nhidden = 10, learningrate = 0.002, epochs = 200)
using BoltzmannMachinesPlots
plotevaluation(monitor, monitorreconstructionerror)
plotevaluation(monitor, monitorexactloglikelihood)
BoltzmannMachines.monitored_stackrbms
— Methodmonitored_stackrbms(x; ...)
This function performs the same training procedure as stackrbms
, but facilitates monitoring: It trains a stack of RBMs using the data set x
as input to the first layer and collects all the monitoring results during the training in a vector of Monitor
s, containing one element for each RBM layer. (Elements for partitioned layers are again vectors, containing one element for each partition.) Both the collected monitoring results and the stack of trained RBMs are returned.
Optional keyword arguments:
monitoring: A monitoring function or a vector of monitoring functions that accept four arguments:
  - a Monitor object, which is used to collect the result of the monitoring function(s)
  - the RBM
  - the epoch
  - the data used for monitoring.
monitoringdata: a DataDict, which contains the data that is used for monitoring. For the first layer, the data is passed directly to the monitoring function(s). For monitoring the training of the higher layers, the data is propagated through the layers below first. By default, the training data x is used for monitoring.
Other specified keyword arguments are simply handed to stackrbms. For more information, please see the documentation there.
Example:
using Random; Random.seed!(0)
xtrain, xtest = splitdata(barsandstripes(100, 4), 0.5)
monitors, rbm = monitored_stackrbms(xtrain;
monitoring = monitorreconstructionerror!,
monitoringdata = DataDict("Training data" => xtrain, "Test data" => xtest),
# some arguments for `stackrbms`:
nhiddens = [4; 3], learningrate = 0.005, epochs = 100)
using BoltzmannMachinesPlots
plotevaluation(monitors[1]) # view monitoring of first RBM
plotevaluation(monitors[2]) # view monitoring of second RBM
BoltzmannMachines.monitored_traindbm!
— Methodmonitored_traindbm!(dbm, x; ...)
This function performs the same training procedure as traindbm!
, but facilitates monitoring: It performs fine-tuning of the given dbm
on the data set x
and collects monitoring results during the training in one Monitor
object. Both the collected monitoring results and the trained dbm
are returned.
Optional keyword arguments:
monitoring: A monitoring function or a vector of monitoring functions that accept four arguments:
  - a Monitor object, which is used to collect the result of the monitoring function(s)
  - the DBM
  - the epoch
  - the data used for monitoring.
monitoringdata: a DataDict, which contains the data that is used for monitoring and passed to the monitoring function(s). By default, the training data x is used for monitoring.
Other specified keyword arguments are simply handed to traindbm!. For more information, please see the documentation there.
Example:
using Random; Random.seed!(0)
xtrain, xtest = splitdata(barsandstripes(100, 4), 0.1)
dbm = stackrbms(xtrain; predbm = true, epochs = 20)
monitor, dbm = monitored_traindbm!(dbm, xtrain;
monitoring = monitorlogproblowerbound!,
monitoringdata = DataDict("Training data" => xtrain, "Test data" => xtest),
# some arguments for `traindbm!`:
epochs = 100, learningrate = 0.1)
using BoltzmannMachinesPlots
plotevaluation(monitor)
BoltzmannMachines.monitorexactloglikelihood!
— Methodmonitorexactloglikelihood!(monitor, bm, epoch, datadict)
Computes the mean exact log-likelihood in the Boltzmann Machine model bm
for the data sets in the DataDict datadict
and stores this information in the Monitor monitor
.
BoltzmannMachines.monitorfreeenergy!
— Methodmonitorfreeenergy!(monitor, rbm, epoch, datadict)
Computes the free energy for the datadict
's data sets in the RBM model rbm
and stores the information in the monitor
.
BoltzmannMachines.monitorloglikelihood!
— Methodmonitorloglikelihood!(monitor, rbm, epoch, datadict)
Estimates the log-likelihood of the datadict
's data sets in the RBM model rbm
with AIS and stores the values, together with information about the variance of the estimator, in the monitor
.
If there is more than one worker available, the computation is parallelized by default. Parallelization can be turned on or off with the optional boolean argument parallelized
.
For the other optional keyword arguments, see aislogimpweights
.
See also: loglikelihood
.
BoltzmannMachines.monitorlogproblowerbound!
— Methodmonitorlogproblowerbound!(monitor, dbm, epoch, datadict)
Estimates the lower bound of the log probability of the datadict
's data sets in the DBM dbm
with AIS and stores the values, together with information about the variance of the estimator, in the monitor
.
If there is more than one worker available, the computation is parallelized by default. Parallelization can be turned on or off with the optional boolean argument parallelized
.
For the other optional keyword arguments, see aislogimpweights
.
See also: logproblowerbound
.
BoltzmannMachines.monitorreconstructionerror!
— Methodmonitorreconstructionerror!(monitor, rbm, epoch, datadict)
Computes the reconstruction error for the data sets in the datadict
and the rbm
and stores the values in the monitor
.
BoltzmannMachines.monitorweightsnorm!
— Methodmonitorweightsnorm!(monitor, rbm, epoch)
Computes the L2-norm of the weights matrix and the bias vectors of the rbm
and stores the values in the monitor
. These values can give a hint how much the updates are changing the parameters during learning.
BoltzmannMachines.mostevenbatches
— Functionmostevenbatches(ntasks)
mostevenbatches(ntasks, nbatches)
Splits a number of tasks ntasks
into a number of batches nbatches
. The number of batches is by default min(nworkers(), ntasks)
. The returned result is a vector containing the numbers of tasks for each batch.
BoltzmannMachines.mostspecifictype
— Methodmostspecifictype(v)
Returns the most specific supertype for all elements in the vector v
.
BoltzmannMachines.newparticleslike
— Methodnewparticleslike(particles)
Creates new and uninitialized particles of the same dimensions as the given particles
.
BoltzmannMachines.next!
— Methodnext!(combination)
Sets the vector combination
, containing a sequence of the values 0.0 and 1.0, to the next combination of 0.0s and 1.0s. Returns false if the new combination consists only of zeros; true otherwise.
BoltzmannMachines.next!
— Methodnext!(particle)
Sets particle
to the next combination of nodes' activations. Returns false if the loop went through all combinations; true otherwise.
BoltzmannMachines.nextvisibles!
— Methodnextvisibles!(v, bm)
Sets v
to a new combination of visible nodes' activations for the bm
. Returns false, if there are no new combinations left; returns true otherwise.
nhiddennodes(rbm)
Returns the number of hidden nodes for an RBM.
BoltzmannMachines.nmodelparameters
— Methodnmodelparameters(bm)
Returns the number of parameters in the Boltzmann Machine model bm
.
BoltzmannMachines.nunits
— Methodnunits(bm)
Returns an integer vector that contains in the i'th entry the number of nodes in the i'th layer of the bm
.
BoltzmannMachines.nvisiblecombinations
— Methodnvisiblecombinations(bm)
Returns the number of possible combinations of visible nodes' activations for a given bm
that has a discrete distribution of visible nodes.
BoltzmannMachines.nvisiblenodes
— Methodnvisiblenodes(rbm)
Returns the number of visible nodes for an RBM.
BoltzmannMachines.oneornone_decode
— Methodoneornone_decode(x, categories)
Returns a dataset such that x .== oneornone_decode(oneornone_encode(x, categories), categories)
.
For more, see oneornone_encode
.
BoltzmannMachines.oneornone_encode
— Methodoneornone_encode(x, categories)
Expects a data set x
containing values 0.0, 1.0, 2.0 ... encoding the categories. Returns a data set that encodes the variables/columns in x
in multiple columns with only values 0.0 and 1.0, similar to the one-hot encoding with the difference that a zero is encoded as all-zeros.
The categories
can be specified as
- integer number if all variables have the same number of categories or as
- integer vector, containing for each variable the number of categories encoded.
See also oneornone_decode
for the reverse transformation.
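Example (a round-trip sketch; the category counts are illustrative):
using BoltzmannMachines
# Two categorical variables encoded as values 0.0/1.0/2.0:
# the first with 3 categories, the second with 2.
x = [0.0 1.0;
     2.0 0.0;
     1.0 1.0]
xenc = oneornone_encode(x, [3, 2])        # contains only values 0.0 and 1.0
all(x .== oneornone_decode(xenc, [3, 2])) # true, per the documented round-trip property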
BoltzmannMachines.piecewiselinearsequences
— Methodpiecewiselinearsequences(nsequences, nvariables; ...)
Generates a dataset consisting of samples with values that are piecewise linear functions of the variable index.
Optional named arguments: pbreak
, breakval
, see piecewiselinearsequencebundles
.
BoltzmannMachines.propagateforward
— Functionpropagateforward(rbm, datadict, factor)
Returns a new DataDict
containing the same labels as the given datadict
but as mapped values it contains the hidden potential in the rbm
of the original datasets. The factor is applied for calculating the hidden potential and is 1.0 by default.
BoltzmannMachines.randombatchmasks
— Methodrandombatchmasks(nsamples, batchsize)
Returns BitArray-Sets for the sample indices when training on a dataset with nsamples
samples using minibatches of size batchsize
.
BoltzmannMachines.ranges
— Methodranges(numbers)
Returns a vector of consecutive integer ranges, the first starting with 1. The i'th such range spans over numbers[i]
items.
BoltzmannMachines.reconstructionerror
— Functionreconstructionerror(rbm, x)
Computes the mean reconstruction error of the RBM on the dataset x
.
BoltzmannMachines.reversedrbm
— MethodReturns a GaussianBernoulliRBM (GBRBM) in which the hidden and visible nodes of the given bgrbm are switched, with a visible standard deviation of 1.
BoltzmannMachines.samplefrequencies
— Methodsamplefrequencies(x)
Returns a dictionary containing the rows of the data set x
as keys and their relative frequencies as values.
samplehidden!(h, rbm, v)
samplehidden!(h, rbm, v, factor)
Like samplehidden
, but stores the returned result in h
.
samplehidden(rbm, v)
samplehidden(rbm, v, factor)
Returns activations of the hidden nodes in the AbstractRBM rbm
, sampled from the state v
of the visible nodes. v
may be a vector or a matrix that contains the samples in its rows. For the factor
, see hiddenpotential(rbm, v, factor)
.
samplehiddenpotential!(h, rbm)
Samples the activation of the hidden nodes from the potential h
and stores the returned result in h
.
BoltzmannMachines.sampleparticles
— Functionsampleparticles(bm, nparticles, burnin)
Samples in the Boltzmann Machine model bm
by running nparticles
parallel, randomly initialized Gibbs chains for burnin
steps. Returns particles containing nparticles
generated samples. See also: Particles
.
BoltzmannMachines.samples
— Methodsamples(bm, nsamples; ...)
Generates nsamples
samples from a Boltzmann machine model bm
by running a Gibbs sampler. This can also be used for sampling from a conditional distribution (see argument conditions
below.)
Optional keyword arguments:
burnin: Number of Gibbs sampling steps, defaults to 50.
conditions: Vector{Pair{Int,Float64}}, containing pairs of variables and their values that are to be conditioned on, e.g. [1 => 1.0, 3 => 0.0]
samplelast: boolean to indicate whether to sample in the last step (true, default) or whether to use the activation potential.
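Example (a hedged sketch of unconditional and conditional sampling from a fitted model; values are illustrative):
using BoltzmannMachines
x = barsandstripes(100, 16)
dbm = fitdbm(x; nhiddens = [8; 4], epochs = 10)
# Unconditional samples ...
xgen = samples(dbm, 50)
# ... and samples with the first variable fixed to 1 and the third to 0.
xcond = samples(dbm, 50; conditions = [1 => 1.0, 3 => 0.0], burnin = 100)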
BoltzmannMachines.samplevisible!
— Methodsamplevisible!(v, rbm, h)
samplevisible!(v, rbm, h, factor)
Like samplevisible
, but stores the returned result in v
.
BoltzmannMachines.samplevisible
— Methodsamplevisible(rbm, h)
samplevisible(rbm, h, factor)
Returns activations of the visible nodes in the AbstractRBM rbm
, sampled from the state h
of the hidden nodes. h
may be a vector or a matrix that contains the samples in its rows. For the factor
, see visiblepotential(rbm, h, factor)
.
BoltzmannMachines.samplevisiblepotential!
— Methodsamplevisiblepotential!(v, rbm)
Samples the activation of the visible nodes from the potential v
and stores the returned result in v
.
BoltzmannMachines.setmonitorsup!
— MethodCreates monitors and sets the monitoring function in trainlayer
such that the monitoring is recorded in the newly created monitors. Returns the created monitors.
BoltzmannMachines.softmax0!
— Methodsoftmax0!(x)
softmax0!(x, varranges)
If x
is a vector, softmax0!(x)
will apply the softmax transformation to the vector [x; 0.0]
and store the results for the values of x
in x
. (The value for 0.0 is omitted since it is determined by 1 - sum(softmax!(x))
).
If x
is a matrix, the transformation will be applied to all rows of x
. If an additional vector varranges
with UnitRange
s of column indices is specified, the transformation will be applied to the groups of columns separately.
BoltzmannMachines.splitdata
— Methodsplitdata(x, ratio)
Splits the data set x
randomly in two data sets x1
and x2
, such that the fraction of samples in x2
is equal to (or as close as possible to) the given ratio
.
Example:
trainingdata, testdata = splitdata(data, 0.1) # Use 10 % as test data
BoltzmannMachines.stackrbms
— Methodstackrbms(x; ...)
Performs greedy layerwise training for Deep Belief Networks or greedy layerwise pretraining for Deep Boltzmann Machines and returns the trained model.
Optional keyword arguments (ordered by importance):
predbm: boolean indicating that the greedy layerwise training is pre-training for a DBM. If its value is false (default), a DBN is trained.
nhiddens: vector containing the number of nodes of the i'th hidden layer in the i'th entry
epochs: number of training epochs
learningrate: learning rate, default 0.005
batchsize: size of minibatches, defaults to 1
trainlayers: a vector of TrainLayer objects. With this argument it is possible to specify the training parameters for each layer/RBM individually. If the number of training epochs and the learning rate are not specified explicitly for a layer, the values of epochs and learningrate are used. For more information, see the help of TrainLayer.
monitoringdata: a data dictionary (see type DataDict). The data is propagated forward through the network to monitor higher levels. If a non-empty dictionary is given, the monitoring functions in the trainlayers arguments must accept a DataDict as third argument.
optimizer: an optimizer (of type AbstractOptimizer) that is used for computing the gradients when training the individual RBMs.
samplehidden: boolean indicating that subsequent layers are to be trained with sampled values instead of the deterministic potential. Using the deterministic potential (false) is the default.
See also: monitored_stackrbms
for a more convenient monitoring.
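Example (a hedged sketch contrasting DBN training and DBM pre-training; values are illustrative):
using BoltzmannMachines
x = barsandstripes(100, 16)
# Greedy layerwise training of a Deep Belief Network ...
dbn = stackrbms(x; nhiddens = [8; 4], epochs = 20)
# ... or pre-training for a DBM, followed by fine-tuning with traindbm!.
dbm = stackrbms(x; nhiddens = [8; 4], epochs = 20, predbm = true)
traindbm!(dbm, x; epochs = 30, learningrate = 0.05)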
BoltzmannMachines.stackrbms_preparetrainlayers
— MethodPrepares the layerwise training specifications for stackrbms
BoltzmannMachines.stackrbms_trainlayer
— MethodTrains a layer without partitioning for stackrbms
.
BoltzmannMachines.stackrbms_trainlayer
— MethodTrains a partitioned layer for stackrbms
.
BoltzmannMachines.stackrbms_trainlayers
— MethodThe layerwise training, using the specifications in trainlayers
.
BoltzmannMachines.top2latentdims
— Methodtop2latentdims(dbm, x)
Get a two-dimensional representation for all the samples/rows in the data set x
, employing a given dbm
for the dimension reduction. For achieving the dimension reduction, at first the mean-field activation induced by the samples in the hidden nodes of the last/top hidden layer of the dbm
is calculated. The mean-field activation of the top hidden nodes is logit-transformed to get a better separation. If the number of hidden nodes in the last hidden layer is greater than 2, a principal component analysis (PCA) is used on these logit-transformed mean-field values to obtain a two-dimensional representation. The result is a matrix with 2 columns, each row belonging to a sample/row in x
.
BoltzmannMachines.traindbm!
— Methodtraindbm!(dbm, x, particles, learningrate)
Trains the given dbm
for one epoch.
BoltzmannMachines.traindbm!
— Methodtraindbm!(dbm, x; ...)
Trains the dbm
(a BasicDBM
or a more general MultimodalDBM
) using the learning procedure for a general Boltzmann Machine with the training data set x
. A learning step consists of mean-field inference (positive phase), stochastic approximation by Gibbs Sampling (negative phase) and the parameter updates.
Optional keyword arguments (ordered by importance):
epochs: number of training epochs
learningrate/learningrates: a vector of learning rates for each epoch to update the weights and biases. The learning rates should decrease with the epochs, e.g. with the factor a / (b + epoch). If only one value is given as learningrate, a and b are 11.0 and 10.0, respectively.
batchsize: number of samples in mini-batches. No mini-batches are used by default, which means that the complete data set is used for calculating the gradient in each epoch.
nparticles: number of particles used for sampling, default 100
monitoring: A function that is executed after each training epoch. It has to accept the trained DBM and the current epoch as arguments.
BoltzmannMachines.trainrbm!
— Methodtrainrbm!(rbm, x)
Trains the given rbm
for one epoch using the data set x
. (See also function fitrbm
.)
Optional keyword arguments:
learningrate, cdsteps, sdlearningrate, upfactor, downfactor, optimizer: See documentation of function fitrbm.
chainstate: a matrix for holding the states of the RBM's hidden nodes. If it is specified, PCD is used.
BoltzmannMachines.unnormalizedlogprob
— Methodunnormalizedlogprob(mdbm, x; ...)
Estimates the mean unnormalized log probability of the samples (rows in x
) in the MultimodalDBM mdbm
by running the Annealed Importance Sampling (AIS) in a smaller modified DBM for each sample.
The named optional arguments for AIS can be specified here. (See aislogimpweights
)
unnormalizedprobhidden(rbm, h)
unnormalizedprobhidden(gbrbm, h)
Calculates the unnormalized probability of the rbm
's hidden nodes' activations given by h
.
BoltzmannMachines.unnormalizedproboddlayers
— FunctionComputes the unnormalized probability of the nodes in layers with odd indexes, i. e. p*(v, h2, h4, ...).
BoltzmannMachines.unnormalizedprobs
— Methodunnormalizedprobs(bm, samples)
Calculates the unnormalized probabilities for all samples
(vector of vectors), in the Boltzmann Machine bm
.
The visible nodes of the bm
must be Bernoulli distributed.
BoltzmannMachines.updateparameters!
— Methodupdateparameters!(rbm, optimizer)
Updates the RBM rbm
by walking a step in the direction of the gradient that has been computed by calling computegradient!
on optimizer
.
BoltzmannMachines.visibleinput!
— Methodvisibleinput!(v, rbm, h)
Like visibleinput
but stores the returned result in v
.
BoltzmannMachines.visibleinput
— Methodvisibleinput(rbm, h)
Computes the total input of the visible nodes in the AbstractXBernoulliRBM rbm
, given the activations h
of the hidden nodes. h
may be a vector or a matrix that contains the samples in its rows.
BoltzmannMachines.visiblepotential!
— Methodvisiblepotential!(v, rbm, h)
Like visiblepotential
but stores the returned result in v
.
BoltzmannMachines.visiblepotential
— Methodvisiblepotential(rbm, h)
visiblepotential(rbm, h, factor)
Returns the potential for activations of the visible nodes in the AbstractRBM rbm
, given the activations h
of the hidden nodes. h
may be a vector or a matrix that contains the samples in its rows. The potential is a deterministic value to which sampling can be applied to get the activations.
The total input can be scaled with the factor
. This is needed when pretraining the rbm
as part of a DBM.
In RBMs with Bernoulli distributed visible units, the potential of the visible nodes is the vector of probabilities for them to be turned on.
For a Binomial2BernoulliRBM, the visible units are sampled from a Binomial(2,p) distribution in the Gibbs steps. In this case, the potential is the vector of values for 2p. (The value is doubled to get a value in the same range as the sampled one.)
For GaussianBernoulliRBMs, the potential of the visible nodes is the vector of means of the Gaussian distributions for each node.
BoltzmannMachines.weightsinput!
— Methodweightsinput!(input, input2, dbm, particles)
Computes the input that results only from the weights (without biases) and the previous states in particles
for all nodes in the DBM dbm
and stores it in input
. The state of the particles
and the dbm
is not altered. input2
must have the same size as input
and particles
. For performance reasons, input2
is used as preallocated space for storing intermediate results.