# Encoders & Decoders

`AutoEncoderToolkit.jl` provides a set of predefined encoders and decoders that can be used to define custom (variational) autoencoder architectures.

## Encoders

The tree structure of the encoder types looks like this (🧱 represents concrete types):

- `AbstractEncoder`
    - `AbstractDeterministicEncoder`
        - `Encoder` 🧱
    - `AbstractVariationalEncoder`
        - `AbstractGaussianEncoder`
            - `AbstractGaussianLinearEncoder`
                - `JointGaussianEncoder` 🧱
            - `AbstractGaussianLogEncoder`
                - `JointGaussianLogEncoder` 🧱

### `Encoder`

`AutoEncoderToolkit.Encoder` (Type)

`struct Encoder{E<:Union{Flux.Chain,Flux.Dense}} <: AbstractDeterministicEncoder`

Default encoder function for deterministic autoencoders. The `encoder` network is used to map the input data directly into the latent space representation.

**Fields**

- `encoder::Union{Flux.Chain,Flux.Dense}`: The primary neural network used to process input data and map it into a latent space representation.

**Example**

`enc = Encoder(Flux.Chain(Dense(784, 400, relu), Dense(400, 20)))`

`AutoEncoderToolkit.Encoder` (Method)

`(encoder::Encoder)(x)`

Forward propagate the input `x` through the `Encoder` to obtain the encoded representation in the latent space.

**Arguments**

- `x::Array`: Input data to be encoded.

**Returns**

- `z`: Encoded representation of the input data in the latent space.

**Description**

This method allows for a direct call on an instance of `Encoder` with the input data `x`. It runs the input through the encoder network and outputs the encoded representation in the latent space.

**Example**

```
enc = Encoder(...)
z = enc(some_input)
```

**Note**

Ensure that the input x matches the expected dimensionality of the encoder's input layer.

### `JointGaussianEncoder`

`AutoEncoderToolkit.JointGaussianEncoder` (Type)

`struct JointGaussianEncoder <: AbstractGaussianLinearEncoder`

Encoder function for variational autoencoders where the same `encoder` network is used to map to the latent space mean `µ` and standard deviation `σ`.

**Fields**

- `encoder::Flux.Chain`: The primary neural network used to process input data and map it into a latent space representation.
- `µ::Flux.Dense`: A dense layer mapping from the output of the `encoder` to the mean of the latent space.
- `σ::Flux.Dense`: A dense layer mapping from the output of the `encoder` to the standard deviation of the latent space.

**Example**

```
enc = JointGaussianEncoder(
Flux.Chain(Dense(784, 400, relu)), Flux.Dense(400, 20), Flux.Dense(400, 20)
)
```

`AutoEncoderToolkit.JointGaussianEncoder` (Method)

`(encoder::JointGaussianEncoder)(x::AbstractArray)`

Forward propagate the input `x` through the `JointGaussianEncoder` to obtain the mean (`µ`) and standard deviation (`σ`) of the latent space.

**Arguments**

- `x::AbstractArray`: Input data to be encoded.

**Returns**

- A NamedTuple `(µ=µ, σ=σ,)` where:
    - `µ`: Mean of the latent space after passing the input through the encoder and subsequently through the `µ` layer.
    - `σ`: Standard deviation of the latent space after passing the input through the encoder and subsequently through the `σ` layer.

**Description**

This method allows for a direct call on an instance of `JointGaussianEncoder` with the input data `x`. It first runs the input through the encoder network, then maps the output of the last encoder layer to both the mean and standard deviation of the latent space.

**Example**

```
je = JointGaussianEncoder(...)
µ, σ = je(some_input)
```

**Note**

Ensure that the input x matches the expected dimensionality of the encoder's input layer.
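
To make the data flow concrete, the forward pass described above can be sketched in plain Julia without Flux. This is an illustration under stated assumptions, not the package's implementation: `dense`, `trunk`, `head_µ`, `head_σ`, and `joint_forward` are hypothetical names, the weights are random, and `softplus` is just one possible way to keep `σ` non-negative.

```julia
using Random

# Hypothetical helpers, not part of AutoEncoderToolkit.jl.
relu(x) = max(zero(x), x)
softplus(x) = log1p(exp(x))

# A dense layer is an affine map followed by an elementwise activation.
dense(W, b, act) = x -> act.(W * x .+ b)

rng = MersenneTwister(42)
n_input, n_hidden, n_latent = 8, 5, 2

# Shared trunk, followed by two heads that both read the trunk output.
trunk  = dense(randn(rng, Float32, n_hidden, n_input), zeros(Float32, n_hidden), relu)
head_µ = dense(randn(rng, Float32, n_latent, n_hidden), zeros(Float32, n_latent), identity)
head_σ = dense(randn(rng, Float32, n_latent, n_hidden), zeros(Float32, n_latent), softplus)

# Mirrors the documented return value: a NamedTuple with fields µ and σ.
function joint_forward(x)
    h = trunk(x)
    return (µ = head_µ(h), σ = head_σ(h))
end

x = randn(rng, Float32, n_input, 3)   # three samples stored as columns
out = joint_forward(x)
```

Both outputs have one column per input sample and `n_latent` rows, which is why the result can be destructured as `µ, σ = ...` in the example above.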

### `JointGaussianLogEncoder`

`AutoEncoderToolkit.JointGaussianLogEncoder` (Type)

`struct JointGaussianLogEncoder <: AbstractGaussianLogEncoder`

Default encoder function for variational autoencoders where the same `encoder` network is used to map to the latent space mean `µ` and log standard deviation `logσ`.

**Fields**

- `encoder::Flux.Chain`: The primary neural network used to process input data and map it into a latent space representation.
- `µ::Union{Flux.Dense,Flux.Chain}`: A dense layer or a chain of layers mapping from the output of the `encoder` to the mean of the latent space.
- `logσ::Union{Flux.Dense,Flux.Chain}`: A dense layer or a chain of layers mapping from the output of the `encoder` to the log standard deviation of the latent space.

**Example**

```
enc = JointGaussianLogEncoder(
Flux.Chain(Dense(784, 400, relu)), Flux.Dense(400, 20), Flux.Dense(400, 20)
)
```

`AutoEncoderToolkit.JointGaussianLogEncoder` (Method)

`(encoder::JointGaussianLogEncoder)(x)`

This method forward propagates the input `x` through the `JointGaussianLogEncoder` to compute the mean (`µ`) and log standard deviation (`logσ`) of the latent space.

**Arguments**

- `x::Array{Float32}`: The input data to be encoded.

**Returns**

- A NamedTuple `(µ=µ, logσ=logσ,)` where:
    - `µ`: The mean of the latent space. This is computed by passing the input through the encoder and subsequently through the `µ` layer.
    - `logσ`: The log standard deviation of the latent space. This is computed by passing the input through the encoder and subsequently through the `logσ` layer.

**Description**

This method allows for a direct call on an instance of `JointGaussianLogEncoder` with the input data `x`. It first processes the input through the encoder network, then maps the output of the last encoder layer to both the mean and log standard deviation of the latent space.

**Example**

```
je = JointGaussianLogEncoder(...)
mu, logσ = je(some_input)
```

**Note**

Ensure that the input x matches the expected dimensionality of the encoder's input layer.
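
An encoder that returns `logσ` leaves the conversion to a standard deviation to the caller. The plain-Julia sketch below shows that conversion together with the usual reparameterized sampling step `z = µ + σ .* ε`. The numeric values and the helper name `sample_latent` are hypothetical, and the package may perform this step differently.

```julia
using Random

# Hypothetical helper: draw z = µ + exp(logσ) .* ε with ε ~ N(0, I).
function sample_latent(rng, µ, logσ)
    σ = exp.(logσ)                       # exp gives σ > 0 for any finite logσ
    ε = randn(rng, eltype(µ), size(µ))   # standard normal noise, same shape as µ
    return µ .+ σ .* ε
end

# Made-up encoder output: 3 latent dimensions (rows), 2 samples (columns).
µ    = Float32[0.0 1.0; 0.5 -1.0; 2.0 0.0]
logσ = Float32[-1.0 0.0; 0.0 -2.0; 0.5 -0.5]

z = sample_latent(MersenneTwister(1), µ, logσ)
```

Working with `logσ` lets the network output any real number while the implied standard deviation stays positive.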

## Decoders

The tree structure of the decoder types looks like this (🧱 represents concrete types):

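- `AbstractDecoder`
    - `AbstractDeterministicDecoder`
        - `Decoder` 🧱
    - `AbstractVariationalDecoder`
        - `BernoulliDecoder` 🧱
        - `CategoricalDecoder` 🧱
        - `AbstractGaussianDecoder`
            - `SimpleGaussianDecoder` 🧱
            - `AbstractGaussianLinearDecoder`
                - `JointGaussianDecoder` 🧱
                - `SplitGaussianDecoder` 🧱
            - `AbstractGaussianLogDecoder`
                - `JointGaussianLogDecoder` 🧱
                - `SplitGaussianLogDecoder` 🧱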

### `Decoder`

`AutoEncoderToolkit.Decoder` (Type)

`struct Decoder{D<:Flux.Chain} <: AbstractDeterministicDecoder`

Default decoder function for deterministic autoencoders. The `decoder` network is used to map the latent space representation directly back to the original data space.

**Fields**

- `decoder::Flux.Chain`: The primary neural network used to process the latent space representation and map it back to the data space.

**Example**

`dec = Decoder(Flux.Chain(Dense(20, 400, relu), Dense(400, 784)))`

`AutoEncoderToolkit.Decoder` (Method)

`(decoder::Decoder)(z::AbstractArray)`

Forward propagate the encoded representation `z` through the `Decoder` to obtain the reconstructed input data.

**Arguments**

- `z::AbstractArray`: Encoded representation in the latent space.

**Returns**

- `x_reconstructed`: Reconstructed version of the original input data after decoding from the latent space.

**Description**

This method allows for a direct call on an instance of `Decoder` with the encoded data `z`. It runs the encoded representation through the decoder network and outputs the reconstructed version of the original input data.

**Example**

```
dec = Decoder(...)
x_reconstructed = dec(encoded_representation)
```

**Note**

Ensure that the input z matches the expected dimensionality of the decoder's input layer.

### `BernoulliDecoder`

`AutoEncoderToolkit.BernoulliDecoder` (Type)

`BernoulliDecoder{D<:Flux.Chain} <: AbstractVariationalDecoder`

A decoder structure for variational autoencoders (VAEs) that models the output data as a Bernoulli distribution. This is typically used when the outputs of the decoder are probabilities.

**Fields**

- `decoder::Flux.Chain`: The primary neural network used to process the latent space and map it to the output (or reconstructed) space.

**Description**

`BernoulliDecoder` represents a VAE decoder that models the output data as a Bernoulli distribution. It's commonly used when the outputs of the decoder are probabilities, such as in a binary classification task or when modeling binary data. Unlike a Gaussian decoder, there's no need for separate paths or operations on the mean or log standard deviation.

**Note**

Ensure the last layer of the decoder outputs a value between 0 and 1, as this is required for a Bernoulli distribution.
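
One common way to satisfy this requirement is a logistic sigmoid on the final layer. The plain-Julia sketch below (hypothetical names, made-up values) shows the sigmoid squashing arbitrary pre-activations into (0, 1) and how such probabilities enter a Bernoulli log-likelihood for binary data. It illustrates the idea only and is not the package's loss function.

```julia
# Logistic sigmoid: maps any real number into the open interval (0, 1).
sigmoid(x) = 1 / (1 + exp(-x))

# Bernoulli log-likelihood of binary data x given probabilities p:
# the sum over entries of x*log(p) + (1 - x)*log(1 - p).
bernoulli_loglik(x, p) = sum(@. x * log(p) + (1 - x) * log(1 - p))

logits = [-3.0, -0.5, 0.0, 2.0]   # made-up pre-activations of the last layer
p = sigmoid.(logits)              # valid Bernoulli parameters

x = [0.0, 0.0, 1.0, 1.0]          # made-up binary data
ll = bernoulli_loglik(x, p)
```

If the last layer produced values outside (0, 1), the logarithms above would be undefined or infinite, which is why the constraint matters.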

`AutoEncoderToolkit.BernoulliDecoder` (Method)

`(decoder::BernoulliDecoder)(z::AbstractArray)`

Maps the given latent representation `z` through the `BernoulliDecoder` network to reconstruct the original input.

**Arguments**

- `z::AbstractArray`: The latent space representation to be decoded. This can be a vector or a matrix, where each column represents a separate sample from the latent space of a VAE.

**Returns**

- A NamedTuple `(p=p,)` where `p` is an array representing the output of the decoder, which should resemble the original input to the VAE (post encoding and sampling from the latent space).

**Description**

This function processes the latent space representation `z` using the neural network defined in the `BernoulliDecoder` struct. The aim is to decode or reconstruct the original input from this representation.

**Note**

Ensure that the latent space representation z matches the expected input dimensionality for the BernoulliDecoder.

### `CategoricalDecoder`

`AutoEncoderToolkit.CategoricalDecoder` (Type)

`CategoricalDecoder{D<:Flux.Chain} <: AbstractVariationalDecoder`

A decoder structure for variational autoencoders (VAEs) that models the output data as a categorical distribution. This is typically used when the outputs of the decoder are categorical variables encoded as one-hot vectors.

**Fields**

- `decoder::Flux.Chain`: The primary neural network used to process the latent space and map it to the output (or reconstructed) space.

**Description**

`CategoricalDecoder` represents a VAE decoder that models the output data as a categorical distribution. It's commonly used when the outputs of the decoder are categorical variables, such as multi-class one-hot encoded vectors. Unlike a Gaussian decoder, there's no need for separate paths or operations on the mean or log standard deviation.

**Note**

Ensure the last layer of the decoder outputs a probability distribution over the categories, as this is required for a categorical distribution. This can be done using a softmax activation function, for example.
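
As a sketch of what "a probability distribution over the categories" means here, the plain-Julia function below applies a softmax to each column of a matrix of logits. The name `softmax_cols` and the values are hypothetical; in a Flux model an existing softmax layer would normally play this role.

```julia
# Column-wise softmax: each column becomes a probability vector.
function softmax_cols(logits::AbstractMatrix)
    # Subtracting the column maximum leaves the result unchanged
    # but avoids overflow in exp for large logits.
    shifted = logits .- maximum(logits; dims=1)
    expd = exp.(shifted)
    return expd ./ sum(expd; dims=1)
end

# Made-up logits: 3 categories (rows) for 2 samples (columns).
logits = [2.0 0.0; 1.0 0.0; 0.1 0.0]
p = softmax_cols(logits)
```

Every column of `p` is positive and sums to one, so it can be read as the parameters of a categorical distribution for that sample.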

`AutoEncoderToolkit.CategoricalDecoder` (Method)

`(decoder::CategoricalDecoder)(z::AbstractArray)`

Maps the given latent representation `z` through the `CategoricalDecoder` network to reconstruct the original input.

**Arguments**

- `z::AbstractArray`: The latent space representation to be decoded. This can be a vector or a matrix, where each column represents a separate sample from the latent space of a VAE.

**Returns**

- A NamedTuple `(p=p,)` where `p` is an array representing the output of the decoder, which should resemble the original input to the VAE (post encoding and sampling from the latent space).

**Description**

This function processes the latent space representation `z` using the neural network defined in the `CategoricalDecoder` struct. The aim is to decode or reconstruct the original input from this representation.

**Note**

Ensure that the latent space representation z matches the expected input dimensionality for the CategoricalDecoder.

### `SimpleGaussianDecoder`

`AutoEncoderToolkit.SimpleGaussianDecoder` (Type)

`SimpleGaussianDecoder{D} <: AbstractGaussianDecoder`

A straightforward decoder structure for variational autoencoders (VAEs) that contains only a single decoder network.

**Fields**

- `decoder::Flux.Chain`: The primary neural network used to process the latent space and map it to the output (or reconstructed) space.

**Description**

`SimpleGaussianDecoder` represents a basic VAE decoder without explicit components for the latent space's mean (`µ`) or log standard deviation (`logσ`). It's commonly used when the VAE's latent space distribution is implicitly defined, and there's no need for separate paths or operations on the mean or log standard deviation.

`AutoEncoderToolkit.SimpleGaussianDecoder` (Method)

`(decoder::SimpleGaussianDecoder)(z::AbstractVecOrMat)`

Maps the given latent representation `z` through the `SimpleGaussianDecoder` network to reconstruct the original input.

**Arguments**

- `z::AbstractArray`: The latent space representation to be decoded. This can be a vector or a matrix, where each column represents a separate sample from the latent space of a VAE.

**Returns**

- A NamedTuple `(µ=µ,)` where `µ` is an array representing the output of the decoder, which should resemble the original input to the VAE (post encoding and sampling from the latent space).

**Description**

This function processes the latent space representation `z` using the neural network defined in the `SimpleGaussianDecoder` struct. The aim is to decode or reconstruct the original input from this representation.

**Example**

```
decoder = SimpleGaussianDecoder(...)
z = ... # some latent space representation
output = decoder(z)
```

**Note**

Ensure that the latent space representation z matches the expected input dimensionality for the SimpleGaussianDecoder.

### `JointGaussianDecoder`

`AutoEncoderToolkit.JointGaussianDecoder` (Type)

`JointGaussianDecoder{D<:Flux.Chain,L<:Flux.Dense} <: AbstractGaussianLinearDecoder`

An extended decoder structure for VAEs that incorporates separate layers for mapping from the latent space to both its mean (`µ`) and standard deviation (`σ`).

**Fields**

- `decoder::Flux.Chain`: The primary neural network used to process the latent space before determining its mean and standard deviation.
- `µ::Flux.Dense`: A dense layer that maps from the output of the `decoder` to the mean of the latent space.
- `σ::Flux.Dense`: A dense layer that maps from the output of the `decoder` to the standard deviation of the latent space.

**Description**

`JointGaussianDecoder` is tailored for VAE architectures where the same decoder network is used initially, and then splits into two separate paths for determining both the mean and standard deviation of the latent space.

`AutoEncoderToolkit.JointGaussianDecoder` (Method)

`(decoder::JointGaussianDecoder)(z::AbstractArray)`

Maps the given latent representation `z` through the `JointGaussianDecoder` network to produce both the mean (`µ`) and standard deviation (`σ`).

**Arguments**

- `z::AbstractArray`: The latent space representation to be decoded. The last dimension indexes the individual latent space representations to be decoded.

**Returns**

- A NamedTuple `(µ=µ, σ=σ,)` where:
    - `µ::AbstractArray`: The mean representation obtained from the decoder.
    - `σ::AbstractArray`: The standard deviation representation obtained from the decoder.

**Description**

This function processes the latent space representation `z` using the primary neural network of the `JointGaussianDecoder` struct. It then separately maps the output of this network to the mean and standard deviation using the `µ` and `σ` dense layers, respectively.

**Example**

```
decoder = JointGaussianDecoder(...)
z = ... # some latent space representation
output = decoder(z)
```

**Note**

Ensure that the latent space representation z matches the expected input dimensionality for the JointGaussianDecoder.

### `JointGaussianLogDecoder`

`AutoEncoderToolkit.JointGaussianLogDecoder` (Type)

`JointGaussianLogDecoder{D<:Flux.Chain,L<:Flux.Dense} <: AbstractGaussianLogDecoder`

An extended decoder structure for VAEs that incorporates separate layers for mapping from the latent space to both its mean (`µ`) and log standard deviation (`logσ`).

**Fields**

- `decoder::Flux.Chain`: The primary neural network used to process the latent space before determining its mean and log standard deviation.
- `µ::Flux.Dense`: A dense layer that maps from the output of the `decoder` to the mean of the latent space.
- `logσ::Flux.Dense`: A dense layer that maps from the output of the `decoder` to the log standard deviation of the latent space.

**Description**

`JointGaussianLogDecoder` is tailored for VAE architectures where the same decoder network is used initially, and then splits into two separate paths for determining both the mean and log standard deviation of the latent space.

`AutoEncoderToolkit.JointGaussianLogDecoder` (Method)

`(decoder::JointGaussianLogDecoder)(z::AbstractArray)`

Maps the given latent representation `z` through the `JointGaussianLogDecoder` network to produce both the mean (`µ`) and log standard deviation (`logσ`).

**Arguments**

- `z::AbstractArray`: The latent space representation to be decoded. The last dimension indexes the individual latent space representations.

**Returns**

- A NamedTuple `(µ=µ, logσ=logσ,)` where:
    - `µ::Array`: The mean representation obtained from the decoder.
    - `logσ::Array`: The log standard deviation representation obtained from the decoder.

**Description**

This function processes the latent space representation `z` using the primary neural network of the `JointGaussianLogDecoder` struct. It then separately maps the output of this network to the mean and log standard deviation using the `µ` and `logσ` dense layers, respectively.

**Example**

```
decoder = JointGaussianLogDecoder(...)
z = ... # some latent space representation
output = decoder(z)
```

**Note**

Ensure that the latent space representation z matches the expected input dimensionality for the JointGaussianLogDecoder.

### `SplitGaussianDecoder`

`AutoEncoderToolkit.SplitGaussianDecoder` (Type)

`SplitGaussianDecoder{D<:Flux.Chain} <: AbstractGaussianLinearDecoder`

A specialized decoder structure for VAEs that uses distinct neural networks for determining the mean (`µ`) and standard deviation (`σ`) of the latent space.

**Fields**

- `decoder_µ::Flux.Chain`: A neural network dedicated to processing the latent space and mapping it to its mean.
- `decoder_σ::Flux.Chain`: A neural network dedicated to processing the latent space and mapping it to its standard deviation.

**Description**

`SplitGaussianDecoder` is designed for VAE architectures where separate decoder networks are preferred for computing the mean and standard deviation, ensuring that each has its own distinct set of parameters and transformation logic.

`AutoEncoderToolkit.SplitGaussianDecoder` (Method)

`(decoder::SplitGaussianDecoder)(z::AbstractArray)`

Maps the given latent representation `z` through the separate networks of the `SplitGaussianDecoder` to produce both the mean (`µ`) and standard deviation (`σ`).

**Arguments**

- `z::AbstractArray`: The latent space representation to be decoded. The last dimension indexes the individual latent space representations to be decoded.

**Returns**

- A NamedTuple `(µ=µ, σ=σ,)` where:
    - `µ::AbstractArray`: The mean representation obtained using the dedicated `decoder_µ` network.
    - `σ::AbstractArray`: The standard deviation representation obtained using the dedicated `decoder_σ` network.

**Description**

This function processes the latent space representation `z` through two distinct neural networks within the `SplitGaussianDecoder` struct. The `decoder_µ` network is used to produce the mean representation, while the `decoder_σ` network is used for the standard deviation.

**Example**

```
decoder = SplitGaussianDecoder(...)
z = ... # some latent space representation
output = decoder(z)
```

**Note**

Ensure that the latent space representation z matches the expected input dimensionality for both networks in the SplitGaussianDecoder.
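
The difference from the joint decoders is that nothing is shared between the two outputs. A dependency-free sketch of that layout in plain Julia follows. All names (`dense`, `make_net`, `net_µ`, `net_σ`, `split_forward`) are hypothetical, the weights are random, and `softplus` is only one possible choice for keeping `σ` non-negative; the package's own implementation may differ.

```julia
using Random

# Hypothetical helpers, not part of AutoEncoderToolkit.jl.
relu(x) = max(zero(x), x)
softplus(x) = log1p(exp(x))
dense(W, b, act) = z -> act.(W * z .+ b)

rng = MersenneTwister(0)
n_latent, n_hidden, n_output = 2, 4, 6

# Build an independent two-layer network; each call draws fresh weights.
function make_net(out_act)
    l1 = dense(randn(rng, n_hidden, n_latent), zeros(n_hidden), relu)
    l2 = dense(randn(rng, n_output, n_hidden), zeros(n_output), out_act)
    return z -> l2(l1(z))
end

net_µ = make_net(identity)   # dedicated network for the mean
net_σ = make_net(softplus)   # dedicated network for the standard deviation

# Both networks read the same z but share no parameters.
split_forward(z) = (µ = net_µ(z), σ = net_σ(z))

z = randn(rng, n_latent, 5)  # five latent samples stored as columns
out = split_forward(z)
```

Because both networks start from `z`, each must accept an input of size `n_latent`, which is the constraint the note above refers to.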

### `SplitGaussianLogDecoder`

`AutoEncoderToolkit.SplitGaussianLogDecoder` (Type)

`SplitGaussianLogDecoder{D<:Flux.Chain} <: AbstractGaussianLogDecoder`

A specialized decoder structure for VAEs that uses distinct neural networks for determining the mean (`µ`) and log standard deviation (`logσ`) of the latent space.

**Fields**

- `decoder_µ::Flux.Chain`: A neural network dedicated to processing the latent space and mapping it to its mean.
- `decoder_logσ::Flux.Chain`: A neural network dedicated to processing the latent space and mapping it to its log standard deviation.

**Description**

`SplitGaussianLogDecoder` is designed for VAE architectures where separate decoder networks are preferred for computing the mean and log standard deviation, ensuring that each has its own distinct set of parameters and transformation logic.

`AutoEncoderToolkit.SplitGaussianLogDecoder` (Method)

`(decoder::SplitGaussianLogDecoder)(z::AbstractArray)`

Maps the given latent representation `z` through the separate networks of the `SplitGaussianLogDecoder` to produce both the mean (`µ`) and log standard deviation (`logσ`).

**Arguments**

- `z::AbstractArray`: The latent space representation to be decoded. The last dimension indexes the individual latent space representations to be decoded.

**Returns**

- A NamedTuple `(µ=µ, logσ=logσ,)` where:
    - `µ::AbstractArray`: The mean representation obtained using the dedicated `decoder_µ` network.
    - `logσ::AbstractArray`: The log standard deviation representation obtained using the dedicated `decoder_logσ` network.

**Description**

This function processes the latent space representation `z` through two distinct neural networks within the `SplitGaussianLogDecoder` struct. The `decoder_µ` network is used to produce the mean representation, while the `decoder_logσ` network is used for the log standard deviation.

**Example**

```
decoder = SplitGaussianLogDecoder(...)
z = ... # some latent space representation
output = decoder(z)
```

**Note**

Ensure that the latent space representation z matches the expected input dimensionality for both networks in the SplitGaussianLogDecoder.

## Default initializations

The package provides a set of functions to initialize encoder and decoder architectures. Although it gives the user less flexibility, it can be useful for quick prototyping.

### Encoder initializations

`AutoEncoderToolkit.Encoder` (Method)

```
Encoder(n_input, n_latent, latent_activation, encoder_neurons,
        encoder_activation; init=Flux.glorot_uniform)
```

Construct and initialize an `Encoder` struct that defines an encoder network for a deterministic autoencoder.

**Arguments**

- `n_input::Int`: The dimensionality of the input data.
- `n_latent::Int`: The dimensionality of the latent space.
- `encoder_neurons::Vector{<:Int}`: A vector specifying the number of neurons in each layer of the encoder network.
- `encoder_activation::Vector{<:Function}`: Activation functions corresponding to each layer in `encoder_neurons`.
- `latent_activation::Function`: Activation function for the latent space layer.

**Optional Keyword Arguments**

- `init::Function=Flux.glorot_uniform`: The initialization function used for the neural network weights.

**Returns**

- An `Encoder` struct initialized based on the provided arguments.

**Examples**

```
encoder = Encoder(784, 20, tanh, [400], [relu])
```

**Notes**

The length of `encoder_neurons` should match the length of `encoder_activation`, ensuring that each layer in the encoder has a corresponding activation function.
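
The relationship between these arguments can be pictured as a chain of layer sizes. The plain-Julia sketch below only computes the `in => out` pairs that such a constructor would plausibly use: `n_input` feeds the first hidden layer, each entry of `encoder_neurons` feeds the next, and the last hidden layer maps to `n_latent`. The helper `layer_dims` is hypothetical, and the real constructor may differ in detail.

```julia
# Hypothetical helper: list the (in => out) size of every dense layer.
function layer_dims(n_input::Int, n_latent::Int, neurons::Vector{Int},
                    activations::Vector{<:Function})
    # One activation per hidden layer, as the note above requires.
    length(neurons) == length(activations) ||
        error("neurons and activations must have the same length")
    sizes = [n_input; neurons; n_latent]
    return [sizes[i] => sizes[i + 1] for i in 1:length(sizes) - 1]
end

dims = layer_dims(784, 20, [400, 100], [tanh, tanh])
```

With two hidden layers the chain has three dense layers: two that use the hidden activations and a final one that would use the latent activation.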

`AutoEncoderToolkit.JointGaussianLogEncoder` (Method)

```
JointGaussianLogEncoder(n_input, n_latent, encoder_neurons, encoder_activation,
                        latent_activation; init=Flux.glorot_uniform)
```

Construct and initialize a `JointGaussianLogEncoder` struct that defines an encoder network for a variational autoencoder.

**Arguments**

- `n_input::Int`: The dimensionality of the input data.
- `n_latent::Int`: The dimensionality of the latent space.
- `encoder_neurons::Vector{<:Int}`: A vector specifying the number of neurons in each layer of the encoder network.
- `encoder_activation::Vector{<:Function}`: Activation functions corresponding to each layer in `encoder_neurons`.
- `latent_activation::Function`: Activation function for the latent space layers (both µ and logσ).

**Optional Keyword Arguments**

- `init::Function=Flux.glorot_uniform`: The initialization function used for the neural network weights.

**Returns**

- A `JointGaussianLogEncoder` struct initialized based on the provided arguments.

**Examples**

`encoder = JointGaussianLogEncoder(784, 20, [400], [relu], tanh)`

**Notes**

The length of `encoder_neurons` should match the length of `encoder_activation`, ensuring that each layer in the encoder has a corresponding activation function.

`AutoEncoderToolkit.JointGaussianLogEncoder` (Method)

```
JointGaussianLogEncoder(n_input, n_latent, encoder_neurons, encoder_activation,
                        latent_activation; init=Flux.glorot_uniform)
```

Construct and initialize a `JointGaussianLogEncoder` struct that defines an encoder network for a variational autoencoder. This method takes a vector of latent activation functions rather than a single function.

**Arguments**

- `n_input::Int`: The dimensionality of the input data.
- `n_latent::Int`: The dimensionality of the latent space.
- `encoder_neurons::Vector{<:Int}`: A vector specifying the number of neurons in each layer of the encoder network.
- `encoder_activation::Vector{<:Function}`: Activation functions corresponding to each layer in `encoder_neurons`.
- `latent_activation::Vector{<:Function}`: Activation functions for the latent space layers (both µ and logσ).

**Optional Keyword Arguments**

- `init::Function=Flux.glorot_uniform`: The initialization function used for the neural network weights.

**Returns**

- A `JointGaussianLogEncoder` struct initialized based on the provided arguments.

**Examples**

`encoder = JointGaussianLogEncoder(784, 20, [400], [relu], [tanh, identity])`

**Notes**

The length of `encoder_neurons` should match the length of `encoder_activation`, ensuring that each layer in the encoder has a corresponding activation function.

`AutoEncoderToolkit.JointGaussianEncoder` (Method)

```
JointGaussianEncoder(n_input, n_latent, encoder_neurons, encoder_activation,
                     latent_activation; init=Flux.glorot_uniform)
```

Construct and initialize a `JointGaussianEncoder` struct that defines an encoder network for a variational autoencoder.

**Arguments**

- `n_input::Int`: The dimensionality of the input data.
- `n_latent::Int`: The dimensionality of the latent space.
- `encoder_neurons::Vector{<:Int}`: A vector specifying the number of neurons in each layer of the encoder network.
- `encoder_activation::Vector{<:Function}`: Activation functions corresponding to each layer in `encoder_neurons`.
- `latent_activation::Vector{<:Function}`: Activation functions for the latent space layers. This vector must contain the activations for both µ and σ.

**Optional Keyword Arguments**

- `init::Function=Flux.glorot_uniform`: The initialization function used for the neural network weights.

**Returns**

- A `JointGaussianEncoder` struct initialized based on the provided arguments.

**Examples**

`encoder = JointGaussianEncoder(784, 20, [400], [relu], [tanh, softplus])`

**Notes**

The length of `encoder_neurons` should match the length of `encoder_activation`, ensuring that each layer in the encoder has a corresponding activation function.

### Decoder initializations

`AutoEncoderToolkit.Decoder` (Method)

```
Decoder(n_input, n_latent, decoder_neurons, decoder_activation,
        output_activation; init=Flux.glorot_uniform)
```

Construct and initialize a `Decoder` struct that defines a decoder network for a deterministic autoencoder.

**Arguments**

- `n_input::Int`: The dimensionality of the output data (which typically matches the input data dimensionality of the autoencoder).
- `n_latent::Int`: The dimensionality of the latent space.
- `decoder_neurons::Vector{<:Int}`: A vector specifying the number of neurons in each layer of the decoder network.
- `decoder_activation::Vector{<:Function}`: Activation functions corresponding to each layer in `decoder_neurons`.
- `output_activation::Function`: Activation function for the final output layer.

**Optional Keyword Arguments**

- `init::Function=Flux.glorot_uniform`: The initialization function used for the neural network weights.

**Returns**

- A `Decoder` struct initialized based on the provided arguments.

**Examples**

`decoder = Decoder(784, 20, [400], [relu], sigmoid)`

**Notes**

The length of `decoder_neurons` should match the length of `decoder_activation`, ensuring that each layer in the decoder has a corresponding activation function.

`AutoEncoderToolkit.SimpleGaussianDecoder` (Method)

```
SimpleGaussianDecoder(
    n_input, n_latent, decoder_neurons,
    decoder_activation, output_activation;
    init=Flux.glorot_uniform
)
```

Constructs and initializes a `SimpleGaussianDecoder` object designed for variational autoencoders (VAEs). This function sets up a straightforward decoder network that maps from a latent space to an output space.

**Arguments**

- `n_input::Int`: Dimensionality of the output data (or the data to be reconstructed).
- `n_latent::Int`: Dimensionality of the latent space.
- `decoder_neurons::Vector{<:Int}`: Vector of layer sizes for the decoder network, not including the input latent layer and the final output layer.
- `decoder_activation::Vector{<:Function}`: Activation functions for each decoder layer, not including the final output layer.
- `output_activation::Function`: Activation function for the final output layer.

**Optional Keyword Arguments**

- `init::Function=Flux.glorot_uniform`: Initialization function for the network parameters.

**Returns**

A `SimpleGaussianDecoder` object with the specified architecture and initialized weights.

**Description**

This function constructs a `SimpleGaussianDecoder` object, setting up its decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space, goes through a sequence of middle layers if specified, and finally maps to the output space.

The function ensures that there are appropriate activation functions provided for each layer in `decoder_neurons` and checks for potential mismatches in length.

**Example**

```
n_input = 28*28
n_latent = 64
decoder_neurons = [128, 256]
decoder_activation = [relu, relu]
output_activation = sigmoid
decoder = SimpleGaussianDecoder(
n_input, n_latent, decoder_neurons, decoder_activation, output_activation
)
```

**Note**

Ensure that the lengths of `decoder_neurons` and `decoder_activation` match, excluding the output layer.
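
For the values used in the example above, the layer sizes implied by the description run from the latent space, through the middle layers, to the output space. The short plain-Julia sketch below spells that chain out. The helper `decoder_layer_dims` is hypothetical and only mirrors the textual description; it does not call the package.

```julia
# Hypothetical helper: sizes of a decoder's dense layers, latent to output.
function decoder_layer_dims(n_input::Int, n_latent::Int, decoder_neurons::Vector{Int})
    sizes = [n_latent; decoder_neurons; n_input]
    return [sizes[i] => sizes[i + 1] for i in 1:length(sizes) - 1]
end

dims = decoder_layer_dims(28 * 28, 64, [128, 256])
```

The first two layers would take the entries of `decoder_activation`, and the last one, mapping to `n_input`, would take `output_activation`.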

`AutoEncoderToolkit.JointGaussianLogDecoder` (Method)

```
JointGaussianLogDecoder(n_input, n_latent, decoder_neurons, decoder_activation,
                        output_activation; init=Flux.glorot_uniform)
```

Constructs and initializes a `JointGaussianLogDecoder` object for variational autoencoders (VAEs). This function sets up a decoder network that first processes the latent space and then maps it separately to both its mean (`µ`) and log standard deviation (`logσ`).

**Arguments**

- `n_input::Int`: Dimensionality of the output data (or the data to be reconstructed).
- `n_latent::Int`: Dimensionality of the latent space.
- `decoder_neurons::Vector{<:Int}`: Vector of layer sizes for the primary decoder network, not including the input latent layer.
- `decoder_activation::Vector{<:Function}`: Activation functions for each primary decoder layer.
- `output_activation::Function`: Activation function for the mean (`µ`) and log standard deviation (`logσ`) layers.

**Optional Keyword Arguments**

- `init::Function=Flux.glorot_uniform`: Initialization function for the network parameters.

**Returns**

A `JointGaussianLogDecoder` object with the specified architecture and initialized weights.

**Description**

This function constructs a `JointGaussianLogDecoder` object, setting up its primary decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space and goes through a sequence of middle layers if specified. After processing the latent space through the primary decoder, it then maps separately to both its mean (`µ`) and log standard deviation (`logσ`).

**Example**

```
n_input = 28*28
n_latent = 64
decoder_neurons = [128, 256]
decoder_activation = [relu, relu]
output_activation = tanh
decoder = JointGaussianLogDecoder(
n_input, n_latent, decoder_neurons, decoder_activation, output_activation
)
```

**Note**

Ensure that the lengths of `decoder_neurons` and `decoder_activation` match.

`AutoEncoderToolkit.JointGaussianLogDecoder`

— Method```
JointGaussianLogDecoder(n_input, n_latent, decoder_neurons, decoder_activation,
output_activation; init=Flux.glorot_uniform)
```

Constructs and initializes a `JointGaussianLogDecoder`

object for variational autoencoders (VAEs). This function sets up a decoder network that first processes the latent space and then maps it separately to both its mean (`µ`

) and log standard deviation (`logσ`

).

**Arguments**

`n_input::Int`

: Dimensionality of the output data (or the data to be reconstructed).`n_latent::Int`

: Dimensionality of the latent space.`decoder_neurons::Vector{<:Int}`

: Vector of layer sizes for the primary decoder network, not including the input latent layer.`decoder_activation::Vector{<:Function}`

: Activation functions for each primary decoder layer.`output_activation::Vector{<:Function}`

: Activation functions for the mean (`µ`

) and log standard deviation (`logσ`

) layers.

**Optional Keyword Arguments**

`init::Function=Flux.glorot_uniform`

: Initialization function for the network parameters.

**Returns**

A `JointGaussianLogDecoder`

object with the specified architecture and initialized weights.

**Description**

This function constructs a `JointGaussianLogDecoder`

object, setting up its primary decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space and goes through a sequence of middle layers if specified. After processing the latent space through the primary decoder, it then maps separately to both its mean (`µ`

) and log standard deviation (`logσ`

).

**Example**

```
n_input = 28*28
n_latent = 64
decoder_neurons = [128, 256]
decoder_activation = [relu, relu]
output_activation = [tanh, identity]
decoder = JointGaussianLogDecoder(
    n_input, n_latent, decoder_neurons, decoder_activation, output_activation
)
```

**Note**

Ensure that the lengths of `decoder_neurons` and `decoder_activation` match.
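The layout described above, a shared trunk followed by separate `µ` and `logσ` heads, can be sketched directly with Flux. This is only an illustration of the architecture, under the assumption that each head is a single `Dense` layer; the names `trunk`, `µ_head`, `logσ_head`, and `make_layer` are hypothetical, and this is not the package's constructor.

```julia
using Flux

n_input, n_latent = 28 * 28, 64
decoder_neurons = [128, 256]
decoder_activation = [relu, relu]
output_activation = [tanh, identity]

# Shared trunk: latent space -> hidden layers (hypothetical names)
layer_sizes = [n_latent; decoder_neurons]
make_layer(i) = Dense(layer_sizes[i] => layer_sizes[i+1], decoder_activation[i])
trunk = Chain(map(make_layer, eachindex(decoder_neurons))...)

# Two separate heads mapping the trunk output to the data dimensionality
µ_head = Dense(decoder_neurons[end] => n_input, output_activation[1])
logσ_head = Dense(decoder_neurons[end] => n_input, output_activation[2])

# Forward pass for a single latent vector
z = randn(Float32, n_latent)
h = trunk(z)
µ, logσ = µ_head(h), logσ_head(h)
```

Both outputs have length `n_input`; only the activation of each head differs.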

`AutoEncoderToolkit.JointGaussianDecoder`

— Method```
JointGaussianDecoder(n_input, n_latent, decoder_neurons, decoder_activation,
output_activation; init=Flux.glorot_uniform)
```

Constructs and initializes a `JointGaussianDecoder`

object for variational autoencoders (VAEs). This function sets up a decoder network that first processes the latent space and then maps it separately to both its mean (`µ`

) and standard deviation (`σ`

).

**Arguments**

`n_input::Int`

: Dimensionality of the output data (or the data to be reconstructed).`n_latent::Int`

: Dimensionality of the latent space.`decoder_neurons::Vector{<:Int}`

: Vector of layer sizes for the primary decoder network, not including the input latent layer.`decoder_activation::Vector{<:Function}`

: Activation functions for each primary decoder layer.`output_activation::Function`

: Activation function for the mean (`µ`

) and standard deviation (`σ`

) layers.

**Optional Keyword Arguments**

`init::Function=Flux.glorot_uniform`

: Initialization function for the network parameters.

**Returns**

A `JointGaussianDecoder`

object with the specified architecture and initialized weights.

**Description**

This function constructs a `JointGaussianDecoder`

object, setting up its primary decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space and goes through a sequence of middle layers if specified. After processing the latent space through the primary decoder, it then maps separately to both its mean (`µ`

) and standard deviation (`σ`

).

**Example**

```
n_input = 28*28
n_latent = 64
decoder_neurons = [128, 256]
decoder_activation = [relu, relu]
output_activation = tanh
decoder = JointGaussianDecoder(
n_input, n_latent, decoder_neurons, decoder_activation, output_activation
)
```

**Note**

Ensure that the lengths of `decoder_neurons` and `decoder_activation` match.

`AutoEncoderToolkit.JointGaussianDecoder`

— Method```
JointGaussianDecoder(n_input, n_latent, decoder_neurons, decoder_activation,
output_activation; init=Flux.glorot_uniform)
```

Constructs and initializes a `JointGaussianDecoder`

object for variational autoencoders (VAEs). This function sets up a decoder network that first processes the latent space and then maps it separately to both its mean (`µ`

) and standard deviation (`σ`

).

**Arguments**

`n_input::Int`

: Dimensionality of the output data (or the data to be reconstructed).`n_latent::Int`

: Dimensionality of the latent space.`decoder_neurons::Vector{<:Int}`

: Vector of layer sizes for the primary decoder network, not including the input latent layer.`decoder_activation::Vector{<:Function}`

: Activation functions for each primary decoder layer.`output_activation::Vector{<:Function}`

: Activation functions for the mean (`µ`

) and standard deviation (`σ`

) layers.

**Optional Keyword Arguments**

`init::Function=Flux.glorot_uniform`

: Initialization function for the network parameters.

**Returns**

A `JointGaussianDecoder`

object with the specified architecture and initialized weights.

**Description**

This function constructs a `JointGaussianDecoder`

object, setting up its primary decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space and goes through a sequence of middle layers if specified. After processing the latent space through the primary decoder, it then maps separately to both its mean (`µ`

) and standard deviation (`σ`

).

**Example**

```
n_input = 28*28
n_latent = 64
decoder_neurons = [128, 256]
decoder_activation = [relu, relu]
latent_activation = [tanh, softplus]
decoder = JointGaussianDecoder(
n_input, n_latent, decoder_neurons, decoder_activation, latent_activation
)
```

**Note**

Ensure that the lengths of `decoder_neurons` and `decoder_activation` match.

`AutoEncoderToolkit.SplitGaussianLogDecoder`

— Method```
SplitGaussianLogDecoder(n_input, n_latent, µ_neurons, µ_activation, logσ_neurons,
logσ_activation; init=Flux.glorot_uniform)
```

Constructs and initializes a `SplitGaussianLogDecoder`

object for variational autoencoders (VAEs). This function sets up two distinct decoder networks, one dedicated for determining the mean (`µ`

) and the other for the log standard deviation (`logσ`

) of the latent space.

**Arguments**

`n_input::Int`

: Dimensionality of the output data (or the data to be reconstructed).`n_latent::Int`

: Dimensionality of the latent space.`µ_neurons::Vector{<:Int}`

: Vector of layer sizes for the`µ`

decoder network, not including the input latent layer.`µ_activation::Vector{<:Function}`

: Activation functions for each`µ`

decoder layer.`logσ_neurons::Vector{<:Int}`

: Vector of layer sizes for the`logσ`

decoder network, not including the input latent layer.`logσ_activation::Vector{<:Function}`

: Activation functions for each`logσ`

decoder layer.

**Optional Keyword Arguments**

`init::Function=Flux.glorot_uniform`

: Initialization function for the network parameters.

**Returns**

A `SplitGaussianLogDecoder`

object with two distinct networks initialized with the specified architectures and weights.

**Description**

This function constructs a `SplitGaussianLogDecoder`

object, setting up two separate decoder networks based on the provided specifications. The first network, dedicated to determining the mean (`µ`

), and the second for the log standard deviation (`logσ`

), both begin with a dense layer mapping from the latent space and go through a sequence of middle layers if specified.

**Example**

```
n_input = 28*28
n_latent = 64
µ_neurons = [128, 256]
µ_activation = [relu, relu]
logσ_neurons = [128, 256]
logσ_activation = [relu, relu]
decoder = SplitGaussianLogDecoder(
    n_input, n_latent, µ_neurons, µ_activation, logσ_neurons, logσ_activation
)
```

**Notes**

- Ensure that the lengths of `µ_neurons` and `µ_activation` match, and likewise those of `logσ_neurons` and `logσ_activation`.
- If `µ_neurons[end]` or `logσ_neurons[end]` does not match `n_input`, the function automatically changes this number to match the right dimensionality.

`AutoEncoderToolkit.SplitGaussianDecoder`

— Method```
SplitGaussianDecoder(n_input, n_latent, µ_neurons, µ_activation, σ_neurons,
σ_activation; init=Flux.glorot_uniform)
```

Constructs and initializes a `SplitGaussianDecoder`

object for variational autoencoders (VAEs). This function sets up two distinct decoder networks, one dedicated for determining the mean (`µ`

) and the other for the standard deviation (`σ`

) of the latent space.

**Arguments**

`n_input::Int`

: Dimensionality of the output data (or the data to be reconstructed).`n_latent::Int`

: Dimensionality of the latent space.`µ_neurons::Vector{<:Int}`

: Vector of layer sizes for the`µ`

decoder network, not including the input latent layer.`µ_activation::Vector{<:Function}`

: Activation functions for each`µ`

decoder layer.`σ_neurons::Vector{<:Int}`

: Vector of layer sizes for the`σ`

decoder network, not including the input latent layer.`σ_activation::Vector{<:Function}`

: Activation functions for each`σ`

decoder layer.

**Optional Keyword Arguments**

`init::Function=Flux.glorot_uniform`

: Initialization function for the network parameters.

**Returns**

A `SplitGaussianDecoder`

object with two distinct networks initialized with the specified architectures and weights.

**Description**

This function constructs a `SplitGaussianDecoder`

object, setting up two separate decoder networks based on the provided specifications. The first network, dedicated to determining the mean (`µ`

), and the second for the standard deviation (`σ`

), both begin with a dense layer mapping from the latent space and go through a sequence of middle layers if specified.

**Example**

```
n_input = 28*28
n_latent = 64
µ_neurons = [128, 256]
µ_activation = [relu, relu]
σ_neurons = [128, 256]
# The last activation is softplus so that σ stays positive
σ_activation = [relu, softplus]
decoder = SplitGaussianDecoder(
    n_input, n_latent, µ_neurons, µ_activation, σ_neurons, σ_activation
)
```

**Notes**

- Ensure that the lengths of `µ_neurons` and `µ_activation` match, and likewise those of `σ_neurons` and `σ_activation`.
- If `µ_neurons[end]` or `σ_neurons[end]` does not match `n_input`, the function automatically changes this number to match the right dimensionality.
- Ensure that the last layer of the `σ` network maps to a **positive** value. Activation functions such as `softplus` are needed to guarantee the positivity of the standard deviation.

`AutoEncoderToolkit.BernoulliDecoder`

— Method```
BernoulliDecoder(n_input, n_latent, decoder_neurons, decoder_activation,
output_activation; init=Flux.glorot_uniform)
```

Constructs and initializes a `BernoulliDecoder`

object designed for variational autoencoders (VAEs). This function sets up a decoder network that maps from a latent space to an output space.

**Arguments**

`n_input::Int`

: Dimensionality of the output data (or the data to be reconstructed).`n_latent::Int`

: Dimensionality of the latent space.`decoder_neurons::Vector{<:Int}`

: Vector of layer sizes for the decoder network, not including the input latent layer and the final output layer.`decoder_activation::Vector{<:Function}`

: Activation functions for each decoder layer, not including the final output layer.`output_activation::Function`

: Activation function for the final output layer.

**Optional Keyword Arguments**

`init::Function=Flux.glorot_uniform`

: Initialization function for the network parameters.

**Returns**

A `BernoulliDecoder`

object with the specified architecture and initialized weights.

**Description**

This function constructs a `BernoulliDecoder`

object, setting up its decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space, goes through a sequence of middle layers if specified, and finally maps to the output space.

The function ensures that there are appropriate activation functions provided for each layer in the `decoder_neurons`

and checks for potential mismatches in length.

**Example**

```
n_input = 28*28
n_latent = 64
decoder_neurons = [128, 256]
decoder_activation = [relu, relu]
output_activation = sigmoid
decoder = BernoulliDecoder(
n_input,
n_latent,
decoder_neurons,
decoder_activation,
output_activation
)
```

**Note**

Ensure that the lengths of `decoder_neurons` and `decoder_activation` match, excluding the output layer. Also, the output activation function should return values between 0 and 1, as the decoder models the output data as a Bernoulli distribution.

`AutoEncoderToolkit.CategoricalDecoder`

— Method```
CategoricalDecoder(
size_input, n_latent, decoder_neurons, decoder_activation,
output_activation; init=Flux.glorot_uniform
)
```

Constructs and initializes a `CategoricalDecoder`

object designed for variational autoencoders (VAEs). This function sets up a decoder network that maps from a latent space to an output space.

**Arguments**

`size_input::AbstractVector{<:Int}`

: Dimensionality of the output data (or the data to be reconstructed) in the form of a vector where each element represents the size of a dimension.`n_latent::Int`

: Dimensionality of the latent space.`decoder_neurons::Vector{<:Int}`

: Vector of layer sizes for the decoder network, not including the input latent layer and the final output layer.`decoder_activation::Vector{<:Function}`

: Activation functions for each decoder layer, not including the final output layer.`output_activation::Function`

: Activation function for the final output layer.

**Optional Keyword Arguments**

`init::Function=Flux.glorot_uniform`

: Initialization function for the network parameters.

**Returns**

A `CategoricalDecoder`

object with the specified architecture and initialized weights.

**Description**

This function constructs a `CategoricalDecoder`

object, setting up its decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space, goes through a sequence of middle layers if specified, and finally maps to the output space.

The function ensures that there are appropriate activation functions provided for each layer in the `decoder_neurons`

and checks for potential mismatches in length.

The output layer uses the identity function as its activation function, and the output is reshaped to match the dimensions specified in `size_input`

. The `output_activation`

function is then applied over the first dimension of the reshaped output.
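The reshape-then-normalize step described above can be illustrated with plain Flux operations. The sizes below and the choice of `softmax` as `output_activation` are assumptions made for the example, and the random matrix `raw` merely stands in for the output of the last dense layer; this is a sketch of the idea, not the package's implementation.

```julia
using Flux

size_input = [4, 10]   # illustrative: 4 categories at each of 10 positions
n_batch = 3

# Stand-in for the raw (identity-activated) output of the last dense layer
raw = randn(Float32, prod(size_input), n_batch)

# Reshape to the data dimensions, keeping the batch as the trailing dimension
reshaped = reshape(raw, size_input..., n_batch)

# Normalize over the first dimension so each slice is a probability vector
p = softmax(reshaped; dims=1)
```

After this step, every slice `p[:, j, k]` is non-negative and sums to one, which is what allows it to be read as the parameters of a categorical distribution.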

**Note**

Ensure that the lengths of `decoder_neurons` and `decoder_activation` match, excluding the output layer. Also, the output activation function should return values that can be interpreted as probabilities, as the decoder models the output data as a categorical distribution.

`AutoEncoderToolkit.CategoricalDecoder`

— Method```
CategoricalDecoder(
n_input, n_latent, decoder_neurons, decoder_activation,
output_activation; init=Flux.glorot_uniform
)
```

Constructs and initializes a `CategoricalDecoder`

object designed for variational autoencoders (VAEs). This function sets up a decoder network that maps from a latent space to an output space.

**Arguments**

`n_input::Int`

: Dimensionality of the output data (or the data to be reconstructed).`n_latent::Int`

: Dimensionality of the latent space.`decoder_neurons::Vector{<:Int}`

: Vector of layer sizes for the decoder network, not including the input latent layer and the final output layer.`decoder_activation::Vector{<:Function}`

: Activation functions for each decoder layer, not including the final output layer.`output_activation::Function`

: Activation function for the final output layer.

**Optional Keyword Arguments**

`init::Function=Flux.glorot_uniform`

: Initialization function for the network parameters.

**Returns**

A `CategoricalDecoder`

object with the specified architecture and initialized weights.

**Description**

This function constructs a `CategoricalDecoder`

object, setting up its decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space, goes through a sequence of middle layers if specified, and finally maps to the output space.

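The function ensures that there are appropriate activation functions provided for each layer in the `decoder_neurons`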

and checks for potential mismatches in length.

**Note**

Ensure that the lengths of `decoder_neurons` and `decoder_activation` match, excluding the output layer. Also, the output activation function should return values that can be interpreted as probabilities, as the decoder models the output data as a categorical distribution.

## Probabilistic functions

Given the probability-centered design of `AutoEncoderToolkit.jl`

, each variational encoder and decoder has an associated probabilistic function used when computing the evidence lower bound (ELBO). The following functions are available:

`AutoEncoderToolkit.encoder_logposterior`

— Function```
encoder_logposterior(
z::AbstractVector,
encoder::AbstractGaussianLogEncoder,
encoder_output::NamedTuple
)
```

Computes the log-posterior of the latent variable `z`

given the encoder output under a Gaussian distribution with mean and standard deviation given by the encoder.

**Arguments**

`z::AbstractVector`

: The latent variable for which the log-posterior is to be computed.`encoder::AbstractGaussianLogEncoder`

: The encoder of the VAE, which is not used in the computation of the log-posterior. This argument is only used to know which method to call.`encoder_output::NamedTuple`

: The output of the encoder, which includes the mean and log standard deviation of the Gaussian distribution.

**Returns**

`logposterior::T`

: The computed log-posterior of the latent variable`z`

given the encoder output.

**Description**

The function computes the log-posterior of the latent variable `z`

given the encoder output under a Gaussian distribution. The mean and log standard deviation of the Gaussian distribution are extracted from the `encoder_output`

. The standard deviation is then computed by exponentiating the log standard deviation. The log-posterior is computed using the formula for the log-posterior of a Gaussian distribution.

**Note**

Ensure the dimensions of `z`

match the dimensionality of the latent space produced by the `encoder`

.

```
encoder_logposterior(
z::AbstractMatrix,
encoder::AbstractGaussianLogEncoder,
encoder_output::NamedTuple
)
```

Computes the log-posterior of the latent variable `z`

given the encoder output under a Gaussian distribution with mean and standard deviation given by the encoder.

**Arguments**

`z::AbstractMatrix`

: The latent variable for which the log-posterior is to be computed. Each column of`z`

represents a different data point.`encoder::AbstractGaussianLogEncoder`

: The encoder of the VAE, which is not used in the computation of the log-posterior. This argument is only used to know which method to call.`encoder_output::NamedTuple`

: The output of the encoder, which includes the mean and log standard deviation of the Gaussian distribution.

**Returns**

`logposterior::Vector`

: The computed log-posterior of the latent variable`z`

given the encoder output. Each element of the vector corresponds to a different data point.

**Description**

The function computes the log-posterior of the latent variable `z`

given the encoder output under a Gaussian distribution. The mean and log standard deviation of the Gaussian distribution are extracted from the `encoder_output`

. The standard deviation is then computed by exponentiating the log standard deviation. The log-posterior is computed using the formula for the log-posterior of a Gaussian distribution.

**Note**

Ensure the dimensions of `z`

match the dimensionality of the latent space produced by the `encoder`

.

```
encoder_logposterior(
z::AbstractVector,
encoder::AbstractGaussianLogEncoder,
encoder_output::NamedTuple,
index::Int
)
```

Computes the log-posterior of the latent variable `z`

for a single data point specified by `index`

given the encoder output under a Gaussian distribution with mean and standard deviation given by the encoder.

**Arguments**

`z::AbstractVector`

: The latent variable for which the log-posterior is to be computed.`encoder::AbstractGaussianLogEncoder`

: The encoder of the VAE, which is not used in the computation of the log-posterior. This argument is only used to know which method to call.`encoder_output::NamedTuple`

: The output of the encoder, which includes the mean and log standard deviation of the Gaussian distribution for multiple data points.`index::Int`

: The index of the data point for which the log-posterior is to be computed.

**Returns**

`logposterior::Float32`

: The computed log-posterior of the latent variable`z`

for the specified data point given the encoder output.

**Description**

The function computes the log-posterior of the latent variable `z`

for a single data point specified by `index`

given the encoder output under a Gaussian distribution. The mean and log standard deviation of the Gaussian distribution are extracted from the `encoder_output`

for the specified data point. The standard deviation is then computed by exponentiating the log standard deviation. The log-posterior is computed using the formula for the log-posterior of a Gaussian distribution.

**Note**

Ensure the dimensions of `z`

match the dimensionality of the latent space produced by the `encoder`

. Also, ensure that `index`

is a valid index for the data points in `encoder_output`

.
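The computation described for these methods can be sketched as follows. The function name `gaussian_logposterior` is hypothetical and the code is not taken from the package; it simply evaluates the log-density of a diagonal Gaussian from a mean `µ` and a log standard deviation `logσ`, once for a single vector and once column-wise for a batch.

```julia
# Hypothetical helper: log-density of a diagonal Gaussian evaluated at z
function gaussian_logposterior(
    z::AbstractVector, µ::AbstractVector, logσ::AbstractVector
)
    # Recover the standard deviation from its logarithm
    σ = exp.(logσ)
    return -0.5f0 * sum(abs2, (z .- µ) ./ σ) - sum(logσ) -
           0.5f0 * length(z) * log(2.0f0 * π)
end

# Batch version: each column is one data point, so one value per column
function gaussian_logposterior(
    z::AbstractMatrix, µ::AbstractMatrix, logσ::AbstractMatrix
)
    d = size(z, 1)
    quad = sum(abs2, (z .- µ) ./ exp.(logσ); dims=1)
    return vec(-0.5f0 .* quad .- sum(logσ; dims=1) .- 0.5f0 * d * log(2.0f0 * π))
end
```

Working with `logσ` directly avoids taking the logarithm of a quantity that was just exponentiated, which is the usual motivation for log-parameterized encoders.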

`AutoEncoderToolkit.encoder_kl`

— Function```
encoder_kl(
encoder::AbstractGaussianLogEncoder,
encoder_output::NamedTuple
)
```

Calculate the Kullback-Leibler (KL) divergence between the approximate posterior distribution and the prior distribution in a variational autoencoder with a Gaussian encoder.

The KL divergence for a Gaussian encoder with mean `encoder_µ`

and log standard deviation `encoder_logσ`

is computed against a standard Gaussian prior.

**Arguments**

`encoder::AbstractGaussianLogEncoder`

: Encoder network. This argument is not used in the computation of the KL divergence, but is included to allow for multiple encoder types to be used with the same function.`encoder_output::NamedTuple`

:`NamedTuple`

containing all the encoder outputs. It should have fields`μ`

and`logσ`

representing the mean and log standard deviation of the encoder's output.

**Returns**

`kl_div::Union{Number, Vector}`

: The KL divergence for the entire batch of data points. If`encoder_µ`

is a vector,`kl_div`

is a scalar. If`encoder_µ`

is a matrix,`kl_div`

is a vector where each element corresponds to the KL divergence for one data point (column) in the batch.

**Note**

- It is assumed that the mapping from data space to latent parameters (`encoder_µ` and `encoder_logσ`) has been performed prior to calling this function. The `encoder` argument is provided to indicate the type of encoder network used, but it is not used within the function itself.
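For reference, the textbook closed form of the KL divergence between a diagonal Gaussian and a standard Gaussian prior is written below in terms of the mean $\mu$ and log standard deviation $\log \sigma$. The symbol $d$, the latent dimensionality, is introduced here for illustration, and the expression is the standard result rather than a transcription of the package's code:

\[D_{KL}\left(\mathcal{N}(\mu, \sigma^2) \,\|\, \mathcal{N}(0, I)\right) = \frac{1}{2} \sum_{i=1}^{d} \left[\sigma_i^2 + \mu_i^2 - 1 - 2 \log \sigma_i\right].\]

The expression is zero exactly when $\mu_i = 0$ and $\sigma_i = 1$ for every $i$, that is, when the approximate posterior coincides with the prior.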

`AutoEncoderToolkit.spherical_logprior`

— Function`spherical_logprior(z::AbstractVector, σ::Real=1.0f0)`

Computes the log-prior of the latent variable `z`

under a spherical Gaussian distribution with zero mean and standard deviation `σ`

.

**Arguments**

`z::AbstractVector`

: The latent variable for which the log-prior is to be computed.`σ::Real=1.0f0`

: The standard deviation of the spherical Gaussian distribution. Defaults to`1.0f0`

.

**Returns**

`logprior::T`

: The computed log-prior of the latent variable`z`

.

**Description**

The function computes the log-prior of the latent variable `z`

under a spherical Gaussian distribution with zero mean and standard deviation `σ`

. The log-prior is computed using the formula for the log-prior of a Gaussian distribution.

**Note**

Ensure the dimension of `z`

matches the expected dimensionality of the latent space.

`spherical_logprior(z::AbstractMatrix, σ::Real=1.0f0)`

Computes the log-prior of the latent variable `z`

under a spherical Gaussian distribution with zero mean and standard deviation `σ`

.

**Arguments**

`z::AbstractMatrix`

: The latent variable for which the log-prior is to be computed. Each column of`z`

represents a different latent variable.`σ::Real=1.0f0`

: The standard deviation of the spherical Gaussian distribution. Defaults to`1.0f0`

.

**Returns**

`logprior::T`

: The computed log-prior(s) of the latent variable`z`

.

**Description**

The function computes the log-prior of the latent variable `z`

under a spherical Gaussian distribution with zero mean and standard deviation `σ`

. The log-prior is computed using the formula for the log-prior of a Gaussian distribution.

**Note**

Ensure the dimension of `z`

matches the expected dimensionality of the latent space.
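As an illustration of the formula referred to above, the log-density of a zero-mean spherical Gaussian can be written in a few lines. The name `spherical_gaussian_logprior` is hypothetical and the sketch is not the package's code; it assumes a single latent vector `z` and a scalar `σ`.

```julia
# Hypothetical helper: log-density of N(0, σ²I) evaluated at z
function spherical_gaussian_logprior(z::AbstractVector, σ::Real=1.0f0)
    d = length(z)
    return -0.5f0 * sum(abs2, z ./ σ) - d * log(σ) - 0.5f0 * d * log(2.0f0 * π)
end

spherical_gaussian_logprior(zeros(Float32, 3))
```

With `σ = 1` the middle term vanishes, so the value at the origin is just the normalization constant `-(d / 2) * log(2π)`.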

## Defining custom encoder and decoder types

We will omit all docstrings in the following examples for brevity. However, every struct and function in `AutoEncoderToolkit.jl`

is well-documented.

Let us imagine your particular task requires a custom encoder or decoder type. For example, suppose that for a particular application you need a decoder whose output distribution is Poisson. In other words, the assumption is that each dimension $x_i$ of the input is a sample from a Poisson distribution with mean $\lambda_i$. Thus, what the decoder returns is a vector of these $\lambda$ parameters. We therefore need to define a custom decoder type.

```
struct PoissonDecoder <: AbstractVariationalDecoder
    decoder::Flux.Chain
end # struct
```

With this struct defined, we need to define the forward-pass function for our custom `PoissonDecoder`

. All decoders in `AutoEncoderToolkit.jl`

return a `NamedTuple`

with the corresponding parameters of the distribution that defines them. In this case, the Poisson distribution is defined by a single parameter $\lambda$. Thus, we have a forward-pass of the form

```
function (decoder::PoissonDecoder)(z::AbstractArray)
    # Run the latent code through the decoder network and return the Poisson rates
    return (λ=decoder.decoder(z),)
end # function
```
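Since a Poisson rate must be positive, one option is to end the network with a positive-valued activation. The following sketch assumes `softplus` for that purpose and uses illustrative layer sizes; neither choice is prescribed by the package.

```julia
using Flux

n_latent, n_input = 2, 10   # illustrative sizes

# The final `softplus` keeps every rate λ_i positive
net = Chain(Dense(n_latent => 16, relu), Dense(16 => n_input, softplus))

# Forward pass for a single latent vector
z = randn(Float32, n_latent)
λ = net(z)
```

Wrapping this chain as `PoissonDecoder(net)` and calling the result on `z` would return these same values in the `λ` field of the `NamedTuple`.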

Next, we need to define the probabilistic function associated with this decoder. We know that the probability of observing $x_i$ given $\lambda_i$ is given by

\[P(x_i | \lambda_i) = \frac{\lambda_i^{x_i} e^{-\lambda_i}}{x_i!}. \tag{1}\]

If each $x_i$ is independent, then the probability of observing the entire input $x$ given the entire output $\lambda$ is given by the product of the individual probabilities, i.e.

\[P(x | \lambda) = \prod_i P(x_i | \lambda_i). \tag{2}\]

The log-likelihood of the data given the output of the decoder is then given by

\[\mathcal{L}(x, \lambda) = \log P(x | \lambda) = \sum_i \log P(x_i | \lambda_i), \tag{3}\]

which, by using the properties of the logarithm, can be written as

\[\mathcal{L}(x, \lambda) = \sum_i \left[ x_i \log \lambda_i - \lambda_i - \log(x_i!) \right]. \tag{4}\]

We can then define the probabilistic function associated with the `PoissonDecoder`

as

```
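# `loggamma` comes from SpecialFunctions.jl; importing `decoder_loglikelihood`
# adds a method to the package's function instead of shadowing it
using SpecialFunctions: loggamma
import AutoEncoderToolkit: decoder_loglikelihood

function decoder_loglikelihood(
    x::AbstractArray,
    z::AbstractVector,
    decoder::PoissonDecoder,
    decoder_output::NamedTuple;
)
    # Extract the lambda parameter of the Poisson distribution
    λ = decoder_output.λ
    # Compute log-likelihood, using log(x!) = loggamma(x + 1)
    loglikelihood = sum(x .* log.(λ) .- λ .- loggamma.(x .+ 1))
    return loglikelihood
end # function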
```

where we use the `loggamma`

function from `SpecialFunctions.jl`

to compute the log of the factorial of `x_i`

.

We only defined the `decoder_loglikelihood`

method for `z::AbstractVector`

. One should also include a method for `z::AbstractMatrix`

used when performing batch training.
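A batch version could look like the sketch below. It assumes that each column of `x` and `λ` is one data point and that one log-likelihood per column should be returned, mirroring the matrix method of `encoder_logposterior`; the helper name `poisson_loglikelihood_batch` is hypothetical and not part of `AutoEncoderToolkit.jl`.

```julia
using SpecialFunctions: loggamma

# Hypothetical helper: Poisson log-likelihood for a batch of data points,
# where each column of `x` and `λ` corresponds to one data point
function poisson_loglikelihood_batch(x::AbstractMatrix, λ::AbstractMatrix)
    # Sum over features (rows) to obtain one log-likelihood per column
    return vec(sum(x .* log.(λ) .- λ .- loggamma.(x .+ 1); dims=1))
end

# Two data points with two features each
x = Float32[0 2; 1 3]
λ = Float32[1 2; 1 3]
loglike = poisson_loglikelihood_batch(x, λ)
```

A `decoder_loglikelihood` method for `z::AbstractMatrix` could then extract `decoder_output.λ` and delegate to a helper of this kind.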

With these two functions defined, our `PoissonDecoder`

is ready to be used with any of the different VAE flavors included in `AutoEncoderToolkit.jl`

!