ExplicitFluxLayers.AdaptiveMaxPool - Type
AdaptiveMaxPool(out::NTuple)

Adaptive Max Pooling layer. Calculates the necessary window size such that its output has size(y)[1:N] == out. Expects as input an array with ndims(x) == N+2, i.e. channel and batch dimensions, after the N feature dimensions, where N = length(out).

See also MaxPool, AdaptiveMeanPool.
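
For illustration, a minimal usage sketch (the EFL alias for ExplicitFluxLayers and the EFL.setup / EFL.apply convention are assumptions here, used consistently in the sketches below):

import ExplicitFluxLayers as EFL; using Random   # EFL alias assumed, as used throughout these docs

pool = EFL.AdaptiveMaxPool((4, 4))               # N = 2 target spatial size
ps, st = EFL.setup(Random.default_rng(), pool)   # explicit parameters and state
x = rand(Float32, 32, 32, 3, 1)                  # WHCN input, ndims(x) == N + 2
y, st = EFL.apply(pool, x, ps, st)               # size(y) == (4, 4, 3, 1)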

ExplicitFluxLayers.AdaptiveMeanPool - Type
AdaptiveMeanPool(out::NTuple)

Adaptive Mean Pooling layer. Calculates the necessary window size such that its output has size(y)[1:N] == out. Expects as input an array with ndims(x) == N+2, i.e. channel and batch dimensions, after the N feature dimensions, where N = length(out).

See also MaxPool, AdaptiveMaxPool.

ExplicitFluxLayers.BatchNorm - Type
BatchNorm(chs::Integer, λ=identity; initβ=zeros32, initγ=ones32,
          affine=true, track_stats=true, ϵ=1f-5, momentum=0.1f0)

Batch Normalization layer.

BatchNorm computes the mean and variance for each D_1×...×D_{N-2}×1×D_N input slice and normalises the input accordingly.

Arguments

  • chs should be the size of the channel dimension in your data (see below). Given an array with N dimensions, call the N-1th dimension the channel dimension. For a batch of feature vectors this is just the data dimension; for WHCN images it's the usual channel dimension.
  • After normalisation, elementwise activation λ is applied.
  • If affine=true, it also applies a shift and a rescale to the input through learnable per-channel bias β and scale γ parameters.
  • If track_stats=true, accumulates mean and var statistics in training phase that will be used to renormalize the input in test phase.

Use testmode during inference.

Examples

m = EFL.Chain(
    EFL.Dense(784 => 64),
    EFL.BatchNorm(64, relu),
    EFL.Dense(64 => 10),
    EFL.BatchNorm(10),
)
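
To actually run this model, one possible workflow is sketched below; EFL.setup, EFL.apply, and EFL.testmode(st) are assumed to follow this package's explicit-parameter API:

using Random

rng = Random.default_rng()
ps, st = EFL.setup(rng, m)              # explicit parameters and state
x = rand(rng, Float32, 784, 16)         # a batch of 16 feature vectors
y, st = EFL.apply(m, x, ps, st)         # training-mode forward pass

st_test = EFL.testmode(st)              # switch the state to test mode for inference
ŷ, _ = EFL.apply(m, x, ps, st_test)
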
ExplicitFluxLayers.BranchLayer - Type
BranchLayer(layers...)

Takes an input x, passes it through all the layers, and returns a tuple of the outputs.

This is slightly different from Parallel(nothing, layers...): if the input is a tuple, Parallel passes each element to the corresponding layer, while BranchLayer assumes a single input that is branched out into N outputs.

An easy way to replicate an input into an NTuple is:

l = EFL.BranchLayer(
    EFL.NoOpLayer(),
    EFL.NoOpLayer(),
    EFL.NoOpLayer(),
)
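
Applying it then yields a 3-tuple of the (unchanged) input, since NoOpLayer passes its input through. A sketch, assuming the EFL.setup / EFL.apply convention from the earlier example:

using Random

ps, st = EFL.setup(Random.default_rng(), l)
x = rand(Float32, 8)
(y1, y2, y3), st = EFL.apply(l, x, ps, st)   # y1 == y2 == y3 == x
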
ExplicitFluxLayers.Chain - Type
Chain(layers...; disable_optimizations::Bool = false)

Collects multiple layers / functions to be called in sequence on a given input.

Performs a few optimizations to generate reasonable architectures. Can be disabled using keyword argument disable_optimizations.

  • All sublayers are recursively optimized.
  • If a function f is passed as a layer and it doesn't take 3 inputs, it is converted to a WrappedFunction(f) which takes only one input.
  • If the layer is a Chain, it is expanded out.
  • NoOpLayers are removed.
  • If there is only 1 layer (left after optimizations), then it is returned without the Chain wrapper.
  • If there are no layers (left after optimizations), a NoOpLayer is returned.
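
As an illustration of these rules, the two constructions below are intended to be roughly equivalent (a sketch; relu comes from NNlib, and the EFL alias for ExplicitFluxLayers is assumed):

import ExplicitFluxLayers as EFL; using NNlib: relu

c1 = EFL.Chain(
    EFL.Dense(2 => 3),
    EFL.NoOpLayer(),                 # removed by the optimizations
    x -> relu.(x),                   # converted to a WrappedFunction
    EFL.Chain(EFL.Dense(3 => 2)),    # nested Chain is expanded out
)

# Roughly equivalent to:
c2 = EFL.Chain(
    EFL.Dense(2 => 3),
    EFL.WrappedFunction(Base.Fix1(broadcast, relu)),
    EFL.Dense(3 => 2),
)

# Pass disable_optimizations = true to keep the layers exactly as written.
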
ExplicitFluxLayers.Conv - Type
Conv(filter, in => out, σ = identity; stride = 1, pad = 0, dilation = 1, groups = 1, [bias, initW])

Standard convolutional layer.

Arguments

  • filter is a tuple of integers specifying the size of the convolutional kernel
  • in and out specify the number of input and output channels.

Image data should be stored in WHCN order (width, height, channels, batch). In other words, a 100×100 RGB image would be a 100×100×3×1 array, and a batch of 50 would be a 100×100×3×50 array. This has N = 2 spatial dimensions, and needs a kernel size like (5,5), a 2-tuple of integers. To take convolutions along N feature dimensions, this layer expects as input an array with ndims(x) == N+2, where size(x, N+1) == in is the number of input channels, and size(x, ndims(x)) is (as always) the number of observations in a batch.

  • filter should be a tuple of N integers.
  • Keywords stride and dilation should each be either a single integer, or a tuple with N integers.
  • Keyword pad specifies the number of elements added to the borders of the data array. It can be
    • a single integer for equal padding all around,
    • a tuple of N integers, to apply the same padding at begin/end of each spatial dimension,
    • a tuple of 2*N integers, for asymmetric padding, or
    • the singleton SamePad(), to calculate padding such that size(output,d) == size(x,d) / stride (possibly rounded) for each spatial dimension.
  • Keyword groups is expected to be an Int. It specifies the number of groups to divide a convolution into.

Keywords to control initialization of the layer:

  • initW - Function used to generate initial weights. Defaults to glorot_uniform.
  • bias - The initial bias vector is all zero by default. Trainable bias can be disabled entirely by setting this to false.
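
For instance, following the WHCN layout described above (a sketch, assuming the EFL.setup / EFL.apply convention used in the earlier examples):

import ExplicitFluxLayers as EFL; using Random

layer = EFL.Conv((5, 5), 3 => 7, tanh)       # 5×5 kernel, 3 input and 7 output channels
ps, st = EFL.setup(Random.default_rng(), layer)

x = rand(Float32, 100, 100, 3, 50)           # 50 RGB images in WHCN order
y, st = EFL.apply(layer, x, ps, st)          # size(y) == (96, 96, 7, 50) with pad = 0, stride = 1
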
ExplicitFluxLayers.Dense - Type
Dense(in => out, σ=identity; initW=glorot_uniform, initb=zeros32, bias::Bool=true)

Create a traditional fully connected layer, whose forward pass is given by: y = σ.(weight * x .+ bias)

  • The input x should be a vector of length in, or batch of vectors represented as an in × N matrix, or any array with size(x,1) == in.
  • The output y will be a vector of length out, or a batch with size(y) == (out, size(x)[2:end]...)

Keyword bias=false will switch off trainable bias for the layer.

The initialisation of the weight matrix is W = initW(rng, out, in), calling the function given to keyword initW, with default glorot_uniform.
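
A small usage sketch (the EFL alias and the EFL.setup / EFL.apply convention are assumed, as above):

import ExplicitFluxLayers as EFL; using Random

layer = EFL.Dense(10 => 5, tanh)
ps, st = EFL.setup(Random.default_rng(), layer)

x = rand(Float32, 10, 32)              # a batch of 32 vectors of length 10
y, st = EFL.apply(layer, x, ps, st)    # size(y) == (5, 32)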

ExplicitFluxLayers.Dropout - Type
Dropout(p; dims=:)

Dropout layer.

Arguments

  • To apply dropout along certain dimension(s), specify the dims keyword. e.g. Dropout(p; dims = 3) will randomly zero out entire channels on WHCN input (also called 2D dropout).
  • Each execution of the layer increments the seed and returns it wrapped in the state.

Call testmode to switch to test mode.
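
As an illustration, 2D (channel-wise) dropout on WHCN input (a sketch; EFL.setup / EFL.apply assumed as in the earlier examples):

import ExplicitFluxLayers as EFL; using Random

drop = EFL.Dropout(0.5; dims = 3)                # zero out entire channels of WHCN input
ps, st = EFL.setup(Random.default_rng(), drop)

x = rand(Float32, 28, 28, 16, 4)
y, st = EFL.apply(drop, x, ps, st)               # the returned state carries the updated seed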

ExplicitFluxLayers.GroupNorm - Type
GroupNorm(chs::Integer, groups::Integer, λ=identity; initβ=zeros32, initγ=ones32,
          affine=true, track_stats=false, ϵ=1f-5, momentum=0.1f0)

Group Normalization layer.

Arguments

  • chs is the number of channels, the channel dimension of your input. For an array of N dimensions, the N-1th index is the channel dimension.
  • groups is the number of groups along which the statistics are computed. The number of channels must be an integer multiple of the number of groups.
  • After normalisation, elementwise activation λ is applied.
  • If affine=true, it also applies a shift and a rescale to the input through learnable per-channel bias β and scale γ parameters.
  • If track_stats=true, accumulates mean and var statistics in training phase that will be used to renormalize the input in test phase.
Warn

GroupNorm doesn't have CUDNN support. The GPU fallback is not very efficient.
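
For example, normalising 32 channels split into 4 groups of 8 (a sketch; EFL.setup / EFL.apply assumed as before):

import ExplicitFluxLayers as EFL; using Random

gn = EFL.GroupNorm(32, 4, tanh)                  # 32 channels, 4 groups
ps, st = EFL.setup(Random.default_rng(), gn)

x = rand(Float32, 16, 16, 32, 2)                 # channel dimension is dimension N - 1 = 3
y, st = EFL.apply(gn, x, ps, st)                 # size(y) == size(x)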

ExplicitFluxLayers.MaxPool - Type
MaxPool(window::NTuple; pad=0, stride=window)

Max pooling layer, which replaces all pixels in a block of size window with the maximum value. Expects as input an array with ndims(x) == N+2, i.e. channel and batch dimensions, after the N feature dimensions, where N = length(window).

Arguments

  • By default the window size is also the stride in each dimension.
  • The keyword pad accepts the same options as for the Conv layer, including SamePad().

See also Conv, MeanPool, GlobalMaxPool.
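
A quick sketch of the shape behaviour (EFL.setup / EFL.apply assumed, as in the earlier examples):

import ExplicitFluxLayers as EFL; using Random

pool = EFL.MaxPool((2, 2))                       # stride defaults to the window size
ps, st = EFL.setup(Random.default_rng(), pool)

x = rand(Float32, 28, 28, 3, 8)
y, st = EFL.apply(pool, x, ps, st)               # size(y) == (14, 14, 3, 8)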

ExplicitFluxLayers.MeanPool - Type
MeanPool(window::NTuple; pad=0, stride=window)

Mean pooling layer, which replaces all pixels in a block of size window with their mean. Expects as input an array with ndims(x) == N+2, i.e. channel and batch dimensions, after the N feature dimensions, where N = length(window).

Arguments

  • By default the window size is also the stride in each dimension.
  • The keyword pad accepts the same options as for the Conv layer, including SamePad().

See also Conv, MaxPool, GlobalMeanPool.

ExplicitFluxLayers.PairwiseFusion - Type
PairwiseFusion(connection, layers...)

Layer behaves differently based on input type:

  1. If the input x is a tuple of length N, then layers must also be a tuple of length N. The computation is as follows:
y = x[1]
for i in 1:N
    y = connection(x[i], layers[i](y))
end
  2. Any other kind of input:
y = x
for i in 1:N
    y = connection(x, layers[i](y))
end
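
A concrete sketch of case 1, with connection = + and two Dense layers (EFL.setup / EFL.apply assumed, as in the earlier examples):

import ExplicitFluxLayers as EFL; using Random

model = EFL.PairwiseFusion(+, EFL.Dense(4 => 4), EFL.Dense(4 => 4))
ps, st = EFL.setup(Random.default_rng(), model)

x = (rand(Float32, 4, 2), rand(Float32, 4, 2))   # tuple input, one element per layer
y, st = EFL.apply(model, x, ps, st)              # follows the loop in case 1 above
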
ExplicitFluxLayers.Parallel - Type
Parallel(connection, layers...)

Behaves differently on different input types:

  • If x is a tuple, then each element is passed to the corresponding layer, i.e. layers[i] receives x[i]
  • Otherwise, x is directly passed to all layers

In both cases, the outputs of the layers are then combined using connection.
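
For example, combining two branches with + (a sketch; EFL.setup / EFL.apply assumed, as in the earlier examples):

import ExplicitFluxLayers as EFL; using Random

model = EFL.Parallel(+, EFL.Dense(4 => 2), EFL.Dense(4 => 2))
ps, st = EFL.setup(Random.default_rng(), model)

x = rand(Float32, 4, 3)                # the single input is sent to both branches
y, st = EFL.apply(model, x, ps, st)    # the two branch outputs are summed; size(y) == (2, 3)
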
ExplicitFluxLayers.Scale - Type
Scale(dims, σ=identity; initW=ones32, initb=zeros32, bias::Bool=true)

Create a sparsely connected layer with a very specific structure (only the diagonal elements of the weight matrix are non-zero). The forward pass is given by: y = σ.(weight .* x .+ bias)

  • The input x should be a vector of length dims, or a batch of vectors represented as a dims × N matrix, or any array with size(x,1) == dims.
  • The output y will be a vector of length dims, or a batch with size(y) == (dims, size(x)[2:end]...)

Keyword bias=false will switch off trainable bias for the layer.

The initialisation of the weight matrix is W = initW(rng, dims), calling the function given to keyword initW, with default ones32.
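
A small sketch of the element-wise forward pass (EFL.setup / EFL.apply assumed, as in the earlier examples):

import ExplicitFluxLayers as EFL; using Random

scale = EFL.Scale(5, tanh)
ps, st = EFL.setup(Random.default_rng(), scale)

x = rand(Float32, 5, 10)
y, st = EFL.apply(scale, x, ps, st)    # elementwise: y == tanh.(weight .* x .+ bias)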

ExplicitFluxLayers.SkipConnection - Type
SkipConnection(layer, connection)

Create a skip connection which consists of a layer or Chain of consecutive layers and a shortcut connection linking the block's input to the output through a user-supplied 2-argument callable. The first argument to the callable will be propagated through the given layer while the second is the unchanged, "skipped" input.

The simplest "ResNet"-type connection is just SkipConnection(layer, +).
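
For instance, a residual block around a Dense layer (a sketch; EFL.setup / EFL.apply assumed, as in the earlier examples):

import ExplicitFluxLayers as EFL; using Random

block = EFL.SkipConnection(EFL.Dense(4 => 4, tanh), +)
ps, st = EFL.setup(Random.default_rng(), block)

x = rand(Float32, 4, 8)
y, st = EFL.apply(block, x, ps, st)    # y == layer(x) + x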

ExplicitFluxLayers.Upsample - Type
Upsample(mode = :nearest; [scale, size]) 
Upsample(scale, mode = :nearest)

An upsampling layer.

Arguments

One of two keywords must be given:

  • If scale is a number, this applies to all but the last two dimensions (channel and batch) of the input. It may also be a tuple, to control dimensions individually.
  • Alternatively, keyword size accepts a tuple, to directly specify the leading dimensions of the output.

Currently supported upsampling modes and the corresponding NNlib methods are:

  • :nearest -> NNlib.upsample_nearest
  • :bilinear -> NNlib.upsample_bilinear
  • :trilinear -> NNlib.upsample_trilinear
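
A shape-focused sketch (EFL.setup / EFL.apply assumed, as in the earlier examples):

import ExplicitFluxLayers as EFL; using Random

up = EFL.Upsample(:nearest; scale = 2)           # equivalent to EFL.Upsample(2)
ps, st = EFL.setup(Random.default_rng(), up)

x = rand(Float32, 8, 8, 3, 1)
y, st = EFL.apply(up, x, ps, st)                 # size(y) == (16, 16, 3, 1)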

ExplicitFluxLayers.VariationalHiddenDropout - Type
VariationalHiddenDropout(p; dims=:)

VariationalHiddenDropout layer. The only difference from Dropout is that the mask is retained until EFL.update_state(l, :update_mask, true) is called.

Arguments

  • To apply dropout along certain dimension(s), specify the dims keyword. e.g. VariationalHiddenDropout(p; dims = 3) will randomly zero out entire channels on WHCN input (also called 2D dropout).
  • Each execution of the layer increments the seed and returns it wrapped in the state.

Call testmode to switch to test mode.

ExplicitFluxLayers.WeightNorm - Type
WeightNorm(layer::AbstractExplicitLayer, which_params::NTuple{N,Symbol}, dims::Union{Tuple,Nothing}=nothing)

Applies weight normalization to a parameter in the given layer.

``w = g\frac{v}{\|v\|}``

Weight normalization is a reparameterization that decouples the magnitude of a weight tensor from its direction. This updates the parameters in which_params (e.g. weight) using two parameters: one specifying the magnitude (e.g. weight_g) and one specifying the direction (e.g. weight_v).

By default, a norm over the entire array is computed. Pass dims to modify the dimension.
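
A sketch of wrapping a Dense layer, assuming its weight parameter is named :weight (as in the naming used above) and the EFL.setup convention from the earlier examples:

import ExplicitFluxLayers as EFL; using Random

wn = EFL.WeightNorm(EFL.Dense(4 => 4), (:weight,))
ps, st = EFL.setup(Random.default_rng(), wn)
# The wrapped parameter is now expressed through a magnitude (weight_g) and a
# direction (weight_v), as described above.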

ExplicitFluxLayers.WrappedFunction - Type
WrappedFunction(f)

Wraps a stateless and parameter less function. Might be used when a function is added to Chain. For example, Chain(x -> relu.(x)) would not work and the right thing to do would be Chain((x, ps, st) -> (relu.(x), st)). An easier thing to do would be Chain(WrappedFunction(Base.Fix1(broadcast, relu)))