ExplicitFluxLayers.ActivationFunction
— Type
ActivationFunction(f)
Broadcasts f over the input, but falls back to CUDNN for the backward pass.
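A minimal sketch of typical usage (the EFL alias follows the examples below; the setup name is an assumption, and layers follow the (x, ps, st) -> (y, st) convention described under WrappedFunction):
using Random
l = EFL.ActivationFunction(tanh)
ps, st = EFL.setup(MersenneTwister(0), l)   # assumed setup API
y, st = l(randn(Float32, 4, 2), ps, st)     # tanh applied elementwise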
ExplicitFluxLayers.AdaptiveMaxPool
— Type
AdaptiveMaxPool(out::NTuple)
Adaptive Max Pooling layer. Calculates the necessary window size such that its output has size(y)[1:N] == out. Expects as input an array with ndims(x) == N+2, i.e. channel and batch dimensions after the N feature dimensions, where N = length(out).
See also MaxPool, AdaptiveMeanPool.
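A minimal sketch of the shape behaviour (setup name assumed, as above):
using Random
l = EFL.AdaptiveMaxPool((4, 4))
ps, st = EFL.setup(MersenneTwister(0), l)        # assumed setup API
y, st = l(randn(Float32, 16, 16, 3, 2), ps, st)  # size(y) == (4, 4, 3, 2)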
ExplicitFluxLayers.AdaptiveMeanPool
— Type
AdaptiveMeanPool(out::NTuple)
Adaptive Mean Pooling layer. Calculates the necessary window size such that its output has size(y)[1:N] == out. Expects as input an array with ndims(x) == N+2, i.e. channel and batch dimensions after the N feature dimensions, where N = length(out).
See also MaxPool, AdaptiveMaxPool.
ExplicitFluxLayers.BatchNorm
— Type
BatchNorm(chs::Integer, λ=identity; initβ=zeros32, initγ=ones32, affine=true, track_stats=true, ϵ=1f-5, momentum=0.1f0)
Batch Normalization layer. BatchNorm computes the mean and variance for each D_1×...×D_{N-2}×1×D_N input slice and normalises the input accordingly.
Arguments
- chs should be the size of the channel dimension in your data (see below). Given an array with N dimensions, call the N-1th dimension the channel dimension. For a batch of feature vectors this is just the data dimension; for WHCN images it's the usual channel dimension.
- After normalisation, the elementwise activation λ is applied.
- If affine=true, it also applies a shift and a rescale to the input through learnable per-channel bias β and scale γ parameters.
- If track_stats=true, accumulates mean and variance statistics in the training phase that will be used to renormalise the input in the test phase.
Use testmode during inference.
Examples
m = EFL.Chain(
EFL.Dense(784 => 64),
EFL.BatchNorm(64, relu),
EFL.Dense(64 => 10),
EFL.BatchNorm(10)
)
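A hedged sketch of running this model (the setup name is an assumption):
using Random
ps, st = EFL.setup(MersenneTwister(0), m)   # assumed setup API
x = randn(Float32, 784, 32)                 # batch of 32 feature vectors
y, st = m(x, ps, st)                        # size(y) == (10, 32)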
ExplicitFluxLayers.BranchLayer
— Type
BranchLayer(layers...)
Takes an input x, passes it through each of the layers, and returns a tuple of the outputs.
This is slightly different from Parallel(nothing, layers...):
- If the input is a tuple, Parallel will pass each element individually to each layer.
- BranchLayer assumes a single input comes in and is branched out into N outputs.
An easy way to replicate an input to an NTuple is to do
l = EFL.BranchLayer(
EFL.NoOpLayer(),
EFL.NoOpLayer(),
EFL.NoOpLayer(),
)
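Continuing that snippet, a sketch of the branching behaviour (setup name assumed):
using Random
ps, st = EFL.setup(MersenneTwister(0), l)   # assumed setup API
x = randn(Float32, 4)
(y1, y2, y3), st = l(x, ps, st)             # three unchanged copies of x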
ExplicitFluxLayers.Chain
— Type
Chain(layers...; disable_optimizations::Bool = false)
Collects multiple layers / functions to be called in sequence on a given input.
Performs a few optimizations to generate reasonable architectures. Can be disabled using the keyword argument disable_optimizations.
- All sublayers are recursively optimized.
- If a function f is passed as a layer and it doesn't take 3 inputs, it is converted to a WrappedFunction(f), which takes only one input.
- If the layer is a Chain, it is expanded out.
- NoOpLayers are removed.
- If there is only 1 layer (left after optimizations), then it is returned without the Chain wrapper.
- If there are no layers (left after optimizations), a NoOpLayer is returned.
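For instance, a sketch of how these optimizations compose (the exact printed form is an assumption):
c = EFL.Chain(EFL.NoOpLayer(), x -> tanh.(x), EFL.Chain(EFL.Dense(2 => 2)))
# The NoOpLayer is removed, the anonymous function is converted to a
# WrappedFunction, and the inner Chain is expanded out, leaving the
# equivalent of Chain(WrappedFunction(...), Dense(2 => 2)).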
ExplicitFluxLayers.Conv
— Type
Conv(filter, in => out, σ = identity; stride = 1, pad = 0, dilation = 1, groups = 1, [bias, initW])
Standard convolutional layer.
Arguments
- filter is a tuple of integers specifying the size of the convolutional kernel.
- in and out specify the number of input and output channels.
Image data should be stored in WHCN order (width, height, channels, batch). In other words, a 100×100 RGB image would be a 100×100×3×1 array, and a batch of 50 would be a 100×100×3×50 array. This has N = 2 spatial dimensions, and needs a kernel size like (5,5), a 2-tuple of integers. To take convolutions along N feature dimensions, this layer expects as input an array with ndims(x) == N+2, where size(x, N+1) == in is the number of input channels, and size(x, ndims(x)) is (as always) the number of observations in a batch.
- filter should be a tuple of N integers.
- Keywords stride and dilation should each be either a single integer, or a tuple with N integers.
- Keyword pad specifies the number of elements added to the borders of the data array. It can be
  - a single integer for equal padding all around,
  - a tuple of N integers, to apply the same padding at begin/end of each spatial dimension,
  - a tuple of 2*N integers, for asymmetric padding, or
  - the singleton SamePad(), to calculate padding such that size(output,d) == size(x,d) / stride (possibly rounded) for each spatial dimension.
- Keyword groups is expected to be an Int. It specifies the number of groups to divide a convolution into.
Keywords to control initialization of the layer:
- initW - Function used to generate initial weights. Defaults to glorot_uniform.
- bias - The initial bias vector is all zero by default. Trainable bias can be disabled entirely by setting this to false.
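A minimal sketch (setup name assumed; pad=2 keeps the 5×5 convolution size-preserving here, equivalent to SamePad()):
using Random
l = EFL.Conv((5, 5), 3 => 16, tanh; pad=2)
ps, st = EFL.setup(MersenneTwister(0), l)           # assumed setup API
y, st = l(randn(Float32, 100, 100, 3, 50), ps, st)  # size(y) == (100, 100, 16, 50)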
ExplicitFluxLayers.Dense
— Type
Dense(in => out, σ=identity; initW=glorot_uniform, initb=zeros32, bias::Bool=true)
Create a traditional fully connected layer, whose forward pass is given by: y = σ.(weight * x .+ bias)
- The input x should be a vector of length in, or a batch of vectors represented as an in × N matrix, or any array with size(x,1) == in.
- The output y will be a vector of length out, or a batch with size(y) == (out, size(x)[2:end]...).
Keyword bias=false will switch off trainable bias for the layer.
The initialisation of the weight matrix is W = initW(rng, out, in), calling the function given to keyword initW, with default glorot_uniform.
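A minimal sketch (setup name assumed):
using Random
l = EFL.Dense(10 => 5, tanh)
ps, st = EFL.setup(MersenneTwister(0), l)   # assumed setup API
y, st = l(randn(Float32, 10, 32), ps, st)   # size(y) == (5, 32)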
ExplicitFluxLayers.Dropout
— Type
Dropout(p; dims=:)
Dropout layer.
Arguments
- To apply dropout along certain dimension(s), specify the dims keyword, e.g. Dropout(p; dims=3) will randomly zero out entire channels on WHCN input (also called 2D dropout).
- Each execution of the layer increments the seed and returns it wrapped in the state.
Call testmode to switch to test mode.
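A minimal sketch (setup and testmode names assumed):
using Random
l = EFL.Dropout(0.5)
ps, st = EFL.setup(MersenneTwister(0), l)   # assumed setup API
y, st = l(ones(Float32, 4, 4), ps, st)      # training mode: entries randomly zeroed
# st = EFL.testmode(st)                     # assumed signature; disables dropout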
ExplicitFluxLayers.FlattenLayer
— Type
FlattenLayer()
Flattens the passed array into a matrix.
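A minimal sketch of the shape behaviour (setup name assumed; keeping the last, batch dimension intact is an assumption consistent with the WHCN conventions above):
using Random
l = EFL.FlattenLayer()
ps, st = EFL.setup(MersenneTwister(0), l)       # assumed setup API
y, st = l(randn(Float32, 4, 4, 3, 2), ps, st)   # size(y) == (48, 2)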
ExplicitFluxLayers.GlobalMaxPool
— Type
GlobalMaxPool()
Global Max Pooling layer. Transforms (w,h,c,b)-shaped input into (1,1,c,b)-shaped output, by performing max pooling on the complete (w,h)-shaped feature maps.
See also MaxPool, GlobalMeanPool.
ExplicitFluxLayers.GlobalMeanPool
— Type
GlobalMeanPool()
Global Mean Pooling layer. Transforms (w,h,c,b)-shaped input into (1,1,c,b)-shaped output, by performing mean pooling on the complete (w,h)-shaped feature maps.
See also MeanPool, GlobalMaxPool.
ExplicitFluxLayers.GroupNorm
— Type
GroupNorm(chs::Integer, groups::Integer, λ=identity; initβ=zeros32, initγ=ones32, affine=true, track_stats=false, ϵ=1f-5, momentum=0.1f0)
Group Normalization layer.
Arguments
- chs is the number of channels, the channel dimension of your input. For an array of N dimensions, the N-1th index is the channel dimension.
- groups is the number of groups along which the statistics are computed. The number of channels must be an integer multiple of the number of groups.
- After normalisation, the elementwise activation λ is applied.
- If affine=true, it also applies a shift and a rescale to the input through learnable per-channel bias β and scale γ parameters.
- If track_stats=true, accumulates mean and variance statistics in the training phase that will be used to renormalise the input in the test phase.
GroupNorm doesn't have CUDNN support. The GPU fallback is not very efficient.
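A minimal sketch (setup name assumed):
using Random
l = EFL.GroupNorm(12, 4)                        # 12 channels split into 4 groups of 3
ps, st = EFL.setup(MersenneTwister(0), l)       # assumed setup API
y, st = l(randn(Float32, 8, 8, 12, 2), ps, st)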
ExplicitFluxLayers.MaxPool
— Type
MaxPool(window::NTuple; pad=0, stride=window)
Arguments
- Max pooling layer, which replaces all pixels in a block of size window with one.
- Expects as input an array with ndims(x) == N+2, i.e. channel and batch dimensions after the N feature dimensions, where N = length(window).
- By default the window size is also the stride in each dimension.
- The keyword pad accepts the same options as for the Conv layer, including SamePad().
See also Conv, MeanPool, GlobalMaxPool.
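A minimal sketch of the shape behaviour (setup name assumed):
using Random
l = EFL.MaxPool((2, 2))
ps, st = EFL.setup(MersenneTwister(0), l)       # assumed setup API
y, st = l(randn(Float32, 8, 8, 3, 2), ps, st)   # size(y) == (4, 4, 3, 2)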
ExplicitFluxLayers.MeanPool
— Type
MeanPool(window::NTuple; pad=0, stride=window)
Arguments
- Mean pooling layer, which replaces all pixels in a block of size window with one.
- Expects as input an array with ndims(x) == N+2, i.e. channel and batch dimensions after the N feature dimensions, where N = length(window).
- By default the window size is also the stride in each dimension.
- The keyword pad accepts the same options as for the Conv layer, including SamePad().
See also Conv, MaxPool, GlobalMeanPool.
ExplicitFluxLayers.NoOpLayer
— Type
NoOpLayer()
As the name suggests, this layer does nothing except allow pretty printing of layers: the input is passed through unchanged.
ExplicitFluxLayers.PairwiseFusion
— Type
PairwiseFusion(connection, layers...)
The layer behaves differently based on the input type:
- If the input x is a tuple of length N, then layers must be a tuple of length N. The computation is as follows:
y = x[1]
for i in 1:N
    y = connection(x[i], layers[i](y))
end
- For any other kind of input:
y = x
for i in 1:N
    y = connection(x, layers[i](y))
end
ExplicitFluxLayers.Parallel
— Type
Parallel(connection, layers...)
Behaves differently on different input types:
- If x is a tuple, then each element is passed individually to the corresponding layer.
- Otherwise, x is passed directly to all layers.
The outputs are combined using connection.
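A minimal sketch (setup name assumed; combining the outputs as connection(outputs...) is also an assumption):
using Random
l = EFL.Parallel(+, EFL.Dense(2 => 4), EFL.Dense(2 => 4))
ps, st = EFL.setup(MersenneTwister(0), l)   # assumed setup API
y, st = l(randn(Float32, 2, 1), ps, st)     # sum of the two Dense outputs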
ExplicitFluxLayers.ReshapeLayer
— Type
ReshapeLayer(dims)
Reshapes the passed array to have a size of (dims..., :).
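A minimal sketch (setup name assumed):
using Random
l = EFL.ReshapeLayer((2, 2))
ps, st = EFL.setup(MersenneTwister(0), l)   # assumed setup API
y, st = l(randn(Float32, 4, 3), ps, st)     # size(y) == (2, 2, 3)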
ExplicitFluxLayers.Scale
— Type
Scale(dims, σ=identity; initW=ones32, initb=zeros32, bias::Bool=true)
Create a sparsely connected layer with a very specific structure (only the diagonal elements are non-zero). The forward pass is given by: y = σ.(weight .* x .+ bias)
- The input x should be a vector of length dims, or a batch of vectors represented as a dims × N matrix, or any array with size(x,1) == dims.
- The output y will be a vector of length dims, or a batch with size(y) == (dims, size(x)[2:end]...).
Keyword bias=false will switch off trainable bias for the layer.
The initialisation of the weight matrix is W = initW(rng, dims), calling the function given to keyword initW, with default ones32.
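A minimal sketch (setup name assumed):
using Random
l = EFL.Scale(4, tanh)
ps, st = EFL.setup(MersenneTwister(0), l)   # assumed setup API
y, st = l(randn(Float32, 4, 8), ps, st)     # per-feature scale and shift, then tanh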
ExplicitFluxLayers.SelectDim
— Type
SelectDim(dim, i)
Returns a view of the input selecting index i along dimension dim, equivalent to selectdim(x, dim, i). See the documentation for selectdim for more information.
ExplicitFluxLayers.SkipConnection
— Type
SkipConnection(layer, connection)
Create a skip connection, which consists of a layer or Chain of consecutive layers and a shortcut connection linking the block's input to the output through a user-supplied 2-argument callable. The first argument to the callable will be propagated through the given layer, while the second is the unchanged, "skipped" input.
The simplest "ResNet"-type connection is just SkipConnection(layer, +).
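A minimal sketch of the residual pattern (setup name assumed):
using Random
l = EFL.SkipConnection(EFL.Dense(4 => 4), +)   # computes layer(x) + x
ps, st = EFL.setup(MersenneTwister(0), l)      # assumed setup API
y, st = l(randn(Float32, 4, 2), ps, st)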
ExplicitFluxLayers.Upsample
— Type
Upsample(mode = :nearest; [scale, size])
Upsample(scale, mode = :nearest)
An upsampling layer.
Arguments
One of two keywords must be given:
- If scale is a number, this applies to all but the last two dimensions (channel and batch) of the input. It may also be a tuple, to control dimensions individually.
- Alternatively, keyword size accepts a tuple, to directly specify the leading dimensions of the output.
Currently supported upsampling modes and the corresponding NNlib methods are:
- :nearest -> NNlib.upsample_nearest
- :bilinear -> NNlib.upsample_bilinear
- :trilinear -> NNlib.upsample_trilinear
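A minimal sketch (setup name assumed):
using Random
l = EFL.Upsample(:bilinear; scale=2)
ps, st = EFL.setup(MersenneTwister(0), l)       # assumed setup API
y, st = l(randn(Float32, 4, 4, 3, 1), ps, st)   # size(y) == (8, 8, 3, 1)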
ExplicitFluxLayers.VariationalHiddenDropout
— Type
VariationalHiddenDropout(p; dims=:)
VariationalHiddenDropout layer. The only difference from Dropout is that the mask is retained until EFL.update_state(l, :update_mask, true) is called.
Arguments
- To apply dropout along certain dimension(s), specify the dims keyword, e.g. VariationalHiddenDropout(p; dims=3) will randomly zero out entire channels on WHCN input (also called 2D dropout).
- Each execution of the layer increments the seed and returns it wrapped in the state.
Call testmode to switch to test mode.
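A minimal sketch of the retained-mask behaviour (setup name assumed):
using Random
l = EFL.VariationalHiddenDropout(0.5)
ps, st = EFL.setup(MersenneTwister(0), l)   # assumed setup API
y1, st = l(ones(Float32, 4, 4), ps, st)     # samples a dropout mask
y2, st = l(ones(Float32, 4, 4), ps, st)     # reuses the retained mask, so y2 == y1
# EFL.update_state(l, :update_mask, true) forces a fresh mask, as described above.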
ExplicitFluxLayers.WeightNorm
— Type
WeightNorm(layer::AbstractExplicitLayer, which_params::NTuple{N,Symbol}, dims::Union{Tuple,Nothing}=nothing)
Applies weight normalization to a parameter in the given layer.
``w = g\frac{v}{\|v\|}``
Weight normalization is a reparameterization that decouples the magnitude of a weight tensor from its direction. This updates the parameters in which_params (e.g. weight) using two parameters: one specifying the magnitude (e.g. weight_g) and one specifying the direction (e.g. weight_v).
By default, a norm over the entire array is computed. Pass dims to modify the dimension.
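A minimal sketch (setup name assumed; Dense storing its parameters as weight and bias is also an assumption):
using Random
l = EFL.WeightNorm(EFL.Dense(4 => 4), (:weight,))
ps, st = EFL.setup(MersenneTwister(0), l)   # ps holds weight_g and weight_v instead of weight
y, st = l(randn(Float32, 4, 2), ps, st)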
ExplicitFluxLayers.WrappedFunction
— Type
WrappedFunction(f)
Wraps a stateless and parameterless function. Useful when a plain function is added to a Chain: Chain(x -> relu.(x)) would not work, and the right thing to do would be Chain((x, ps, st) -> (relu.(x), st)). An easier option is Chain(WrappedFunction(Base.Fix1(broadcast, relu))).