`ExplicitFluxLayers.ActivationFunction`

— Type`ActivationFunction(f)`

Broadcasts `f` on the input but falls back to CUDNN for the backward pass.

`ExplicitFluxLayers.AdaptiveMaxPool`

— Type`AdaptiveMaxPool(out::NTuple)`

Adaptive Max Pooling layer. Calculates the necessary window size such that its output has `size(y)[1:N] == out`. Expects as input an array with `ndims(x) == N+2`, i.e. channel and batch dimensions after the `N` feature dimensions, where `N = length(out)`.

See also `MaxPool`, `AdaptiveMeanPool`.
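A minimal usage sketch, assuming an `EFL.setup(rng, layer)` helper that returns `(ps, st)` (an assumption; substitute the package's actual initialisation call if it differs) together with the `layer(x, ps, st)` calling convention:

```
using ExplicitFluxLayers, Random
const EFL = ExplicitFluxLayers

pool = EFL.AdaptiveMaxPool((4, 4))
ps, st = EFL.setup(Random.MersenneTwister(0), pool)   # assumed initialisation helper

x = rand(Float32, 28, 28, 3, 2)      # W×H×C×N input
y, st = pool(x, ps, st)              # size(y) == (4, 4, 3, 2)
```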

`ExplicitFluxLayers.AdaptiveMeanPool`

— Type`AdaptiveMeanPool(out::NTuple)`

Adaptive Mean Pooling layer. Calculates the necessary window size such that its output has `size(y)[1:N] == out`. Expects as input an array with `ndims(x) == N+2`, i.e. channel and batch dimensions after the `N` feature dimensions, where `N = length(out)`.

See also `MaxPool`, `AdaptiveMaxPool`.

`ExplicitFluxLayers.BatchNorm`

— Type```
BatchNorm(chs::Integer, λ=identity; initβ=zeros32, initγ=ones32,
          affine=true, track_stats=true, ϵ=1f-5, momentum=0.1f0)
```

Batch Normalization layer.

`BatchNorm` computes the mean and variance for each `D_1×...×D_{N-2}×1×D_N` input slice and normalises the input accordingly.

**Arguments**

- `chs` should be the size of the channel dimension in your data (see below). Given an array with `N` dimensions, call the `N-1`th the channel dimension. For a batch of feature vectors this is just the data dimension, for `WHCN` images it's the usual channel dimension.
- After normalisation, the elementwise activation `λ` is applied.
- If `affine=true`, it also applies a shift and a rescale to the input through learnable per-channel bias `β` and scale `γ` parameters.
- If `track_stats=true`, accumulates mean and variance statistics in the training phase that will be used to renormalize the input in the test phase.

Use `testmode` during inference.

**Examples**

```
m = EFL.Chain(
    EFL.Dense(784 => 64),
    EFL.BatchNorm(64, relu),
    EFL.Dense(64 => 10),
    EFL.BatchNorm(10),
)
```
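Continuing from the model `m` above, a sketch of running it: `EFL.setup(rng, model)` (returning `(ps, st)`) and `EFL.testmode(st)` are assumed helper names, while the `model(x, ps, st)` calling convention follows the one shown under `WrappedFunction`:

```
using ExplicitFluxLayers, Random
const EFL = ExplicitFluxLayers

ps, st = EFL.setup(Random.MersenneTwister(0), m)   # assumed initialisation helper

x = rand(Float32, 784, 32)        # batch of 32 feature vectors
y, st = m(x, ps, st)              # training-mode forward pass, size(y) == (10, 32)

st_test = EFL.testmode(st)        # assumed signature; switch BatchNorm to running statistics
y_test, _ = m(x, ps, st_test)
```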

`ExplicitFluxLayers.BranchLayer`

— Type`BranchLayer(layers...)`

Takes an input `x` and passes it through all the `layers`, returning a tuple of the outputs.

This is slightly different from `Parallel(nothing, layers...)`:

- If the input is a tuple, `Parallel` will pass each element to the corresponding layer.
- `BranchLayer` assumes a single input comes in and is branched out into `N` outputs.

An easy way to replicate an input into an `NTuple` is:

```
l = EFL.BranchLayer(
    EFL.NoOpLayer(),
    EFL.NoOpLayer(),
    EFL.NoOpLayer(),
)
```
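Applying `l` then returns the same array three times. The sketch below assumes an `EFL.setup(rng, layer)` helper returning `(ps, st)` together with the `layer(x, ps, st)` calling convention:

```
using ExplicitFluxLayers, Random
const EFL = ExplicitFluxLayers

ps, st = EFL.setup(Random.MersenneTwister(0), l)   # assumed initialisation helper

x = rand(Float32, 8, 4)
(y1, y2, y3), st = l(x, ps, st)   # each output is the unchanged input `x`
```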

`ExplicitFluxLayers.Chain`

— Type`Chain(layers...; disable_optimizations::Bool = false)`

Collects multiple layers / functions to be called in sequence on a given input.

Performs a few optimizations to generate reasonable architectures. These can be disabled using the keyword argument `disable_optimizations`.

- All sublayers are recursively optimized.
- If a function `f` is passed as a layer and it doesn't take 3 inputs, it is converted to a `WrappedFunction(f)` which takes only one input.
- If the layer is a `Chain`, it is expanded out.
- `NoOpLayer`s are removed.
- If there is only one layer left after optimizations, it is returned without the `Chain` wrapper.
- If there are no layers left after optimizations, a `NoOpLayer` is returned.
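A minimal sketch of building and running a `Chain`, assuming an `EFL.setup(rng, model)` helper returning `(ps, st)` and the `model(x, ps, st)` calling convention; `relu` comes from NNlib:

```
using ExplicitFluxLayers, Random, NNlib
const EFL = ExplicitFluxLayers

model = EFL.Chain(
    EFL.Dense(4 => 8, relu),
    EFL.NoOpLayer(),            # removed during the optimization pass
    EFL.Dense(8 => 2),
)

ps, st = EFL.setup(Random.MersenneTwister(0), model)   # assumed initialisation helper
x = rand(Float32, 4, 16)
y, st = model(x, ps, st)       # size(y) == (2, 16)
```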

`ExplicitFluxLayers.Conv`

— Type`Conv(filter, in => out, σ = identity; stride = 1, pad = 0, dilation = 1, groups = 1, [bias, initW])`

Standard convolutional layer.

**Arguments**

- `filter` is a tuple of integers specifying the size of the convolutional kernel.
- `in` and `out` specify the number of input and output channels.

Image data should be stored in WHCN order (width, height, channels, batch). In other words, a 100×100 RGB image would be a `100×100×3×1` array, and a batch of 50 would be a `100×100×3×50` array. This has `N = 2` spatial dimensions, and needs a kernel size like `(5,5)`, a 2-tuple of integers. To take convolutions along `N` feature dimensions, this layer expects as input an array with `ndims(x) == N+2`, where `size(x, N+1) == in` is the number of input channels, and `size(x, ndims(x))` is (as always) the number of observations in a batch.

- `filter` should be a tuple of `N` integers.
- Keywords `stride` and `dilation` should each be either a single integer, or a tuple with `N` integers.
- Keyword `pad` specifies the number of elements added to the borders of the data array. It can be
  - a single integer for equal padding all around,
  - a tuple of `N` integers, to apply the same padding at begin/end of each spatial dimension,
  - a tuple of `2*N` integers, for asymmetric padding, or
  - the singleton `SamePad()`, to calculate padding such that `size(output, d) == size(x, d) / stride` (possibly rounded) for each spatial dimension.
- Keyword `groups` is expected to be an `Int`. It specifies the number of groups to divide a convolution into.

Keywords to control initialization of the layer:

- `initW` - Function used to generate initial weights. Defaults to `glorot_uniform`.
- `bias` - The initial bias vector is all zero by default. Trainable bias can be disabled entirely by setting this to `false`.
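A sketch of a convolution over WHCN data, assuming an `EFL.setup(rng, layer)` helper returning `(ps, st)` and the `layer(x, ps, st)` calling convention; `relu` comes from NNlib:

```
using ExplicitFluxLayers, Random, NNlib
const EFL = ExplicitFluxLayers

conv = EFL.Conv((5, 5), 3 => 7, relu)                # 5×5 kernel, 3 input and 7 output channels
ps, st = EFL.setup(Random.MersenneTwister(0), conv)  # assumed initialisation helper

x = rand(Float32, 100, 100, 3, 50)   # batch of 50 RGB images
y, st = conv(x, ps, st)              # size(y) == (96, 96, 7, 50) with pad=0, stride=1
```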

`ExplicitFluxLayers.Dense`

— Type`Dense(in => out, σ=identity; initW=glorot_uniform, initb=zeros32, bias::Bool=true)`

Create a traditional fully connected layer, whose forward pass is given by: `y = σ.(weight * x .+ bias)`

- The input `x` should be a vector of length `in`, or a batch of vectors represented as an `in × N` matrix, or any array with `size(x, 1) == in`.
- The output `y` will be a vector of length `out`, or a batch with `size(y) == (out, size(x)[2:end]...)`.

Keyword `bias=false` will switch off trainable bias for the layer.

The initialisation of the weight matrix is `W = initW(rng, out, in)`, calling the function given to keyword `initW`, with default `glorot_uniform`.
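A minimal sketch, again assuming the `EFL.setup(rng, layer)` helper and the `layer(x, ps, st)` calling convention; `relu` comes from NNlib:

```
using ExplicitFluxLayers, Random, NNlib
const EFL = ExplicitFluxLayers

dense = EFL.Dense(784 => 64, relu)
ps, st = EFL.setup(Random.MersenneTwister(0), dense)   # assumed initialisation helper

x = rand(Float32, 784, 32)       # batch of 32 vectors
y, st = dense(x, ps, st)         # size(y) == (64, 32)
```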

`ExplicitFluxLayers.Dropout`

— Type`Dropout(p; dims=:)`

Dropout layer.

**Arguments**

- To apply dropout along certain dimension(s), specify the `dims` keyword, e.g. `Dropout(p; dims=3)` will randomly zero out entire channels on WHCN input (also called 2D dropout).
- Each execution of the layer increments the `seed` and returns it wrapped in the state.

Call `testmode` to switch to test mode.
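A sketch of training- vs test-mode behaviour, assuming the `EFL.setup(rng, layer)` helper, the `layer(x, ps, st)` calling convention, and an `EFL.testmode(st)` function (the exact signature of `testmode` is an assumption; check its docs):

```
using ExplicitFluxLayers, Random
const EFL = ExplicitFluxLayers

drop = EFL.Dropout(0.5)
ps, st = EFL.setup(Random.MersenneTwister(0), drop)   # assumed initialisation helper

x = ones(Float32, 4, 4)
y_train, st = drop(x, ps, st)        # roughly half the entries zeroed, the rest rescaled by 1/(1-p)

st_test = EFL.testmode(st)           # assumed signature
y_test, _ = drop(x, ps, st_test)     # identity in test mode
```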

`ExplicitFluxLayers.FlattenLayer`

— Type`FlattenLayer()`

Flattens the passed array into a matrix.

`ExplicitFluxLayers.GlobalMaxPool`

— Type`GlobalMaxPool()`

Global Max Pooling layer. Transforms (w,h,c,b)-shaped input into (1,1,c,b)-shaped output, by performing max pooling on the complete (w,h)-shaped feature maps.

See also `MaxPool`, `GlobalMeanPool`.

`ExplicitFluxLayers.GlobalMeanPool`

— Type`GlobalMeanPool()`

Global Mean Pooling layer. Transforms (w,h,c,b)-shaped input into (1,1,c,b)-shaped output, by performing mean pooling on the complete (w,h)-shaped feature maps.

See also `MeanPool`, `GlobalMaxPool`.

`ExplicitFluxLayers.GroupNorm`

— Type```
GroupNorm(chs::Integer, groups::Integer, λ=identity; initβ=zeros32, initγ=ones32,
          affine=true, track_stats=false, ϵ=1f-5, momentum=0.1f0)
```

Group Normalization layer.

**Arguments**

- `chs` is the number of channels, the channel dimension of your input. For an array of `N` dimensions, the `N-1`th index is the channel dimension.
- `groups` is the number of groups along which the statistics are computed. The number of channels must be an integer multiple of the number of groups.
- After normalisation, the elementwise activation `λ` is applied.
- If `affine=true`, it also applies a shift and a rescale to the input through learnable per-channel bias `β` and scale `γ` parameters.
- If `track_stats=true`, accumulates mean and variance statistics in the training phase that will be used to renormalize the input in the test phase.

GroupNorm doesn't have CUDNN support. The GPU fallback is not very efficient.
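A small constructor sketch: the number of channels (32) must be an integer multiple of the number of groups (8); `relu` comes from NNlib:

```
using ExplicitFluxLayers, NNlib
const EFL = ExplicitFluxLayers

gn = EFL.GroupNorm(32, 8, relu)   # 32 channels split into 8 groups of 4
```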

`ExplicitFluxLayers.MaxPool`

— Type`MaxPool(window::NTuple; pad=0, stride=window)`

**Arguments**

- Max pooling layer, which replaces all pixels in a block of size `window` with one.
- Expects as input an array with `ndims(x) == N+2`, i.e. channel and batch dimensions after the `N` feature dimensions, where `N = length(window)`.
- By default the window size is also the stride in each dimension.
- The keyword `pad` accepts the same options as for the `Conv` layer, including `SamePad()`.

See also `Conv`, `MeanPool`, `GlobalMaxPool`.
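A sketch of the default behaviour (stride equal to the window), assuming the `EFL.setup(rng, layer)` helper and the `layer(x, ps, st)` calling convention:

```
using ExplicitFluxLayers, Random
const EFL = ExplicitFluxLayers

pool = EFL.MaxPool((2, 2))
ps, st = EFL.setup(Random.MersenneTwister(0), pool)   # assumed initialisation helper

x = rand(Float32, 28, 28, 3, 4)
y, st = pool(x, ps, st)          # size(y) == (14, 14, 3, 4)
```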

`ExplicitFluxLayers.MeanPool`

— Type`MeanPool(window::NTuple; pad=0, stride=window)`

**Arguments**

- Mean pooling layer, which replaces all pixels in a block of size `window` with one.
- Expects as input an array with `ndims(x) == N+2`, i.e. channel and batch dimensions after the `N` feature dimensions, where `N = length(window)`.
- By default the window size is also the stride in each dimension.
- The keyword `pad` accepts the same options as for the `Conv` layer, including `SamePad()`.

See also `Conv`, `MaxPool`, `GlobalMeanPool`.

`ExplicitFluxLayers.NoOpLayer`

— Type`NoOpLayer()`

As the name suggests, it does nothing, but it allows pretty printing of layers.

`ExplicitFluxLayers.PairwiseFusion`

— Type`PairwiseFusion(connection, layers...)`

The layer behaves differently based on the input type:

- If the input `x` is a tuple of length `N`, then `layers` must be a tuple of length `N`. The computation is as follows:

```
y = x[1]
for i in 1:N
    y = connection(x[i], layers[i](y))
end
```

- For any other kind of input:

```
y = x
for i in 1:N
    y = connection(x, layers[i](y))
end
```

`ExplicitFluxLayers.Parallel`

— Type`Parallel(connection, layers...)`

Behaves differently based on the input type:

- If `x` is a `Tuple`, then each element is passed to the corresponding layer.
- Otherwise, `x` is passed directly to all layers.

The outputs are combined using `connection`.
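A sketch combining two branches by addition, assuming the `EFL.setup(rng, layer)` helper and the `layer(x, ps, st)` calling convention:

```
using ExplicitFluxLayers, Random
const EFL = ExplicitFluxLayers

p = EFL.Parallel(+, EFL.Dense(2 => 4), EFL.Dense(2 => 4))
ps, st = EFL.setup(Random.MersenneTwister(0), p)   # assumed initialisation helper

x = rand(Float32, 2, 8)
y, st = p(x, ps, st)        # both Dense layers see `x`; outputs summed, size(y) == (4, 8)
```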

`ExplicitFluxLayers.ReshapeLayer`

— Type`ReshapeLayer(dims)`

Reshapes the passed array to have a size of `(dims..., :)`

`ExplicitFluxLayers.Scale`

— Type`Scale(dims, σ=identity; initW=ones32, initb=zeros32, bias::Bool=true)`

Create a sparsely connected layer with a very specific structure (only the diagonal elements are non-zero). The forward pass is given by: `y = σ.(weight .* x .+ bias)`

- The input `x` should be a vector of length `dims`, or a batch of vectors represented as a `dims × N` matrix, or any array with `size(x, 1) == dims`.
- The output `y` will be a vector of length `dims`, or a batch with `size(y) == (dims, size(x)[2:end]...)`.

Keyword `bias=false` will switch off trainable bias for the layer.

The initialisation of the weight matrix is `W = initW(rng, dims)`, calling the function given to keyword `initW`, with default `ones32`.
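A sketch of the elementwise scale-and-shift, assuming the `EFL.setup(rng, layer)` helper and the `layer(x, ps, st)` calling convention:

```
using ExplicitFluxLayers, Random
const EFL = ExplicitFluxLayers

s = EFL.Scale(5)
ps, st = EFL.setup(Random.MersenneTwister(0), s)   # assumed initialisation helper

x = rand(Float32, 5, 3)
# With the default ones32/zeros32 inits and identity activation, y == x.
y, st = s(x, ps, st)
```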

`ExplicitFluxLayers.SelectDim`

— Type`SelectDim(dim, i)`

See the documentation for `selectdim` for more information.

`ExplicitFluxLayers.SkipConnection`

— Type`SkipConnection(layer, connection)`

Create a skip connection which consists of a layer or `Chain` of consecutive layers and a shortcut connection linking the block's input to the output through a user-supplied 2-argument callable. The first argument to the callable will be propagated through the given `layer` while the second is the unchanged, "skipped" input.

The simplest "ResNet"-type connection is just `SkipConnection(layer, +)`.
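A residual-block sketch, assuming the `EFL.setup(rng, layer)` helper and the `layer(x, ps, st)` calling convention; `relu` comes from NNlib:

```
using ExplicitFluxLayers, Random, NNlib
const EFL = ExplicitFluxLayers

block = EFL.SkipConnection(EFL.Dense(8 => 8, relu), +)
ps, st = EFL.setup(Random.MersenneTwister(0), block)   # assumed initialisation helper

x = rand(Float32, 8, 16)
y, st = block(x, ps, st)      # y is the Dense output added elementwise to the skipped input x
```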

`ExplicitFluxLayers.Upsample`

— Type```
Upsample(mode = :nearest; [scale, size])
Upsample(scale, mode = :nearest)
```

An upsampling layer.

**Arguments**

One of two keywords must be given:

- If `scale` is a number, this applies to all but the last two dimensions (channel and batch) of the input. It may also be a tuple, to control dimensions individually.
- Alternatively, the keyword `size` accepts a tuple, to directly specify the leading dimensions of the output.

Currently supported upsampling `mode`s and the corresponding NNlib methods are:

- `:nearest` -> `NNlib.upsample_nearest`
- `:bilinear` -> `NNlib.upsample_bilinear`
- `:trilinear` -> `NNlib.upsample_trilinear`
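A sketch of nearest-neighbour upsampling by a factor of 2, assuming the `EFL.setup(rng, layer)` helper and the `layer(x, ps, st)` calling convention:

```
using ExplicitFluxLayers, Random
const EFL = ExplicitFluxLayers

up = EFL.Upsample(:nearest; scale=2)
ps, st = EFL.setup(Random.MersenneTwister(0), up)   # assumed initialisation helper

x = rand(Float32, 8, 8, 3, 1)
y, st = up(x, ps, st)        # size(y) == (16, 16, 3, 1); channel and batch dims unchanged
```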

`ExplicitFluxLayers.VariationalHiddenDropout`

— Type`VariationalHiddenDropout(p; dims=:)`

VariationalHiddenDropout layer. The only difference from `Dropout` is that the `mask` is retained until `EFL.update_state(l, :update_mask, true)` is called.

**Arguments**

- To apply dropout along certain dimension(s), specify the `dims` keyword, e.g. `VariationalHiddenDropout(p; dims=3)` will randomly zero out entire channels on WHCN input (also called 2D dropout).
- Each execution of the layer increments the `seed` and returns it wrapped in the state.

Call `testmode` to switch to test mode.

`ExplicitFluxLayers.WeightNorm`

— Type`WeightNorm(layer::AbstractExplicitLayer, which_params::NTuple{N,Symbol}, dims::Union{Tuple,Nothing}=nothing)`

Applies weight normalization to a parameter in the given layer.

```w = g\frac{v}{\|v\|}```

Weight normalization is a reparameterization that decouples the magnitude of a weight tensor from its direction. This updates the parameters in `which_params` (e.g. `weight`) using two parameters: one specifying the magnitude (e.g. `weight_g`) and one specifying the direction (e.g. `weight_v`).

By default, a norm over the entire array is computed. Pass `dims` to modify the dimension.
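A constructor sketch that normalizes the `weight` parameter of a `Dense` layer (the parameter names follow the `weight`/`weight_g`/`weight_v` naming used above):

```
using ExplicitFluxLayers
const EFL = ExplicitFluxLayers

# Reparameterize `weight` of the wrapped Dense layer into `weight_g` (magnitude)
# and `weight_v` (direction); the norm is taken over the whole array by default.
wn = EFL.WeightNorm(EFL.Dense(3 => 5), (:weight,))
```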

`ExplicitFluxLayers.WrappedFunction`

— Type`WrappedFunction(f)`

Wraps a stateless and parameterless function. It might be used when a function is added to a `Chain`. For example, `Chain(x -> relu.(x))` would not work; the right thing to do would be `Chain((x, ps, st) -> (relu.(x), st))`. An easier option is `Chain(WrappedFunction(Base.Fix1(broadcast, relu)))`.