# API

## Transforms

### Abstract Transform Types

`FeatureTransforms.Transform` - Type

`Transform`

Abstract supertype for all feature Transforms.

`FeatureTransforms.AbstractScaling` - Type

`AbstractScaling <: Transform`

Linearly scale the data according to some statistics.

### Implemented Transforms

`FeatureTransforms.HoD` - Type

`HoD <: Transform`

Get the hour of day corresponding to the data.

`FeatureTransforms.Power` - Type

`Power(exponent) <: Transform`

Raise the data to the power of the given `exponent`.

`FeatureTransforms.Periodic` - Type

`Periodic{P, S}(f, period::P, [phase_shift::S]) <: Transform`

Applies a periodic function `f` with the provided `period` and `phase_shift` to the data.

The `period` and `phase_shift` must share the same supertype, either `Real` or `Period`, depending on whether the data is `Real` or `TimeType` respectively.

For `TimeType` data, the result depends on the type of `period` given, even if the same amount of time is described. Example: `Week(1)` vs `Second(Week(1))`; the former starts the period on the most recent Monday, while the latter starts the period on the most recent multiple of 604800 seconds since time 0.

**Fields**

- `f::Union{typeof(cos), typeof(sin)}`: the periodic function.
- `period::Union{Real, Period}`: the function period. Must be strictly positive.
- `phase_shift::Union{Real, Period}` (optional): adjusts the phase of the periodic function, measured in the same units as the input. Increasing the value translates the function to the right, toward higher/later input values.
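The behaviour described for `Periodic` can be sketched in plain Julia. This is an illustration, not the package's implementation: `periodic_sketch` is a hypothetical name, the real-valued case is assumed to evaluate `f(2π * (x - phase_shift) / period)`, and `floor` from the `Dates` standard library is used as an assumed stand-in for how the start of a period is located.

```julia
using Dates

# Hypothetical helper (not part of FeatureTransforms): periodic feature for real inputs.
# Assumes the form f(2π * (x - phase_shift) / period).
function periodic_sketch(f, x::Real, period::Real, phase_shift::Real=0)
    period > 0 || throw(DomainError(period, "period must be strictly positive"))
    return f(2π * (x - phase_shift) / period)
end

# A positive phase shift moves the function to the right:
# with a shift of 6 the sine starts its cycle at x = 6 rather than x = 0.
shifted = periodic_sketch(sin, 6.0, 24.0, 6.0)

# Two descriptions of "one week" that start their periods at different instants.
dt = DateTime(2020, 1, 8, 12)                 # a Wednesday
week_start = floor(dt, Week(1))               # most recent Monday
second_start = floor(dt, Second(Week(1)))     # most recent multiple of 604800 seconds
```

Comparing `week_start` and `second_start` for the same `dt` shows why the type of `period` matters for `TimeType` data.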

`FeatureTransforms.StandardScaling` - Type

`StandardScaling <: AbstractScaling`

Transforms the data according to

`x -> (x - μ) / σ`

where μ and σ are the mean and standard deviation of the training data.

`fit!(scaling, data)` needs to be called before the transform can be `apply`ed. By default *all the data* is considered when `fit!`ing the mean and standard deviation.
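The fit-then-apply pattern can be sketched with the `Statistics` standard library alone. The names `FittedScaling`, `fit_sketch` and `apply_sketch` are hypothetical and only mimic the idea; the use of Julia's default (corrected) `std` is also an assumption rather than something the description states.

```julia
using Statistics

# Hypothetical stand-in for a fitted scaling: it only stores the training statistics.
struct FittedScaling
    μ::Float64
    σ::Float64
end

# "Fit" step: mean and standard deviation computed over all of the training data.
fit_sketch(data) = FittedScaling(mean(data), std(data))

# "Apply" step: x -> (x - μ) / σ, element-wise.
apply_sketch(data, s::FittedScaling) = (data .- s.μ) ./ s.σ

train = [1.0, 2.0, 3.0, 4.0, 5.0]
scaling = fit_sketch(train)
scaled = apply_sketch(train, scaling)
```

Keeping the statistics in a separate fitted object is what lets the same μ and σ be reused on data that was not part of the training set.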

`FeatureTransforms.IdentityScaling` - Type

`IdentityScaling <: AbstractScaling`

Represents the no-op scaling, which simply returns the `data` it is applied to.

`FeatureTransforms.InverseHyperbolicSine` - Type

`InverseHyperbolicSine <: Transform`

Logarithmically transform the data through: log(x + √(x² + 1)).

This is the inverse hyperbolic sine.
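As a quick check of the formula, the expression can be written out by hand and compared with `asinh` from Base Julia. `ihs_sketch` is a hypothetical name used only for this illustration.

```julia
# The formula from the description, written out by hand.
ihs_sketch(x::Real) = log(x + sqrt(x^2 + 1))

xs = [-2.0, 0.0, 0.5, 10.0]
by_formula = ihs_sketch.(xs)
by_builtin = asinh.(xs)      # Base Julia's inverse hyperbolic sine
```

The two vectors agree up to floating-point rounding, including for negative inputs.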

`FeatureTransforms.LinearCombination` - Type

`LinearCombination(coefficients) <: Transform`

Calculates the linear combination of a collection of terms weighted by some `coefficients`.

When applied to an N-dimensional array, `LinearCombination` reduces along the `dims` provided and returns an (N-1)-dimensional array.

If no `inds` are specified, then the transform is applied to all elements.

!!! note
    The current default is `dims=1`, but this behaviour will be deprecated in a future release and the `dims` keyword argument will have to be specified explicitly. See https://github.com/invenia/FeatureTransforms.jl/issues/82
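The reduction along one dimension can be sketched in Base Julia as follows. `linear_combination_sketch` is a hypothetical helper written for this illustration, not the package's code, and it always requires `dims` explicitly.

```julia
# Hypothetical helper: weight the slices along `dims` by `coefficients` and sum them.
function linear_combination_sketch(A::AbstractArray, coefficients::AbstractVector; dims::Int)
    size(A, dims) == length(coefficients) ||
        throw(DimensionMismatch("need one coefficient per slice along dims"))
    # Reshape the coefficients so that they broadcast along `dims` only.
    shape = ntuple(i -> i == dims ? length(coefficients) : 1, ndims(A))
    weighted = A .* reshape(coefficients, shape)
    # Summing and dropping the reduced dimension yields an (N-1)-dimensional array.
    return dropdims(sum(weighted; dims=dims); dims=dims)
end

A = [1 2 3;
     4 5 6]
by_rows = linear_combination_sketch(A, [1, -1]; dims=1)     # row 1 minus row 2
by_cols = linear_combination_sketch(A, [1, 0, 2]; dims=2)   # column 1 plus twice column 3
```

Here a 2-dimensional input produces a 1-dimensional result in both calls, with the length determined by the dimension that was not reduced.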

`FeatureTransforms.LogTransform` - Type

`LogTransform <: Transform`

Logarithmically transform the data through: sign(x) * log(|x| + 1).

This allows transformations of all real numbers, not just positive ones.
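The formula is short enough to try directly in Base Julia. `logtransform_sketch` is a hypothetical name for this illustration.

```julia
# sign(x) * log(|x| + 1): defined for every real x and symmetric about zero.
logtransform_sketch(x::Real) = sign(x) * log(abs(x) + 1)

xs = [-9.0, -1.0, 0.0, 1.0, 9.0]
ys = logtransform_sketch.(xs)
```

Negative inputs map to the negated value of their positive counterparts, and zero maps to zero, so no input needs to be filtered out beforehand.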

`FeatureTransforms.OneHotEncoding` - Type

`OneHotEncoding{R<:Real} <: Transform`

One-hot encode the categorical value for each target element.

Construct an n-by-p binary matrix, given a `Vector` of target data `x` (of length n) and a `Vector` of all unique possible values in `x` (of length p).

The element [i, j] is `true` if the i-th target in `x` corresponds to the j-th possible value and `false` otherwise. Note that `R` can be specified to determine the element type of the result. The default is a `Matrix` of `Bool`s.

Note that this Transform does not support specifying dims other than `:` (all dims) because it is a one-to-many transform (for example a `Vector` input produces a `Matrix` output).

Note that `OneHotEncoding` must first be given the expected categories before it can be used. This is because the data itself might be missing certain categories, which would lead to an incomplete encoding.
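A minimal sketch of the encoding in Base Julia, assuming the categories are supplied separately from the data as described. `one_hot_sketch` is a hypothetical helper and not the package's API.

```julia
# Hypothetical helper: n-by-p matrix whose [i, j] entry records whether x[i] equals categories[j].
# `R` selects the element type of the result and defaults to Bool.
function one_hot_sketch(x::AbstractVector, categories::AbstractVector, ::Type{R}=Bool) where {R<:Real}
    return R[xi == c for xi in x, c in categories]
end

categories = ["a", "b", "c"]     # all possible values, fixed up front
x = ["b", "a", "b"]              # this data happens to contain no "c"
encoded = one_hot_sketch(x, categories)
as_float = one_hot_sketch(x, categories, Float64)
```

Because the categories are fixed up front, the column for `"c"` still exists even though `x` never contains it, which is the situation the last note above guards against.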

## Applying Transforms

`FeatureTransforms.apply` - Function

`apply(data::T, ::Transform; kwargs...)`

Applies the `Transform` to the data. New transforms should usually only extend `_apply`, which this method delegates to.

Where necessary, this should be extended for new data types `T`.

`FeatureTransforms.apply!` - Function

`FeatureTransforms.apply_append` - Function

## Transform Interface

`FeatureTransforms.is_transformable` - Function

`FeatureTransforms.transform!` - Function

`transform!(::T, data)`

Mutating version of `transform`.

`FeatureTransforms.transform` - Function

`transform(::T, data)`

Defines the feature engineering pipeline for some type `T`, which comprises a collection of `Transform`s and other steps to be performed on the `data`.

The idea behind a "transform interface" is to make feature transformations composable, i.e. the output of any one `Transform` should be valid input to another.

Feature engineering pipelines should obey the same principle, and it should be trivial to add or remove `Transform` steps that compose the pipeline without breaking it.

`transform` should be overloaded for custom types `T` that require feature engineering. The only requirement is that the return value of `transform` is itself "transformable", i.e. calling `is_transformable` on the output returns true.

## Deprecated functionality

`FeatureTransforms.MeanStdScaling` - Type

`MeanStdScaling(μ, σ) <: AbstractScaling`

Linearly scale the data by the statistical mean `μ` and standard deviation `σ`. This is also known as standardization, or the Z score transform.

**Keyword arguments to apply**

- `inverse=true`: inverts the scaling (e.g. to reconstruct the unscaled data).
- `eps=1e-3`: used in place of all 0 values in `σ` before scaling (if `inverse=false`).
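How the two keyword arguments interact can be sketched in Base Julia. `mean_std_sketch` is a hypothetical scalar helper, and the inverse is assumed to be `μ + σ * x`, which is the algebraic inverse of `(x - μ) / σ` rather than something stated above.

```julia
# Hypothetical helper mirroring the description: scale by μ and σ, or undo the scaling.
function mean_std_sketch(x::Real, μ::Real, σ::Real; inverse::Bool=false, eps::Real=1e-3)
    inverse && return μ + σ * x          # reconstruct the unscaled value
    σ_safe = σ == 0 ? eps : σ            # eps replaces a zero σ, forward direction only
    return (x - μ) / σ_safe
end

z = mean_std_sketch(7.0, 5.0, 2.0)                   # (7 - 5) / 2
back = mean_std_sketch(z, 5.0, 2.0; inverse=true)    # 5 + 2 * z
constant = mean_std_sketch(5.0, 5.0, 0.0)            # σ == 0, so eps is used instead
```

Replacing a zero `σ` with `eps` keeps a constant feature from producing a division by zero in the forward direction.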