Lathe.preprocess
— Module
|====== Lathe.preprocess =====
|____________/ Generalized Processing ___________
|_____preprocess.TrainTestSplit
|_____preprocess.SortSplit
|_____preprocess.UniformSplit
|____________/ Feature Scaling ___________
|_____preprocess.Rescaler
|_____preprocess.ArbitraryRescaler
|_____preprocess.MeanScaler
|_____preprocess.StandardScaler
|____________/ Categorical Encoding ___________
|_____preprocess.OneHotEncoder
|_____preprocess.OrdinalEncoder
|_____preprocess.FloatEncoder
Lathe.preprocess.ArbitraryRescaler
— Type
Arbitrary Rescaler
Description
Arbitrarily rescales an array.
Input
ArbitraryRescaler(x)
Positional Arguments
Array{Any} - x:: Array on which the scaler is based.
Output
scaler :: A Lathe Preprocesser object.
Functions
Preprocesser.predict(xt) :: Applies the scaler to xt.
Data
a :: The minimum value in the array.
b :: The maximum value in the array.
Lathe.preprocess.FloatEncoder
— Type
Float Encoder
Description
Float/Label Encodes an array.
Input
FloatEncoder()
Output
encoder :: A Lathe Preprocesser object.
Functions
Preprocesser.predict(xt) :: Returns a float/label encoded xt.
Lathe.preprocess.MeanScaler
— Type
Mean Normalizer
Description
Normalizes an array using the mean of the data.
Input
MeanScaler(x)
Positional Arguments
Array{Any} - x:: Array on which the scaler is based.
Output
scaler :: A Lathe Preprocesser object.
Functions
Preprocesser.predict(xt) :: Applies the scaler to xt.
Data
a :: The minimum value in the array.
b :: The maximum value in the array.
avg :: The mean of the array.
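As a rough illustration of what the stored a, b, and avg values are for, the sketch below implements conventional mean normalization, (x - avg) / (b - a), in plain Julia. It assumes this is the formula intended here; mean_normalizer is a hypothetical helper and not part of Lathe.

```julia
using Statistics

# Hypothetical helper, not part of Lathe: fits on x and returns a closure
# that applies conventional mean normalization to new data.
function mean_normalizer(x::Vector{<:Real})
    a, b = extrema(x)   # minimum and maximum of the fitted array
    avg = mean(x)       # mean of the fitted array
    b == a && throw(ArgumentError("x must contain at least two distinct values"))
    return xt -> (xt .- avg) ./ (b - a)
end

normalize = mean_normalizer([5, 10, 15, 20])
normalized = normalize([5, 10, 15, 20])
```

With this formula the output is centered on zero and spans a total width of one.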
Lathe.preprocess.OneHotEncoder
— Type
OneHotEncoder
Description
One Hot Encodes a dataframe column into a dataframe.
Input
OneHotEncoder()
Output
encoder :: A Lathe Preprocesser object.
Functions
Preprocesser.predict(df, symb) :: Applies the encoder to the column of df named by symb, then returns a dataframe with the encoded results.
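To show what one-hot encoding produces, here is a minimal sketch in plain Julia. It does not use Lathe or DataFrames: onehot_columns is a hypothetical helper, and it returns a Dict of indicator columns rather than a dataframe, purely to stay dependency-free.

```julia
# Hypothetical helper, not part of Lathe: maps each distinct label in the
# column to a 0/1 indicator vector of the same length as the column.
function onehot_columns(column::Vector)
    return Dict(label => [Int(v == label) for v in column]
                for label in unique(column))
end

encoded = onehot_columns(["hello", "world", "hello"])
```

Each key of encoded corresponds to one new column of an encoded dataframe.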
Lathe.preprocess.OrdinalEncoder
— Type
Ordinal Encoder
Description
Ordinally Encodes an array.
Input
OrdinalEncoder(x)
Positional Arguments
Array{Any} - x:: Array on which the encoder is based.
Output
encoder :: A Lathe Preprocesser object.
Functions
Preprocesser.predict(xt) :: Returns an ordinally encoded xt.
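The idea of fitting on x and then encoding xt can be sketched in plain Julia as follows. This is an assumption about the general technique, not Lathe's implementation: ordinal_encoder is a hypothetical helper that numbers labels in order of first appearance.

```julia
# Hypothetical helper, not part of Lathe: assigns each distinct label in x
# an integer code (in order of first appearance) and returns an encoder.
function ordinal_encoder(x::Vector)
    lookup = Dict(label => i for (i, label) in enumerate(unique(x)))
    # Labels that were not present in x raise a KeyError.
    return xt -> [lookup[v] for v in xt]
end

encode = ordinal_encoder(["low", "medium", "high"])
codes = encode(["high", "low", "medium", "low"])
```

Fitting on one array and encoding another keeps the label-to-integer mapping consistent between training and test data.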
Lathe.preprocess.Rescaler
— Type
Rescaler
Description
Rescales an array.
Input
Rescaler(x)
Positional Arguments
Array{Any} - x:: Array on which the scaler is based.
Output
scaler :: A Lathe Preprocesser object.
Functions
Preprocesser.predict(xt) :: Applies the scaler to xt.
Data
min :: The minimum value in the array.
max :: The maximum value in the array.
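For orientation, the sketch below shows conventional min-max rescaling, (x - min) / (max - min), in plain Julia. It assumes that is what the stored min and max are used for; minmax_rescaler is a hypothetical helper and not part of Lathe.

```julia
# Hypothetical helper, not part of Lathe: fits on x and returns a closure
# that maps values into [0, 1] relative to the fitted minimum and maximum.
function minmax_rescaler(x::Vector{<:Real})
    lo, hi = extrema(x)
    hi == lo && throw(ArgumentError("x must contain at least two distinct values"))
    return xt -> (xt .- lo) ./ (hi - lo)
end

rescale = minmax_rescaler([5, 10, 15, 20])
rescaled = rescale([5, 10, 15, 20])
```

Values outside the fitted range map outside [0, 1], which matters when the closure is applied to unseen test data.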
Lathe.preprocess.StandardScaler
— Type
Standard Scaler
Description
Normalizes an array using the z (Normal) distribution.
Input
StandardScaler(x)
Positional Arguments
Array{Any} - x:: Array on which the scaler is based.
Output
scaler :: A Lathe Preprocesser object.
Functions
Preprocesser.predict(xt) :: Applies the scaler to xt.
Data
dist :: The normal distribution object that this scaler uses.
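A minimal sketch of z-score standardization, (x - mean) / std, is given below in plain Julia. It stores the mean and standard deviation directly instead of a distribution object, and standard_scaler is a hypothetical helper, not part of Lathe. Note that Julia's std computes the sample standard deviation by default; whether Lathe does the same is not stated here.

```julia
using Statistics

# Hypothetical helper, not part of Lathe: fits on x and returns a closure
# that converts values to z-scores relative to the fitted mean and std.
function standard_scaler(x::Vector{<:Real})
    mu = mean(x)
    sigma = std(x)   # sample standard deviation (corrected = true)
    sigma == 0 && throw(ArgumentError("x must not be constant"))
    return xt -> (xt .- mu) ./ sigma
end

standardize = standard_scaler([5, 10, 15, 20])
z = standardize([5, 10, 15, 20])
```

Applied to the data it was fitted on, the result has mean 0 and sample standard deviation 1.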
Lathe.preprocess.SortSplit
— Function
Sort Split
Description
Sorts an array, and then splits said array.
Input
SortSplit(x, .75, false)
Positional Arguments
Array{Any} - data:: The data to split.
Float64 - at:: A percentage that determines where the data is split.
Bool - rev:: Determines whether the order of the sort should be reversed.
Output
train:: The larger portion of the split set.
test:: The smaller portion of the split set.
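The sort-then-split behavior described above can be sketched in plain Julia as follows. sort_split is a hypothetical helper, not part of Lathe, and rounding the split index to the nearest integer is an assumption.

```julia
# Hypothetical helper, not part of Lathe: sorts the data, then cuts it at
# the fraction `at`, returning the leading and trailing portions.
function sort_split(data::Vector, at::Real = 0.75, rev::Bool = false)
    sorted = sort(data; rev = rev)
    cut = round(Int, length(sorted) * at)   # assumed rounding rule
    return sorted[1:cut], sorted[cut+1:end]
end

train, test = sort_split([20, 5, 15, 10], 0.75, false)
```

Because the data is sorted first, the two portions cover disjoint value ranges, which is rarely what you want for model evaluation but can be useful for range-based partitioning.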
Lathe.preprocess.TrainTestSplit
— Function
TrainTestSplit
Description
Splits an array or dataframe into two smaller groups based on the percentage provided in the at parameter.
Input
TrainTestSplit(x, .75)
Positional Arguments
Array{Any}, DataFrame - data:: The data to split.
Float64 - at:: A percentage that determines where the data is split.
Output
train:: The larger portion of the split set.
test:: The smaller portion of the split set.
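A percentage-based split can be sketched in plain Julia as shown below. The sketch assumes the data is shuffled before splitting, as the contrast with UniformSplit suggests; shuffled_split and its rng keyword are hypothetical and not part of Lathe.

```julia
using Random

# Hypothetical helper, not part of Lathe: shuffles the indices, then cuts
# them at the fraction `at` to form the train and test portions.
function shuffled_split(data::Vector, at::Real = 0.75;
                        rng::AbstractRNG = Random.default_rng())
    idx = randperm(rng, length(data))
    cut = round(Int, length(data) * at)   # assumed rounding rule
    return data[idx[1:cut]], data[idx[cut+1:end]]
end

# A seeded generator makes the split reproducible.
train, test = shuffled_split(collect(1:8), 0.75; rng = MersenneTwister(42))
```

Every element ends up in exactly one of the two portions, so the split loses no data.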
Lathe.preprocess.UniformSplit
— Function
Uniform Split
Description
Uniform Split will split an array without shuffling the data first.
Input
UniformSplit(x, .75)
Positional Arguments
Array{Any} - data:: The data to split.
Float64 - at:: A percentage that determines where the data is split.
Output
train:: The larger portion of the split set.
test:: The smaller portion of the split set.
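Splitting without shuffling amounts to cutting the array at a fixed index, as in this plain-Julia sketch. uniform_split is a hypothetical helper, not part of Lathe, and the rounding of the split index is an assumption.

```julia
# Hypothetical helper, not part of Lathe: keeps the original order and
# cuts the array at the fraction `at`.
function uniform_split(data::Vector, at::Real = 0.75)
    cut = round(Int, length(data) * at)   # assumed rounding rule
    return data[1:cut], data[cut+1:end]
end

train, test = uniform_split([5, 10, 15, 20], 0.75)
```

Preserving order is the usual choice for time-ordered data, where shuffling would leak future observations into the training portion.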
Lathe.preprocess.@norm
— Macro
Returns the normal distribution of an array.
x = [5,10,15,20]
norm = @norm x
Lathe.preprocess.@onehot
— Macro
OneHotEncodes a dataframe
Takes a dataframe and a symbol naming the column to one-hot encode.
using DataFrames
df = DataFrame(:A => ["hello","world"], :B => ["Foo", "Bar"])
encoded = @onehot df, :A
Lathe.preprocess.@tts
— Macro
TrainTestSplits an Array
x = [5,10,15,20]
train, test = @tts x