The BetaML.Utils Module
BetaML.Utils
— Module
Utils module
Provides shared utility functions for various machine learning algorithms. You don't usually need to import from this module, as each of the other modules (Nn, Perceptron, Clusters, ...) re-exports it.
Module Index
BetaML.Utils.accuracy
BetaML.Utils.aic
BetaML.Utils.autoJacobian
BetaML.Utils.batch
BetaML.Utils.bic
BetaML.Utils.celu
BetaML.Utils.classCounts
BetaML.Utils.cosine_distance
BetaML.Utils.crossEntropy
BetaML.Utils.dcelu
BetaML.Utils.delu
BetaML.Utils.dmish
BetaML.Utils.dplu
BetaML.Utils.drelu
BetaML.Utils.dsigmoid
BetaML.Utils.dsoftmax
BetaML.Utils.dsoftplus
BetaML.Utils.dtanh
BetaML.Utils.elu
BetaML.Utils.entropy
BetaML.Utils.getScaleFactors
BetaML.Utils.giniImpurity
BetaML.Utils.l1_distance
BetaML.Utils.l2_distance
BetaML.Utils.l2²_distance
BetaML.Utils.lse
BetaML.Utils.makeMatrix
BetaML.Utils.meanDicts
BetaML.Utils.meanRelError
BetaML.Utils.mish
BetaML.Utils.oneHotEncoder
BetaML.Utils.pca
BetaML.Utils.plu
BetaML.Utils.polynomialKernel
BetaML.Utils.radialKernel
BetaML.Utils.relu
BetaML.Utils.scale
BetaML.Utils.sigmoid
BetaML.Utils.softmax
BetaML.Utils.softplus
BetaML.Utils.squaredCost
BetaML.Utils.sterling
Detailed API
Base.error
— Method
error(ŷ,y) - Categorical error (Int vs Int)
Base.error
— Method
error(ŷ,y) - Categorical error with probabilistic prediction of a single datapoint (PMF vs Int).
Base.error
— Method
error(ŷ,y) - Categorical error with probabilistic predictions of a dataset (PMF vs Int).
Base.reshape
— Method
reshape(myNumber, dims..) - Reshape a number as an n-dimensional Array
BetaML.Utils.accuracy
— Method
accuracy(ŷ,y) - Categorical accuracy (Int vs Int)
BetaML.Utils.accuracy
— Method
accuracy(ŷ,y;tol)
Categorical accuracy with probabilistic predictions of a dataset given in terms of a dictionary of probabilities (Dict{T,Float64} vs T).
Parameters:
- ŷ: An array where each item is the estimated probability mass function in terms of a Dictionary(Item1 => Prob1, Item2 => Prob2, ...)
- y: The N array with the correct category for each point $n$.
- tol: The tolerance of the prediction, i.e. whether to consider "correct" only a prediction where the value with the highest probability is the true value (tol = 1), or to consider instead the set of tol maximum values [def: 1].
BetaML.Utils.accuracy
— Method
accuracy(ŷ,y;tol)
Categorical accuracy with probabilistic prediction of a single datapoint (PMF vs Int).
Use the parameter tol [def: 1] to determine the tolerance of the prediction, i.e. whether to consider "correct" only a prediction where the value with the highest probability is the true value (tol = 1), or to consider instead the set of tol maximum values.
BetaML.Utils.accuracy
— Method
accuracy(ŷ,y;tol,ignoreLabels)
Categorical accuracy with probabilistic predictions of a dataset (PMF vs Int).
Parameters:
- ŷ: An (N,K) matrix of probabilities that each record $\hat y_n$, with $n \in 1,...,N$, is of category $k$, with $k \in 1,...,K$.
- y: The N array with the correct category for each point $n$.
- tol: The tolerance of the prediction, i.e. whether to consider "correct" only a prediction where the value with the highest probability is the true value (tol = 1), or to consider instead the set of tol maximum values [def: 1].
- ignoreLabels: Whether to ignore the specific label order in y. Useful for unsupervised learning algorithms where the specific label order doesn't make sense [def: false]
BetaML.Utils.accuracy
— Method
accuracy(ŷ,y;tol)
Categorical accuracy with probabilistic prediction of a single datapoint given in terms of a dictionary of probabilities (Dict{T,Float64} vs T).
Parameters:
- ŷ: The returned probability mass function in terms of a Dictionary(Item1 => Prob1, Item2 => Prob2, ...)
- tol: The tolerance of the prediction, i.e. whether to consider "correct" only a prediction where the value with the highest probability is the true value (tol = 1), or to consider instead the set of tol maximum values [def: 1].
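To illustrate the role of tol in the accuracy methods above, here is a minimal self-contained sketch for the (N,K) matrix case. It is not BetaML's implementation: the name topk_accuracy is hypothetical and the ignoreLabels option is not covered.

```julia
# Hypothetical helper, not part of BetaML: share of records whose true
# category is among the `tol` most probable predicted categories.
function topk_accuracy(ŷ::AbstractMatrix, y::AbstractVector{<:Integer}; tol::Integer=1)
    N = size(ŷ, 1)
    correct = 0
    for n in 1:N
        # Category indices of record n, sorted by decreasing probability
        ranked = sortperm(ŷ[n, :], rev=true)
        correct += y[n] in ranked[1:tol]
    end
    return correct / N
end

ŷ = [0.7 0.2 0.1;
     0.2 0.5 0.3;
     0.1 0.3 0.6]
y = [1, 3, 2]

acc1 = topk_accuracy(ŷ, y)          # only the first record is right: 1/3
acc2 = topk_accuracy(ŷ, y, tol=2)   # all true values are in the top 2: 1.0
```

With tol = 2 a record counts as correct when the true category is either the most or the second most probable one, which is why the score rises.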
BetaML.Utils.aic
— Method
aic(lL,k) - Akaike information criterion (lower is better)
BetaML.Utils.autoJacobian
— Method
autoJacobian(f,x;nY)
Evaluate the Jacobian using AD in the form of a (nY,nX) matrix of first derivatives
Parameters:
- f: The function whose Jacobian is computed
- x: The input to the function where the Jacobian has to be computed
- nY: The number of outputs of the function f [def: length(f(x))]
Return values:
- An Array{Float64,2} of the locally evaluated Jacobian
Notes:
- The nY parameter is optional. If provided, it avoids having to compute f(x)
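To make the (nY,nX) layout concrete, the sketch below approximates a Jacobian with forward finite differences. This is a different technique from the automatic differentiation that autoJacobian uses, chosen only so that the example needs no extra packages. The name fd_jacobian and the step size h are assumptions.

```julia
# Hypothetical helper: forward-difference approximation of the (nY,nX) Jacobian.
# Row i, column j holds the derivative of output i with respect to input j.
function fd_jacobian(f, x::AbstractVector{<:Real}; nY::Integer=length(f(x)), h::Real=1e-6)
    nX = length(x)
    fx = f(x)
    J  = zeros(nY, nX)
    for j in 1:nX
        xh = float.(x)      # fresh floating-point copy of x
        xh[j] += h          # perturb only the j-th input
        J[:, j] = (f(xh) .- fx) ./ h
    end
    return J
end

f(x) = [x[1]^2 + x[2], 3 * x[1] * x[2]]
J = fd_jacobian(f, [1.0, 2.0])   # close to [2 1; 6 3]
```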
BetaML.Utils.batch
— Method
batch(n,bSize;sequential=false)
Return a vector of vectors of indices from 1 to n, each of length bSize. The indices are assigned randomly unless the optional parameter sequential is set to true.
Example:
julia> Utils.batch(6,2,sequential=true)
3-element Array{Array{Int64,1},1}:
 [1, 2]
 [3, 4]
 [5, 6]
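The behaviour shown in the example can be reproduced with a few lines of standard-library code. This is a sketch under assumptions, not BetaML's code: the name batch_sketch is hypothetical, and BetaML may treat a final incomplete batch differently.

```julia
using Random

# Hypothetical re-implementation for illustration only.
function batch_sketch(n::Integer, bSize::Integer; sequential::Bool=false)
    idx = sequential ? collect(1:n) : shuffle(1:n)
    # Split the indices into consecutive chunks of (at most) bSize elements
    return [collect(chunk) for chunk in Iterators.partition(idx, bSize)]
end

seqBatches  = batch_sketch(6, 2, sequential=true)   # [[1, 2], [3, 4], [5, 6]]
randBatches = batch_sketch(10, 3)                   # random order, last batch shorter
```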
BetaML.Utils.bic
— Method
bic(lL,k,n) - Bayesian information criterion (lower is better)
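For reference, the textbook definitions of the two information criteria (aic, listed earlier, and bic) are sketched below. The names aic_sketch and bic_sketch are hypothetical, and it is assumed rather than verified that BetaML follows these exact formulas.

```julia
# lL: log-likelihood, k: number of free parameters, n: number of observations
aic_sketch(lL, k)    = 2 * k - 2 * lL
bic_sketch(lL, k, n) = k * log(n) - 2 * lL

a = aic_sketch(-120.0, 5)        # 2*5 + 240 = 250.0
b = bic_sketch(-120.0, 5, 100)   # 5*log(100) + 240
```

Both criteria penalise the number of parameters; the BIC penalty grows with the sample size, so it favours smaller models on large datasets.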
BetaML.Utils.celu
— Method
celu(x; α=1)
https://arxiv.org/pdf/1704.07483.pdf
BetaML.Utils.classCounts
— Method
classCounts(x)
Return a dictionary that counts the number of each unique item (rows) in a dataset.
BetaML.Utils.cosine_distance
— Method
Cosine distance
BetaML.Utils.crossEntropy
— Method
crossEntropy(ŷ, y; weight)
Compute the (weighted) cross-entropy between the predicted and the sampled probability distributions.
To be used in classification problems.
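A minimal sketch of a weighted cross-entropy follows. The function name and the exact way the weights enter the sum are assumptions made for illustration, not taken from BetaML's source.

```julia
# Hypothetical sketch: -Σ_k w_k * y_k * log(ŷ_k)
cross_entropy_sketch(ŷ, y; weight=ones(length(y))) = -sum(weight .* y .* log.(ŷ))

y  = [0.0, 1.0, 0.0]             # one-hot "sampled" distribution
ŷ  = [0.1, 0.8, 0.1]             # predicted distribution
ce = cross_entropy_sketch(ŷ, y)  # equals -log(0.8)
```

With a one-hot y the cost reduces to the negative log of the probability assigned to the true class, so confident correct predictions are cheap and confident wrong ones are expensive.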
BetaML.Utils.dcelu
— Method
dcelu(x; α=1)
https://arxiv.org/pdf/1704.07483.pdf
BetaML.Utils.delu
— Method
delu(x; α=1) with α > 0
https://arxiv.org/pdf/1511.07289.pdf
BetaML.Utils.dmish
— Method
dmish(x)
https://arxiv.org/pdf/1908.08681v1.pdf
BetaML.Utils.dplu
— Method
dplu(x;α=0.1,c=1)
Piecewise Linear Unit derivative
https://arxiv.org/pdf/1809.09534.pdf
BetaML.Utils.drelu
— Method
drelu(x)
Derivative of the Rectified Linear Unit
https://www.cs.toronto.edu/~hinton/absps/reluICML.pdf
BetaML.Utils.dsigmoid
— Method
dsigmoid(x)
BetaML.Utils.dsoftmax
— Method
dsoftmax(x; β=1)
Derivative of the softmax function
https://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative/
BetaML.Utils.dsoftplus
— Method
dsoftplus(x)
https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#Softplus
BetaML.Utils.dtanh
— Method
dtanh(x)
BetaML.Utils.elu
— Method
elu(x; α=1) with α > 0
https://arxiv.org/pdf/1511.07289.pdf
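As an illustration of how each activation function is paired with its derivative (the d-prefixed functions above), here is the ELU written out from its standard definition. The names elu_sketch and delu_sketch are hypothetical, not BetaML's code.

```julia
# ELU: identity for positive inputs, α(exp(x) - 1) otherwise
elu_sketch(x; α=1)  = x > 0 ? x : α * (exp(x) - 1)
# Its derivative: 1 for positive inputs, α*exp(x) otherwise
delu_sketch(x; α=1) = x > 0 ? one(x) : α * exp(x)

elu_sketch(2.0)     # 2.0
elu_sketch(-1.0)    # exp(-1) - 1
delu_sketch(-1.0)   # exp(-1)
```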
BetaML.Utils.entropy
— Method
entropy(x)
Calculate the entropy for a list of items (or rows).
See: https://en.wikipedia.org/wiki/Decision_tree_learning#Information_gain
BetaML.Utils.getScaleFactors
— Method
getScaleFactors(x;skip)
Return the scale factors (for each dimension) needed to scale a matrix X (n,d) such that each dimension has mean 0 and variance 1.
Parameters
- x: The (n × d) dimension matrix to scale on each dimension d
- skip: An array of dimension indices to skip in the scaling [def: []]
Return
- A tuple whose first element is the shift and the second the multiplicative term to make the scale.
BetaML.Utils.giniImpurity
— Method
giniImpurity(x)
Calculate the Gini Impurity for a list of items (or rows).
See: https://en.wikipedia.org/wiki/Decision_tree_learning#Gini_impurity
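Both the Gini impurity and the entropy function listed earlier depend only on the relative frequency of each distinct item. The sketch below uses hypothetical helper names; the base-2 logarithm for the entropy is an assumption.

```julia
# Hypothetical helpers (not BetaML's code).
# Relative frequency of each distinct item in x
function freqs(x)
    counts = Dict{eltype(x),Int}()
    for item in x
        counts[item] = get(counts, item, 0) + 1
    end
    return [c / length(x) for c in values(counts)]
end

gini_sketch(x)    = 1 - sum(p^2 for p in freqs(x))
entropy_sketch(x) = -sum(p * log2(p) for p in freqs(x))   # base 2 is an assumption

labels = ["a", "a", "b", "b"]
g = gini_sketch(labels)      # 0.5
e = entropy_sketch(labels)   # 1.0
```

Both measures are zero for a list with a single distinct item and largest when all classes are equally frequent.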
BetaML.Utils.l1_distance
— Method
L1 norm distance (aka Manhattan Distance)
BetaML.Utils.l2_distance
— Method
Euclidean (L2) distance
BetaML.Utils.l2²_distance
— Method
Squared Euclidean (L2) distance
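The distance functions (including the cosine distance listed earlier) can be sketched with the usual definitions. The names below are hypothetical, and defining the cosine distance as one minus the cosine similarity is an assumption about BetaML's convention.

```julia
using LinearAlgebra

l1_sketch(x, y)     = sum(abs.(x .- y))                       # Manhattan
l2_sketch(x, y)     = norm(x .- y)                            # Euclidean
l2sq_sketch(x, y)   = sum(abs2, x .- y)                       # squared Euclidean
cosine_sketch(x, y) = 1 - dot(x, y) / (norm(x) * norm(y))     # 1 - cosine similarity

x, y = [1.0, 0.0], [0.0, 1.0]
l1_sketch(x, y)       # 2.0
l2_sketch(x, y)       # sqrt(2)
l2sq_sketch(x, y)     # 2.0
cosine_sketch(x, y)   # 1.0 (orthogonal vectors)
```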
BetaML.Utils.lse
— Method
LogSumExp for efficiently computing log(sum(exp.(x)))
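The usual way to compute LogSumExp is to factor out the maximum before exponentiating, which avoids overflow. A minimal sketch of that trick follows; the name lse_sketch is hypothetical and this is not necessarily how BetaML implements lse.

```julia
# log(sum(exp.(x))) == m + log(sum(exp.(x .- m))) for any constant m;
# choosing m = maximum(x) keeps every exponent at or below zero.
function lse_sketch(x)
    m = maximum(x)
    return m + log(sum(exp.(x .- m)))
end

x = [1000.0, 1000.0]
naive  = log(sum(exp.(x)))   # Inf, because exp(1000) overflows
stable = lse_sketch(x)       # 1000 + log(2)
```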
BetaML.Utils.makeMatrix
— Method
Transform an Array{T,1} into an Array{T,2}, leaving an Array{T,2} unchanged.
BetaML.Utils.meanDicts
— Method
meanDicts(dicts)
Compute the mean of the values of an array of dictionaries.
Given dicts, an array of dictionaries, meanDicts first computes the union of the keys and then averages the values. If the original values are probabilities (non-negative items summing to 1), the result is also a probability distribution.
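The averaging step can be sketched as follows. Treating a key that is missing from a dictionary as a zero is an assumption, made because it is what keeps the result a probability distribution. The name mean_dicts_sketch is hypothetical.

```julia
# Hypothetical sketch: average the values over the union of the keys,
# counting a missing key as 0.0.
function mean_dicts_sketch(dicts)
    allkeys = Set(k for d in dicts for k in keys(d))
    n = length(dicts)
    return Dict(k => sum(get(d, k, 0.0) for d in dicts) / n for k in allkeys)
end

d1 = Dict("a" => 0.2, "b" => 0.8)
d2 = Dict("a" => 0.6, "c" => 0.4)
m  = mean_dicts_sketch([d1, d2])   # "a" => 0.4, "b" => 0.4, "c" => 0.2
```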
BetaML.Utils.meanRelError
— Method
meanRelError(ŷ,y;normDim=true,normRec=true,p=1)
Compute the mean relative error (l-1 based by default) between ŷ and y.
There are many ways to compute a mean relative error. In particular, if normRec (normDim) is set to true, the records (dimensions) are normalised, in the sense that it doesn't matter if a record (dimension) is bigger or smaller than the others: the relative error is first computed for each record (dimension) and then averaged. With both normDim and normRec set to false the function returns the relative mean error; with both set to true (default) it returns the mean relative error (i.e. with p=1 the "mean absolute percentage error (MAPE)"). The parameter p [def: 1] controls the p-norm used to define the error.
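The difference between the "relative mean error" and the "mean relative error" is easiest to see on a small vector example with p = 1. The two helpers below are hypothetical simplifications for one-dimensional data; they do not reproduce the normDim, normRec or p options of meanRelError.

```julia
using Statistics

# Errors are pooled first and normalised once
rel_mean_error(ŷ, y) = sum(abs.(ŷ .- y)) / sum(abs.(y))
# Each item is normalised first, then the ratios are averaged (MAPE)
mean_rel_error(ŷ, y) = mean(abs.(ŷ .- y) ./ abs.(y))

y = [1.0, 10.0]
ŷ = [2.0, 10.0]
rme = rel_mean_error(ŷ, y)   # 1/11: the small item barely matters
mre = mean_rel_error(ŷ, y)   # (1/1 + 0/10)/2 = 0.5: every item weighs the same
```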
BetaML.Utils.mish
— Method
mish(x)
https://arxiv.org/pdf/1908.08681v1.pdf
BetaML.Utils.oneHotEncoder
— Function
oneHotEncoder(y,d;count)
Encode arrays (or arrays of arrays) of integer data as 0/1 matrices
Parameters:
- y: The data to convert (integer, array or array of arrays of integers)
- d: The number of dimensions in the output matrix [def: maximum(maximum.(Y))]
- count: Whether to count multiple instances on the same dimension/record or indicate just presence [def: false]
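For the simplest case, a vector of integers, the encoding can be sketched as below. The name one_hot_sketch is hypothetical, and the arrays-of-arrays input and the count option are not covered.

```julia
# Hypothetical sketch: row n has a 1 in column y[n] and 0 elsewhere.
function one_hot_sketch(y::AbstractVector{<:Integer}, d::Integer=maximum(y))
    out = zeros(Int, length(y), d)
    for (n, k) in enumerate(y)
        out[n, k] = 1
    end
    return out
end

enc  = one_hot_sketch([2, 1, 3])   # [0 1 0; 1 0 0; 0 0 1]
wide = one_hot_sketch([1, 2], 4)   # 2×4 matrix, last two columns all zero
```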
BetaML.Utils.pca
— Method
pca(X;K,error)
Perform Principal Component Analysis, returning the matrix reprojected along the dimensions of maximum variance.
Parameters:
- X: The (N,D) data to reproject
- K: The number of dimensions to maintain (with K<=D) [def: nothing]
- error: The maximum approximation error that we are willing to accept [def: 0.05]
Return:
- A named tuple with:
  - X: The reprojected (NxK) matrix with the column dimensions organized in descending order of the proportion of explained variance
  - K: The number of dimensions retrieved
  - error: The actual proportion of variance not explained in the reprojected dimensions
  - P: The (D,K) matrix of the eigenvectors associated with the K largest eigenvalues used to reproject the data matrix
  - explVarByDim: An array of dimension D with the share of the cumulative variance explained by dimensions (the last element being always 1.0)
Notes:
- If K is provided, the parameter error has no effect.
- If one doesn't know a priori the error one is willing to accept, nor the desired number of dimensions, one can run this pca function with out = pca(X,K=size(X,2)) (i.e. with K=D), analyse the proportions of explained cumulative variance by dimensions in out.explVarByDim, choose the number of dimensions K according to one's needs and finally pick from the reprojected matrix only the number of dimensions needed, i.e. out.X[:,1:K].
Example:
julia> X = [1 10 100; 1.1 15 120; 0.95 23 90; 0.99 17 120; 1.05 8 90; 1.1 12 95]
6×3 Array{Float64,2}:
1.0 10.0 100.0
1.1 15.0 120.0
0.95 23.0 90.0
0.99 17.0 120.0
1.05 8.0 90.0
1.1 12.0 95.0
julia> X = pca(X,error=0.05).X
6×2 Array{Float64,2}:
3.1783 100.449
6.80764 120.743
16.8275 91.3551
8.80372 120.878
1.86179 90.3363
5.51254 95.5965
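The returned fields can be related to one another with a small eigen-decomposition sketch. This is an illustration under assumptions, not BetaML's implementation: the name pca_sketch is hypothetical, the data are projected without centring, the error-based choice of K is left out, and the sign and order of the columns may differ from the output shown above.

```julia
using LinearAlgebra, Statistics

# Hypothetical sketch of PCA through the eigenvectors of the covariance matrix.
function pca_sketch(X::AbstractMatrix; K::Integer=size(X, 2))
    Σ = cov(X)                                # (D,D) covariance of the columns
    E = eigen(Symmetric(Σ))                   # eigenvalues in ascending order
    order = sortperm(E.values, rev=true)      # largest variance first
    P = E.vectors[:, order[1:K]]              # (D,K) leading eigenvectors
    explVarByDim = cumsum(E.values[order]) ./ sum(E.values)
    return (X = X * P, K = K, error = 1 - explVarByDim[K],
            P = P, explVarByDim = explVarByDim)
end

X = [1 10 100; 1.1 15 120; 0.95 23 90; 0.99 17 120; 1.05 8 90; 1.1 12 95]
out = pca_sketch(X, K=2)
size(out.X)             # (6, 2)
out.explVarByDim        # cumulative shares, the last one being 1.0
```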
BetaML.Utils.plu
— Method
plu(x;α=0.1,c=1)
Piecewise Linear Unit
https://arxiv.org/pdf/1809.09534.pdf
BetaML.Utils.polynomialKernel
— Method
Polynomial kernel parametrised with c=0 and d=2 (i.e. a quadratic kernel). For other cᵢ and dᵢ use K = (x,y) -> polynomialKernel(x,y,c=cᵢ,d=dᵢ) as kernel function in the supporting algorithms
BetaML.Utils.radialKernel
— Method
Radial Kernel (aka RBF kernel) parametrised with γ=1/2. For other gammas γᵢ use K = (x,y) -> radialKernel(x,y,γ=γᵢ) as kernel function in the supporting algorithms
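The closure pattern described for both kernels can be tried out on stand-in functions written from the standard kernel formulas. The names poly_kernel_sketch and rbf_kernel_sketch are hypothetical, and it is assumed that BetaML's kernels follow these forms.

```julia
using LinearAlgebra

# Standard definitions, used here as stand-ins for the BetaML kernels
poly_kernel_sketch(x, y; c=0, d=2) = (dot(x, y) + c)^d
rbf_kernel_sketch(x, y; γ=1/2)     = exp(-γ * norm(x - y)^2)

# Fix non-default parameters with an anonymous function, as suggested above
K = (x, y) -> poly_kernel_sketch(x, y, c=1, d=3)

k1 = K([1.0, 2.0], [3.0, 4.0])                   # (11 + 1)^3 = 1728.0
k2 = rbf_kernel_sketch([1.0, 2.0], [1.0, 2.0])   # 1.0 for identical points
```

Wrapping the kernel in a two-argument closure keeps the (x,y) signature that an algorithm expects while still letting the caller choose the parameters.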
BetaML.Utils.relu
— Method
relu(x)
Rectified Linear Unit
https://www.cs.toronto.edu/~hinton/absps/reluICML.pdf
BetaML.Utils.scale
— Function
scale(x,scaleFactors;rev)
Perform a linear scaling of x using scaling factors scaleFactors.
Parameters
- x: The (n × d) dimension matrix to scale on each dimension d
- scaleFactors: A tuple of the constant and multiplicative scaling factor respectively [def: the scaling factors needed to scale x to mean 0 and variance 1]
- rev: Whether to invert the scaling [def: false]
Return
- The scaled matrix
Notes:
- Also available scale!(x,scaleFactors) for in-place scaling.
- Retrieve the scale factors with the getScaleFactors() function
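How a (shift, multiplicative term) tuple can drive both the scaling and its inverse is sketched below. The convention scaled = (x .+ shift) .* mult, the use of the uncorrected standard deviation, and the function names are all assumptions, not taken from BetaML's source.

```julia
using Statistics

# Hypothetical sketch: per-column shift and multiplicative factor
function scale_factors_sketch(x::AbstractMatrix)
    μ = mean(x, dims=1)
    σ = std(x, dims=1, corrected=false)
    return (-μ, 1 ./ σ)
end

# Apply the scaling, or undo it when rev=true
function scale_sketch(x, (shift, mult); rev=false)
    return rev ? (x ./ mult) .- shift : (x .+ shift) .* mult
end

X  = [1.0 10.0; 2.0 30.0; 3.0 50.0]
sf = scale_factors_sketch(X)
Z  = scale_sketch(X, sf)             # each column has mean 0 and variance 1
X2 = scale_sketch(Z, sf, rev=true)   # recovers X
```

Keeping the factors in a separate tuple means the same transformation fitted on training data can later be applied to new data or reversed on predictions.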
BetaML.Utils.sigmoid
— Method
sigmoid(x)
BetaML.Utils.softmax
— Method
softmax(x; β=1)
The input x is a vector. Returns a probability mass function (PMF).
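A numerically stable softmax and its Jacobian (the quantity that dsoftmax, listed earlier, refers to) can be sketched as follows. The function names are hypothetical, and treating β as a factor that multiplies x before exponentiation is an assumption.

```julia
using LinearAlgebra

# Subtracting the maximum does not change the result but avoids overflow
function softmax_sketch(x::AbstractVector; β=1)
    z = β .* x
    e = exp.(z .- maximum(z))
    return e ./ sum(e)
end

# Jacobian of the sketch above: ∂s_i/∂x_j = β * s_i * (δ_ij - s_j)
function dsoftmax_sketch(x::AbstractVector; β=1)
    s = softmax_sketch(x; β=β)
    return β .* (Diagonal(s) .- s * s')
end

p = softmax_sketch([1.0, 2.0, 3.0])    # non-negative entries summing to 1
J = dsoftmax_sketch([1.0, 2.0, 3.0])   # 3×3 matrix whose rows sum to 0
```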
BetaML.Utils.softplus
— Method
softplus(x)
https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#Softplus
BetaML.Utils.squaredCost
— Method
squaredCost(ŷ,y)
Compute the squared costs between a vector of predictions and one of observations as (1/2)*norm(y - ŷ)^2.
Aside from the 1/2 term, this corresponds to the squared l-2 norm distance and, when averaged over multiple datapoints, to the Mean Squared Error. It is mostly used for regression problems.
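The formula given above translates directly into code. The name squared_cost_sketch is hypothetical; the body is the expression stated in the description.

```julia
using LinearAlgebra

squared_cost_sketch(ŷ, y) = 0.5 * norm(y - ŷ)^2

y = [1.0, 2.0, 3.0]
ŷ = [1.0, 2.0, 5.0]
c = squared_cost_sketch(ŷ, y)   # 0.5 * 2^2 = 2.0
```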
BetaML.Utils.sterling
— Method
Stirling number of the second kind: the number of ways to partition a set of n elements into k non-empty subsets
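These numbers satisfy the recurrence S(n,k) = k·S(n-1,k) + S(n-1,k-1) with S(0,0) = 1, which the sketch below fills in as a table. The name stirling2 is hypothetical, and BetaML's sterling function may compute the value differently (for example with a closed-form sum).

```julia
# Hypothetical sketch: Stirling number of the second kind by dynamic programming.
function stirling2(n::Integer, k::Integer)
    (n == k) && return big(1)            # includes S(0,0) = 1
    (k == 0 || k > n) && return big(0)
    S = zeros(BigInt, n + 1, k + 1)      # S[i+1, j+1] holds S(i, j)
    S[1, 1] = 1
    for i in 1:n, j in 1:min(i, k)
        S[i + 1, j + 1] = j * S[i, j + 1] + S[i, j]
    end
    return S[n + 1, k + 1]
end

stirling2(4, 2)   # 7
stirling2(5, 3)   # 25
```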