The BetaML.Utils Module
BetaML.Utils
— Module
Utils module
Provide shared utility functions for various machine learning algorithms. You don't usually need to import from this module, as each of the other modules (Nn, Perceptron, Clusters, ...) re-exports it.
Module Index
BetaML.Utils.accuracy
BetaML.Utils.accuracy
BetaML.Utils.accuracy
BetaML.Utils.accuracy
BetaML.Utils.accuracy
BetaML.Utils.aic
BetaML.Utils.autoJacobian
BetaML.Utils.batch
BetaML.Utils.bic
BetaML.Utils.celu
BetaML.Utils.classCounts
BetaML.Utils.colsWithMissing
BetaML.Utils.cosine_distance
BetaML.Utils.crossEntropy
BetaML.Utils.dcelu
BetaML.Utils.delu
BetaML.Utils.dmish
BetaML.Utils.dplu
BetaML.Utils.drelu
BetaML.Utils.dsigmoid
BetaML.Utils.dsoftmax
BetaML.Utils.dsoftplus
BetaML.Utils.dtanh
BetaML.Utils.elu
BetaML.Utils.entropy
BetaML.Utils.getPermutations
BetaML.Utils.getScaleFactors
BetaML.Utils.gini
BetaML.Utils.integerDecoder
BetaML.Utils.integerEncoder
BetaML.Utils.issortable
BetaML.Utils.l1_distance
BetaML.Utils.l2_distance
BetaML.Utils.l2²_distance
BetaML.Utils.lse
BetaML.Utils.makeMatrix
BetaML.Utils.meanDicts
BetaML.Utils.meanRelError
BetaML.Utils.mish
BetaML.Utils.mode
BetaML.Utils.oneHotEncoder
BetaML.Utils.partition
BetaML.Utils.pca
BetaML.Utils.plu
BetaML.Utils.polynomialKernel
BetaML.Utils.radialKernel
BetaML.Utils.relu
BetaML.Utils.scale
BetaML.Utils.sigmoid
BetaML.Utils.softmax
BetaML.Utils.softplus
BetaML.Utils.squaredCost
BetaML.Utils.sterling
BetaML.Utils.variance
Detailed API
Base.error
— Method
error(ŷ,y;ignoreLabels=false) - Categorical error (T vs T)
Base.error
— Method
error(ŷ,y) - Categorical error with probabilistic predictions of a dataset given in terms of a dictionary of probabilities (Dict{T,Float64} vs T).
Base.error
— Method
error(ŷ,y) - Categorical error with probabilistic prediction of a single datapoint (PMF vs Int).
Base.error
— Method
error(ŷ,y) - Categorical error with probabilistic predictions of a dataset (PMF vs Int).
Base.reshape
— Method
reshape(myNumber, dims...) - Reshape a number as an n-dimensional Array
BetaML.Utils.accuracy
— Method
accuracy(ŷ,y;ignoreLabels=false) - Categorical accuracy between two vectors (T vs T). If ignoreLabels=true the specific label order is ignored, which is useful for unsupervised learning algorithms where the label order doesn't have a meaning.
BetaML.Utils.accuracy
— Method
accuracy(ŷ,y;tol)
Categorical accuracy with probabilistic predictions of a dataset given in terms of a dictionary of probabilities (Dict{T,Float64} vs T).
Parameters:
- ŷ: An N array where each item is the estimated probability mass function in terms of a Dictionary(Item1 => Prob1, Item2 => Prob2, ...)
- y: The N array with the correct category for each point $n$.
- tol: The tolerance of the prediction, i.e. whether to consider "correct" only a prediction where the value with the highest probability is the true value (tol = 1), or to consider instead the set of the tol maximum values [def: 1].
BetaML.Utils.accuracy
— Method
accuracy(ŷ,y;tol)
Categorical accuracy with probabilistic prediction of a single datapoint (PMF vs Int).
Use the parameter tol [def: 1] to determine the tolerance of the prediction, i.e. whether to consider "correct" only a prediction where the value with the highest probability is the true value (tol = 1), or to consider instead the set of the tol maximum values.
BetaML.Utils.accuracy
— Method
accuracy(ŷ,y;tol,ignoreLabels)
Categorical accuracy with probabilistic predictions of a dataset (PMF vs Int).
Parameters:
- ŷ: An (N,K) matrix of the probabilities of each record $\hat y_n$, $n \in 1,...,N$, being of category $k$, with $k \in 1,...,K$.
- y: The N array with the correct category for each point $n$.
- tol: The tolerance of the prediction, i.e. whether to consider "correct" only a prediction where the value with the highest probability is the true value (tol = 1), or to consider instead the set of the tol maximum values [def: 1].
- ignoreLabels: Whether to ignore the specific label order in y. Useful for unsupervised learning algorithms where the specific label order doesn't make sense [def: false]
BetaML.Utils.accuracy
— Method
accuracy(ŷ,y;tol)
Categorical accuracy with probabilistic prediction of a single datapoint given in terms of a dictionary of probabilities (Dict{T,Float64} vs T).
Parameters:
- ŷ: The returned probability mass function in terms of a Dictionary(Item1 => Prob1, Item2 => Prob2, ...)
- tol: The tolerance of the prediction, i.e. whether to consider "correct" only a prediction where the value with the highest probability is the true value (tol = 1), or to consider instead the set of the tol maximum values [def: 1].
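For intuition, here is an informal sketch of the matrix-based method with invented values, assuming BetaML.Utils is loaded and the tol semantics described above:
julia> ŷ = [0.3 0.6 0.1; 0.4 0.5 0.1]   # (N,K) estimated probabilities
julia> y = [2,1]                         # true categories
julia> accuracy(ŷ,y)        # only the top class counts: record 2 is wrong, so 0.5
julia> accuracy(ŷ,y,tol=2)  # correct if the true class is among the 2 most probable ones, so 1.0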
BetaML.Utils.aic
— Method
aic(lL,k) - Akaike information criterion (lower is better)
BetaML.Utils.autoJacobian
— Method
autoJacobian(f,x;nY)
Evaluate the Jacobian using AD in the form of a (nY,nX) matrix of first derivatives
Parameters:
- f: The function to compute the Jacobian of
- x: The input to the function where the Jacobian has to be computed
- nY: The number of outputs of the function f [def: length(f(x))]
Return values:
- An Array{Float64,2} of the locally evaluated Jacobian
Notes:
- The nY parameter is optional. If provided it avoids having to compute f(x)
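An informal sketch of a call on a toy function (the Jacobian values below are computed by hand for this example; the display format may differ by Julia version):
julia> f(x) = [x[1]^2 + x[2], x[1]*x[2]]
julia> autoJacobian(f,[1.0,2.0])
2×2 Array{Float64,2}:
 2.0  1.0
 2.0  1.0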
BetaML.Utils.batch
— Method
batch(n,bSize;sequential=false)
Return a vector of bSize vectors of indices from 1 to n. Randomly, unless the optional parameter sequential is used.
Example:
julia> Utils.batch(6,2,sequential=true)
3-element Array{Array{Int64,1},1}:
 [1, 2]
 [3, 4]
 [5, 6]
BetaML.Utils.bic
— Method
bic(lL,k,n) - Bayesian information criterion (lower is better)
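An informal sketch for both criteria, assuming the usual definitions AIC = 2k - 2lL and BIC = k*log(n) - 2lL (the exact expressions used by the package are not restated here):
julia> lL = -100.0; k = 3; n = 50;
julia> aic(lL,k)    # 2*3 - 2*(-100.0) = 206.0 under the assumed definition
julia> bic(lL,k,n)  # 3*log(50) - 2*(-100.0) ≈ 211.7 under the assumed definition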
BetaML.Utils.celu
— Method
celu(x; α=1)
https://arxiv.org/pdf/1704.07483.pdf
BetaML.Utils.classCounts
— Method
classCounts(x)
Return a dictionary that counts the number of each unique item (rows) in a dataset.
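An informal sketch on a small vector (the printed ordering of the dictionary may vary):
julia> classCounts(["a","b","a","a","c"])   # Dict("a" => 3, "b" => 1, "c" => 1)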
BetaML.Utils.colsWithMissing
— Method
colsWithMissing(x)
Return an array with the ids of the columns where there is at least a missing value.
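An informal sketch on a small matrix with one incomplete column:
julia> colsWithMissing([1 2 missing; 4 5 6])   # [3]: only the third column contains a missing value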
BetaML.Utils.cosine_distance
— Method
Cosine distance
BetaML.Utils.crossEntropy
— Method
crossEntropy(ŷ, y; weight)
Compute the (weighted) cross-entropy between the predicted and the sampled probability distributions.
To be used in classification problems.
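An informal sketch for a single datapoint whose true class is the first of three, assuming the usual definition -∑ yᵢ log(ŷᵢ):
julia> crossEntropy([0.7,0.2,0.1],[1,0,0])   # -log(0.7) ≈ 0.357 under the assumed definition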
BetaML.Utils.dcelu
— Method
dcelu(x; α=1)
https://arxiv.org/pdf/1704.07483.pdf
BetaML.Utils.delu
— Method
delu(x; α=1) with α > 0
https://arxiv.org/pdf/1511.07289.pdf
BetaML.Utils.dmish
— Method
dmish(x)
https://arxiv.org/pdf/1908.08681v1.pdf
BetaML.Utils.dplu
— Method
dplu(x;α=0.1,c=1)
Piecewise Linear Unit derivative
https://arxiv.org/pdf/1809.09534.pdf
BetaML.Utils.drelu
— Method
drelu(x)
Rectified Linear Unit derivative
https://www.cs.toronto.edu/~hinton/absps/reluICML.pdf
BetaML.Utils.dsigmoid
— Method
dsigmoid(x)
BetaML.Utils.dsoftmax
— Method
dsoftmax(x; β=1)
Derivative of the softmax function
https://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative/
BetaML.Utils.dsoftplus
— Method
dsoftplus(x)
https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#Softplus
BetaML.Utils.dtanh
— Method
dtanh(x)
BetaML.Utils.elu
— Method
elu(x; α=1) with α > 0
https://arxiv.org/pdf/1511.07289.pdf
BetaML.Utils.entropy
— Method
entropy(x)
Calculate the entropy for a list of items (or rows).
See: https://en.wikipedia.org/wiki/Decision_tree_learning#Information_gain
BetaML.Utils.getPermutations
— Method
getPermutations(v::AbstractArray{T,1};keepStructure=false)
Return a vector of either (a) all possible permutations (uncollected) or (b) just those based on the unique values of the vector
Useful to measure accuracy where you don't care about the actual name of the labels, like in unsupervised classifications (e.g. clustering)
BetaML.Utils.getScaleFactors
— Method
getScaleFactors(x;skip)
Return the scale factors (for each dimension) in order to scale a matrix X (n,d) such that each dimension has mean 0 and variance 1.
Parameters
- x: the (n × d) dimension matrix to scale on each dimension d
- skip: an array of dimension indices to skip the scaling [def: []]
Return
- A tuple whose first element is the shift and the second the multiplicative term to make the scale.
BetaML.Utils.gini
— Method
gini(x)
Calculate the Gini Impurity for a list of items (or rows).
See: https://en.wikipedia.org/wiki/Decision_tree_learning#Gini_impurity
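An informal sketch, assuming the standard definition 1 - ∑ᵢ pᵢ²:
julia> gini([1,1,2,2])   # two equally frequent classes: 1 - (0.5^2 + 0.5^2) = 0.5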
BetaML.Utils.integerDecoder
— Method
integerDecoder(y,target::AbstractVector{T};unique)
Decode an array of integers to an array of T corresponding to the elements of target
Parameters:
- y: The vector to decode
- target: The vector of elements to use for the encoding
- unique: Whether target is already made of unique elements [def: true]
Return:
- A vector of length(y) elements corresponding to the (unique) target elements at the positions given by y
Example:
julia> integerDecoder([1, 2, 2, 3, 2, 1],["aa","cc","bb"]) # out: ["aa","cc","cc","bb","cc","aa"]
BetaML.Utils.integerEncoder
— Method
integerEncoder(y;unique)
Encode an array of T to an array of integers using their position in the unique vector of the input array
Parameters:
- y: The vector to encode
- unique: Whether the vector to encode is already made of unique elements [def: false]
Return:
- A vector of [1,length(Y)] integers corresponding to the position of each element in the "unique" version of the original input
Note:
- While this function creates an ordered (and sortable) set, it is up to the user to make sure that this "property" is not relied upon in their code if the unencoded data is in fact unordered.
Example:
julia> integerEncoder(["aa","cc","cc","bb","cc","aa"]) # out: [1, 2, 2, 3, 2, 1]
BetaML.Utils.issortable
— Method
Return whether an array is sortable, i.e. has the method issort defined
BetaML.Utils.l1_distance
— Method
L1 norm distance (aka Manhattan Distance)
BetaML.Utils.l2_distance
— Method
Euclidean (L2) distance
BetaML.Utils.l2²_distance
— Method
Squared Euclidean (L2) distance
BetaML.Utils.lse
— Method
LogSumExp for efficiently computing log(sum(exp.(x)))
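A quick numeric sketch of what the function computes:
julia> lse([1.0,2.0,3.0])   # log(exp(1)+exp(2)+exp(3)) ≈ 3.4076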
BetaML.Utils.makeMatrix
— Method
Transform an Array{T,1} into an Array{T,2} and leave an Array{T,2} unchanged.
BetaML.Utils.meanDicts
— Method
meanDicts(dicts)
Compute the mean of the values of an array of dictionaries.
Given dicts, an array of dictionaries, meanDicts first computes the union of the keys and then averages the values. If the original values are probabilities (non-negative items summing to 1), the result is also a probability distribution.
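An informal sketch with invented values (two dictionaries sharing the same keys):
julia> meanDicts([Dict("a"=>0.6,"b"=>0.4), Dict("a"=>0.2,"b"=>0.8)])   # Dict("a"=>0.4, "b"=>0.6)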
BetaML.Utils.meanRelError
— Method
meanRelError(ŷ,y;normDim=true,normRec=true,p=1)
Compute the mean relative error (l-1 based by default) between ŷ and y.
There are many ways to compute a mean relative error. In particular, if normRec (normDim) is set to true, the records (dimensions) are normalised, in the sense that it doesn't matter if a record (dimension) is bigger or smaller than the others: the relative error is first computed for each record (dimension) and then averaged. With both normDim and normRec set to false the function returns the relative mean error; with both set to true (the default) it returns the mean relative error (i.e. with p=1 the "mean absolute percentage error (MAPE)"). The parameter p [def: 1] controls the p-norm used to define the error.
BetaML.Utils.mish
— Method
mish(x)
https://arxiv.org/pdf/1908.08681v1.pdf
BetaML.Utils.mode
— Method
mode(dicts)
Given a vector of dictionaries representing probabilities, it returns the mode of each element in terms of the key.
Use it to return a unique value from a multiclass classifier returning probabilities.
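An informal sketch of turning probabilistic classifier output into hard labels (invented values):
julia> mode([Dict("a"=>0.2,"b"=>0.8), Dict("a"=>0.7,"b"=>0.3)])   # ["b", "a"]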
BetaML.Utils.oneHotEncoder
— Function
oneHotEncoder(y,d;count)
Encode arrays (or arrays of arrays) of integer data as 0/1 matrices
Parameters:
- y: The data to convert (integer, array or array of arrays of integers)
- d: The number of dimensions in the output matrix. [def: maximum(maximum.(Y))]
- count: Whether to count multiple instances on the same dimension/record or indicate just presence. [def: false]
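An informal sketch of encoding three integer labels, one row per record and one column per dimension:
julia> oneHotEncoder([2,1,3])   # 3×3 0/1 matrix with ones at positions (1,2), (2,1) and (3,3)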
BetaML.Utils.partition
— Method
partition(data,parts;shuffle=true)
Partition (by rows) one or more matrices according to the shares in parts.
Parameters
- data: A matrix/vector or a vector of matrices/vectors
- parts: A vector of the required shares (must sum to 1)
- shuffle: Whether to randomly shuffle the matrices (preserving the relative order between matrices)
Example:
julia> x = [1:10 11:20]
julia> y = collect(31:40)
julia> ((xtrain,xtest),(ytrain,ytest)) = partition([x,y],[0.7,0.3])
BetaML.Utils.pca
— Method
pca(X;K,error)
Perform Principal Component Analysis returning the matrix reprojected among the dimensions of maximum variance.
Parameters:
- X: The (N,D) data to reproject
- K: The number of dimensions to maintain (with K<=D) [def: nothing]
- error: The maximum approximation error that we are willing to accept [def: 0.05]
Return:
- A named tuple with:
  - X: The reprojected (NxK) matrix with the column dimensions organized in descending order of the proportion of explained variance
  - K: The number of dimensions retrieved
  - error: The actual proportion of variance not explained in the reprojected dimensions
  - P: The (D,K) matrix of the eigenvectors associated to the K-largest eigenvalues used to reproject the data matrix
  - explVarByDim: An array of dimensions D with the share of the cumulative variance explained by dimensions (the last element being always 1.0)
Notes:
- If K is provided, the parameter error has no effect.
- If one doesn't know a priori the error that she/he is willing to accept, nor the desired number of dimensions, he/she can run this pca function with out = pca(X,K=size(X,2)) (i.e. with K=D), analyse the proportions of explained cumulative variance by dimension in out.explVarByDim, choose the number of dimensions K according to his/her needs and finally pick from the reprojected matrix only the number of dimensions needed, i.e. out.X[:,1:K].
Example:
julia> X = [1 10 100; 1.1 15 120; 0.95 23 90; 0.99 17 120; 1.05 8 90; 1.1 12 95]
6×3 Array{Float64,2}:
1.0 10.0 100.0
1.1 15.0 120.0
0.95 23.0 90.0
0.99 17.0 120.0
1.05 8.0 90.0
1.1 12.0 95.0
julia> X = pca(X,error=0.05).X
6×2 Array{Float64,2}:
3.1783 100.449
6.80764 120.743
16.8275 91.3551
8.80372 120.878
1.86179 90.3363
5.51254 95.5965
BetaML.Utils.plu
— Method
plu(x;α=0.1,c=1)
Piecewise Linear Unit
https://arxiv.org/pdf/1809.09534.pdf
BetaML.Utils.polynomialKernel
— Method
Polynomial kernel parametrised with c=0 and d=2 (i.e. a quadratic kernel). For other cᵢ and dᵢ use K = (x,y) -> polynomialKernel(x,y,c=cᵢ,d=dᵢ) as kernel function in the supporting algorithms
BetaML.Utils.radialKernel
— Method
Radial Kernel (aka RBF kernel) parametrised with γ=1/2. For other gammas γᵢ use K = (x,y) -> radialKernel(x,y,γ=γᵢ) as kernel function in the supporting algorithms
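An informal sketch of defining and evaluating parametrised kernels, assuming the standard forms (x'y + c)^d and exp(-γ‖x-y‖²) for the polynomial and radial kernels respectively:
julia> K = (x,y) -> radialKernel(x,y,γ=0.1)    # kernel to pass to the supporting algorithms
julia> K([1.0,2.0],[3.0,4.0])                  # exp(-0.1*8) ≈ 0.449 under the assumed form
julia> polynomialKernel([1.0,2.0],[3.0,4.0])   # with defaults c=0, d=2: (1*3+2*4)^2 = 121.0 under the assumed form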
BetaML.Utils.relu
— Method
relu(x)
Rectified Linear Unit
https://www.cs.toronto.edu/~hinton/absps/reluICML.pdf
BetaML.Utils.scale
— Function
scale(x,scaleFactors;rev)
Perform a linear scaling of x using the scaling factors scaleFactors.
Parameters
- x: The (n × d) dimension matrix to scale on each dimension d
- scaleFactors: A tuple of the constant and multiplicative scaling factors, respectively [def: the scaling factors needed to scale x to mean 0 and variance 1]
- rev: Whether to invert the scaling [def: false]
Return
- The scaled matrix
Notes:
- Also available scale!(x,scaleFactors) for in-place scaling.
- Retrieve the scale factors with the getScaleFactors() function
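An informal sketch combining getScaleFactors and scale (invented data):
julia> x  = [4000.0 1.0; 6000.0 3.0; 5000.0 2.0]
julia> sf = getScaleFactors(x)     # shift and multiplicative factors for mean 0 / variance 1
julia> xs = scale(x,sf)            # each column now has mean 0 and variance 1
julia> scale(xs,sf,rev=true)       # invert the scaling, recovering the original values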
BetaML.Utils.sigmoid
— Method
sigmoid(x)
BetaML.Utils.softmax
— Method
softmax(x; β=1)
The input x is a vector. Return a PMF
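A quick numeric sketch:
julia> softmax([1.0,2.0,3.0])   # ≈ [0.090, 0.245, 0.665], summing to 1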
BetaML.Utils.softplus
— Method
softplus(x)
https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#Softplus
BetaML.Utils.squaredCost
— Method
squaredCost(ŷ,y)
Compute the squared cost between a vector of predictions and one of observations as (1/2)*norm(y - ŷ)^2.
Aside from the 1/2 term it corresponds to the squared l-2 norm distance and, when averaged over multiple datapoints, to the Mean Squared Error. It is mostly used for regression problems.
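A quick numeric sketch:
julia> squaredCost([1.5,2.5],[1.0,2.0])   # 0.5 * ((-0.5)^2 + (-0.5)^2) = 0.25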
BetaML.Utils.sterling
— Method
Stirling number (of the second kind): the number of partitions of a set of n elements in k sets
BetaML.Utils.variance
— Method
variance(x) - population variance
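A quick numeric sketch (population variance, i.e. dividing by n rather than n-1):
julia> variance([1,2,3,4])   # mean of the squared deviations from 2.5 = 1.25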