The BetaML.Utils Module

BetaML.UtilsModule
Utils module

Provides shared utility functions for various machine learning algorithms. You don't usually need to import from this module, as each of the other modules (Nn, Perceptron, Clusters,...) reexports it.

Module Index

Detailed API

Base.errorMethod

error(ŷ,y;ignoreLabels=false) - Categorical error (T vs T)

Base.errorMethod

error(ŷ,y) - Categorical error with probabilistic predictions of a dataset given in terms of a dictionary of probabilities (Dict{T,Float64} vs T).

Base.errorMethod

error(ŷ,y) - Categorical error with probabilistic prediction of a single datapoint (PMF vs Int).

Base.errorMethod

error(ŷ,y) - Categorical error with probabilistic predictions of a dataset (PMF vs Int).

Base.reshapeMethod

reshape(myNumber, dims..) - Reshape a number as an n-dimensional Array

BetaML.Utils.accuracyMethod

accuracy(ŷ,y;ignoreLabels=false) - Categorical accuracy between two vectors (T vs T). If

BetaML.Utils.accuracyMethod

accuracy(ŷ,y;tol)

Categorical accuracy with probabilistic predictions of a dataset given in terms of a dictionary of probabilities (Dict{T,Float64} vs T).

Parameters:

  • ŷ: An N array where each item is the estimated probability mass function in terms of a Dictionary(Item1 => Prob1, Item2 => Prob2, ...)
  • y: The N array with the correct category for each point $n$.
  • tol: The tolerance of the prediction, i.e. whether to consider "correct" only a prediction where the value with the highest probability is the true value (tol = 1), or to consider instead the set of the tol most probable values [def: 1].
BetaML.Utils.accuracyMethod
accuracy(ŷ,y;tol)

Categorical accuracy with probabilistic prediction of a single datapoint (PMF vs Int).

Use the parameter tol [def: 1] to determine the tolerance of the prediction, i.e. whether to consider "correct" only a prediction where the value with the highest probability is the true value (tol = 1), or to consider instead the set of the tol most probable values.

BetaML.Utils.accuracyMethod

accuracy(ŷ,y;tol,ignoreLabels)

Categorical accuracy with probabilistic predictions of a dataset (PMF vs Int).

Parameters:

  • ŷ: An (N,K) matrix of probabilities, where $\hat y_{n,k}$ is the probability that record $n$ (with $n \in 1,...,N$) is of category $k$ (with $k \in 1,...,K$).
  • y: The N array with the correct category for each point $n$.
  • tol: The tolerance of the prediction, i.e. whether to consider "correct" only a prediction where the value with the highest probability is the true value (tol = 1), or to consider instead the set of the tol most probable values [def: 1].
  • ignoreLabels: Whether to ignore the specific label order in y. Useful for unsupervised learning algorithms where the specific label order doesn't make sense [def: false]
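
The effect of tol can be illustrated with a minimal sketch that does not rely on BetaML itself. The helper name `topk_accuracy` is hypothetical, and the sketch assumes ŷ is an (N,K) matrix of probabilities and y a vector of integer categories in 1:K; it ignores the ignoreLabels option.

```julia
# Hypothetical helper, not part of BetaML: share of records whose true
# category is among the `tol` most probable categories.
function topk_accuracy(ŷ::AbstractMatrix, y::AbstractVector{<:Integer}; tol::Integer=1)
    n = size(ŷ, 1)
    hits = 0
    for i in 1:n
        # indices of the `tol` largest probabilities in row i
        best = partialsortperm(ŷ[i, :], 1:tol, rev=true)
        hits += y[i] in best
    end
    return hits / n
end

ŷ = [0.7 0.2 0.1;
     0.1 0.3 0.6;
     0.4 0.5 0.1]
y = [1, 2, 1]
acc1 = topk_accuracy(ŷ, y)          # only the most probable category counts
acc2 = topk_accuracy(ŷ, y, tol=2)   # the two most probable categories count
```

With tol = 1 only the first record is classified correctly, while with tol = 2 all three true categories fall within the two most probable ones.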
BetaML.Utils.accuracyMethod
accuracy(ŷ,y;tol)

Categorical accuracy with probabilistic prediction of a single datapoint given in terms of a dictionary of probabilities (Dict{T,Float64} vs T).

Parameters:

  • ŷ: The returned probability mass function in terms of a Dictionary(Item1 => Prob1, Item2 => Prob2, ...)
  • tol: The tolerance of the prediction, i.e. whether to consider "correct" only a prediction where the value with the highest probability is the true value (tol = 1), or to consider instead the set of the tol most probable values [def: 1].
BetaML.Utils.aicMethod

aic(lL,k) - Akaike information criterion (lower is better)

BetaML.Utils.autoJacobianMethod

autoJacobian(f,x;nY)

Evaluate the Jacobian using AD in the form of a (nY,nX) matrix of first derivatives

Parameters:

  • f: The function whose Jacobian is computed
  • x: The input point at which the Jacobian is evaluated
  • nY: The number of outputs of the function f [def: length(f(x))]

Return values:

  • An Array{Float64,2} of the locally evaluated Jacobian

Notes:

  • The nY parameter is optional. If provided it avoids having to compute f(x)
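
To make the (nY,nX) layout concrete, here is a minimal sketch that approximates the Jacobian with central finite differences. This is a different technique from the automatic differentiation used by autoJacobian, and the helper name `fd_jacobian` and its step size `h` are assumptions made for illustration only.

```julia
# Hypothetical helper: central finite differences, NOT automatic differentiation.
function fd_jacobian(f, x::AbstractVector{<:Real}; h=1e-6)
    nX = length(x)
    nY = length(f(x))
    J  = zeros(nY, nX)
    for j in 1:nX
        e = zeros(nX)
        e[j] = h
        # column j holds the partial derivatives with respect to x[j]
        J[:, j] = (f(x .+ e) .- f(x .- e)) ./ (2 * h)
    end
    return J
end

f(x) = [x[1]^2 + x[2], 3 * x[1] * x[2]]
J = fd_jacobian(f, [1.0, 2.0])   # analytic Jacobian at (1,2) is [2 1; 6 3]
```

Row i of the result refers to output i of f and column j to input j, which is the layout described above.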
BetaML.Utils.batchMethod

batch(n,bSize;sequential=false)

Return a vector of vectors, each holding bSize indices, covering the indices from 1 to n. The indices are assigned randomly unless the optional parameter sequential is set to true.

Example:

julia> Utils.batch(6,2,sequential=true)
3-element Array{Array{Int64,1},1}:
 [1, 2]
 [3, 4]
 [5, 6]
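
The behaviour shown in the example can be reproduced with a short standalone sketch. The name `batch_sketch`, the `rng` keyword and the choice to drop leftover indices when n is not a multiple of bSize are assumptions, not a description of BetaML's implementation.

```julia
using Random

# Hypothetical helper mirroring the behaviour described above.
function batch_sketch(n, bSize; sequential=false, rng=Random.default_rng())
    idx = sequential ? collect(1:n) : shuffle(rng, 1:n)
    nBatches = n ÷ bSize            # assumption: leftover indices are dropped
    return [idx[((b-1)*bSize+1):(b*bSize)] for b in 1:nBatches]
end

batches  = batch_sketch(6, 2, sequential=true)
shuffled = batch_sketch(6, 2)       # same indices, random assignment to batches
```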

BetaML.Utils.bicMethod

bic(lL,k,n) - Bayesian information criterion (lower is better)
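
For reference, the textbook definitions of the two information criteria (aic and bic) can be written as one-liners. The sketch below assumes lL is the maximised log-likelihood, k the number of estimated parameters and n the number of observations; the `_sketch` names are hypothetical.

```julia
# Textbook definitions (assumed, not copied from BetaML):
# lL = maximised log-likelihood, k = number of parameters, n = number of observations
aic_sketch(lL, k)    = 2 * k - 2 * lL
bic_sketch(lL, k, n) = k * log(n) - 2 * lL

a = aic_sketch(-120.0, 5)        # 2*5 + 240
b = bic_sketch(-120.0, 5, 100)   # 5*log(100) + 240
```

Both criteria penalise the number of parameters; the BIC penalty grows with the sample size, so it favours smaller models than the AIC once log(n) exceeds 2.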

BetaML.Utils.classCountsMethod

classCounts(x)

Return a dictionary that counts the number of each unique item (rows) in a dataset.

BetaML.Utils.colsWithMissingMethod
colsWithMissing(x)

Return an array with the ids of the columns where there is at least one missing value.

BetaML.Utils.crossEntropyMethod

crossEntropy(ŷ, y; weight)

Compute the (weighted) cross-entropy between the predicted and the sampled probability distributions.

To be used in classification problems.
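
A minimal sketch of the weighted cross-entropy follows, assuming ŷ holds the predicted probabilities, y the observed ones (for example a one-hot vector) and weight defaults to a vector of ones. The helper name and the small constant that guards against log(0) are assumptions.

```julia
# Hypothetical helper; ŷ = predicted probabilities, y = observed probabilities
function cross_entropy_sketch(ŷ, y; weight=ones(length(y)))
    ϵ = 1e-15                      # assumed guard against log(0)
    return -sum(y .* log.(ŷ .+ ϵ) .* weight)
end

ce = cross_entropy_sketch([0.7, 0.2, 0.1], [1.0, 0.0, 0.0])   # close to -log(0.7)
```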

BetaML.Utils.deluMethod

delu(x; α=1) with α > 0

https://arxiv.org/pdf/1511.07289.pdf

BetaML.Utils.dpluMethod

dplu(x;α=0.1,c=1)

Piecewise Linear Unit derivative

https://arxiv.org/pdf/1809.09534.pdf

BetaML.Utils.dreluMethod

drelu(x)

Rectified Linear Unit derivative

https://www.cs.toronto.edu/~hinton/absps/reluICML.pdf

BetaML.Utils.dsoftmaxMethod

dsoftmax(x; β=1)

Derivative of the softmax function

https://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative/
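
As a sketch of what the derivative looks like, the Jacobian of the softmax can be written in closed form as $J_{ij} = \beta\, s_i (\delta_{ij} - s_j)$, with $s$ the softmax output. The helper names below are hypothetical and the code is an illustration of this formula, not BetaML's own implementation.

```julia
using LinearAlgebra

softmax_sketch(x; β=1) = exp.(β .* x) ./ sum(exp.(β .* x))

# Jacobian: J[i,j] = ∂softmax_i/∂x_j = β * s_i * (δ_ij - s_j)
function dsoftmax_sketch(x; β=1)
    s = softmax_sketch(x, β=β)
    return β * (Diagonal(s) - s * s')
end

J = dsoftmax_sketch([1.0, 2.0, 3.0])
```

Because the softmax outputs always sum to one, each row of the Jacobian sums to zero.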

BetaML.Utils.dsoftplusMethod

dsoftplus(x)

https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#Softplus

BetaML.Utils.eluMethod

elu(x; α=1) with α > 0

https://arxiv.org/pdf/1511.07289.pdf
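
The definition in the linked paper can be sketched as follows, together with its derivative (the role played by delu). The `_sketch` names are hypothetical.

```julia
# ELU as defined in the referenced paper, and its derivative
elu_sketch(x; α=1)  = x > 0 ? x : α * (exp(x) - 1)
delu_sketch(x; α=1) = x > 0 ? one(x) : α * exp(x)

pos  = elu_sketch(2.0)     # identity for positive inputs
neg  = elu_sketch(-1.0)    # saturates towards -α for very negative inputs
dneg = delu_sketch(-1.0)
```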

BetaML.Utils.entropyMethod

entropy(x)

Calculate the entropy for a list of items (or rows).

See: https://en.wikipedia.org/wiki/Decision_tree_learning#Gini_impurity

BetaML.Utils.getPermutationsMethod
getPermutations(v::AbstractArray{T,1};keepStructure=false)

Return a vector of either (a) all possible permutations (uncollected) or (b) just those based on the unique values of the vector

Useful to measure accuracy where you don't care about the actual name of the labels, like in unsupervised classifications (e.g. clustering)

BetaML.Utils.getScaleFactorsMethod
getScaleFactors(x;skip)

Return the scale factors (for each dimension) needed to scale a matrix X (n,d) such that each dimension has mean 0 and variance 1.

Parameters

  • x: the (n × d) dimension matrix to scale on each dimension d
  • skip: an array of dimension indices to skip when scaling [def: []]

Return

  • A tuple whose first element is the shift and the second the multiplicative term used for the scaling.

BetaML.Utils.giniMethod

gini(x)

Calculate the Gini Impurity for a list of items (or rows).

See: https://en.wikipedia.org/wiki/Decision_tree_learning#Information_gain
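
Both impurity measures (entropy and gini) depend only on the share of each class. The sketch below uses hypothetical helper names, works on a plain vector of labels, and assumes base-2 logarithms for the entropy.

```julia
# Hypothetical helpers working on a vector of labels
function class_shares(x)
    counts = Dict{eltype(x),Int}()
    for item in x
        counts[item] = get(counts, item, 0) + 1
    end
    return [c / length(x) for c in values(counts)]
end

gini_sketch(x)    = 1 - sum(p^2 for p in class_shares(x))
entropy_sketch(x) = -sum(p * log2(p) for p in class_shares(x))   # base 2 is an assumption

labels = ["a", "a", "b", "b"]
g = gini_sketch(labels)      # 1 - (0.25 + 0.25)
e = entropy_sketch(labels)   # one bit for two equally likely classes
```

Both measures are zero for a list with a single class and grow as the classes become more evenly mixed.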

BetaML.Utils.integerDecoderMethod
integerDecoder(y,target::AbstractVector{T};unique)

Decode an array of integers to an array of T corresponding to the elements of target

Parameters:

  • y: The vector to decode
  • target: The vector of elements to use for the encoding
  • unique: Whether target is already made of unique elements [def: true]

Return:

  • A vector of length(y) elements corresponding to the (unique) target elements at the position y

Example:

julia> integerDecoder([1, 2, 2, 3, 2, 1],["aa","cc","bb"]) # out: ["aa","cc","cc","bb","cc","aa"]
BetaML.Utils.integerEncoderMethod
integerEncoder(y;unique)

Encode an array of T to an array of integers using their position in the unique vector of the input array

Parameters:

  • y: The vector to encode
  • unique: Whether the vector to encode is already made of unique elements [def: false]

Return:

  • A vector of [1,length(Y)] integers corresponding to the position of each element in the "unique" version of the original input

Note:

  • Note that this function produces an ordered (and sortable) set of integers. If the original data is unordered, it is up to the user to ensure that this ordering is not relied upon in their code.

Example:

julia> integerEncoder(["aa","cc","cc","bb","cc","aa"]) # out: [1, 2, 2, 3, 2, 1]
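
The round trip between the two functions can be sketched without BetaML as follows. The helper names are hypothetical, and the encoder assumes that positions follow the order of first appearance, which is consistent with the documented examples.

```julia
# Hypothetical helpers reproducing the documented examples
function integer_encoder_sketch(y)
    levels = unique(y)                                   # order of first appearance
    lookup = Dict(v => i for (i, v) in enumerate(levels))
    return [lookup[v] for v in y]
end

integer_decoder_sketch(codes, target) = [target[c] for c in codes]

original = ["aa", "cc", "cc", "bb", "cc", "aa"]
codes = integer_encoder_sketch(original)
back  = integer_decoder_sketch(codes, ["aa", "cc", "bb"])
```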
BetaML.Utils.lseMethod

LogSumExp for efficiently computing log(sum(exp.(x)))
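
A common way to make this computation numerically safe is to subtract the maximum before exponentiating. The sketch below shows that formulation; whether BetaML's lse uses the same trick is an assumption, and the helper name is hypothetical.

```julia
# Numerically stable log(sum(exp.(x))): subtract the maximum before exponentiating
function lse_sketch(x)
    m = maximum(x)
    return m + log(sum(exp.(x .- m)))
end

v     = lse_sketch([1000.0, 1000.0])   # 1000 + log(2)
naive = log(sum(exp.([1000.0, 1000.0])))   # exp(1000) overflows, so this is Inf
```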

BetaML.Utils.meanDictsMethod

meanDicts(dicts)

Compute the mean of the values of an array of dictionaries.

Given dicts, an array of dictionaries, meanDicts first computes the union of the keys and then averages the values. If the original values are probabilities (non-negative items summing to 1), the result is also a probability distribution.
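
The two steps described above (union of the keys, then averaging) can be sketched as follows. The helper name is hypothetical, and treating a key that is missing from a dictionary as 0 is an assumption consistent with the probability interpretation.

```julia
# Hypothetical helper: average the values of several dictionaries key by key
function mean_dicts_sketch(dicts)
    allkeys = Set(k for d in dicts for k in keys(d))   # union of the keys
    n = length(dicts)
    # a key absent from a dictionary counts as 0
    return Dict(k => sum(get(d, k, 0.0) for d in dicts) / n for k in allkeys)
end

m = mean_dicts_sketch([Dict("a" => 0.2, "b" => 0.8), Dict("a" => 1.0)])
```

Since each input sums to 1, the averaged values also sum to 1.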

BetaML.Utils.meanRelErrorMethod

meanRelError(ŷ,y;normDim=true,normRec=true,p=1)

Compute the mean relative error (l-1 based by default) between ŷ and y.

There are many ways to compute a mean relative error. In particular, if normRec (normDim) is set to true, the records (dimensions) are normalised: it doesn't matter whether a record (dimension) is bigger or smaller than the others, because the relative error is first computed for each record (dimension) and then averaged. With both normDim and normRec set to false, the function returns the relative mean error; with both set to true (the default), it returns the mean relative error (i.e. with p=1, the "mean absolute percentage error (MAPE)"). The parameter p [def: 1] controls the p-norm used to define the error.
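
The difference between the two extreme settings is easiest to see on a small example. The sketch below covers only p = 1 and one-dimensional data, uses hypothetical helper names, and is an illustration of the two definitions rather than BetaML's implementation.

```julia
using Statistics

# p = 1 only; hypothetical helpers illustrating the two extreme settings
rel_mean_error_sketch(ŷ, y) = sum(abs.(ŷ .- y)) / sum(abs.(y))   # normDim = normRec = false
mean_rel_error_sketch(ŷ, y) = mean(abs.(ŷ .- y) ./ abs.(y))      # normDim = normRec = true

ŷ = [1.5, 200.0]
y = [1.0, 200.0]
rme = rel_mean_error_sketch(ŷ, y)   # 0.5 / 201: the small record barely matters
mre = mean_rel_error_sketch(ŷ, y)   # (0.5 + 0.0) / 2: each record weighs the same
```

The same absolute error of 0.5 on the small record gives a mean relative error of 25% but a relative mean error of about 0.25%.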

BetaML.Utils.modeMethod

mode(dicts)

Given a vector of dictionaries representing probabilities, return the mode of each element in terms of its key.

Use it to return a unique value from a multiclass classifier returning probabilities.

BetaML.Utils.oneHotEncoderFunction
oneHotEncoder(y,d;count)

Encode arrays (or arrays of arrays) of integer data as 0/1 matrices

Parameters:

  • y: The data to convert (integer, array or array of arrays of integers)
  • d: The number of dimensions in the output matrix [def: maximum(maximum.(Y))]
  • count: Whether to count multiple instances on the same dimension/record or indicate just presence [def: false]
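
For the simplest case, a vector of integers, the encoding can be sketched as below. The helper name is hypothetical, and the sketch does not cover arrays of arrays or the count option.

```julia
# Hypothetical helper: one row per record, one column per category
function one_hot_sketch(y::AbstractVector{<:Integer}, d=maximum(y))
    out = zeros(Int, length(y), d)
    for (i, k) in enumerate(y)
        out[i, k] = 1
    end
    return out
end

oh     = one_hot_sketch([2, 1, 3])
padded = one_hot_sketch([2, 1], 4)   # force four output dimensions
```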
BetaML.Utils.partitionMethod
partition(data,parts;shuffle=true)

Partition (by rows) one or more matrices according to the shares in parts.

Parameters

  • data: A matrix/vector or a vector of matrices/vectors
  • parts: A vector of the required shares (must sum to 1)
  • shuffle: Whether to randomly shuffle the matrices (preserving the relative order between matrices)

Example:

julia> x = [1:10 11:20]
julia> y = collect(31:40)
julia> ((xtrain,xtest),(ytrain,ytest)) = partition([x,y],[0.7,0.3])
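
The key property is that the same row permutation is applied to every item in data, so that records stay aligned across matrices. A minimal sketch of that idea follows; the helper name, the `rng` keyword and the rounding rule for the part boundaries are assumptions.

```julia
using Random

# Hypothetical helper: split every item of `data` by rows using the same row order
function partition_sketch(data::AbstractVector, parts; shuffle=true, rng=Random.default_rng())
    n   = size(data[1], 1)
    idx = shuffle ? randperm(rng, n) : collect(1:n)    # one permutation for all items
    bounds = [0; round.(Int, cumsum(parts) .* n)]       # cumulative row boundaries
    cut(m) = [collect(selectdim(m, 1, idx[(bounds[j]+1):bounds[j+1]])) for j in 1:length(parts)]
    return [cut(m) for m in data]
end

x = [1:10 11:20]
y = collect(31:40)
((xtrain, xtest), (ytrain, ytest)) = partition_sketch([x, y], [0.7, 0.3], shuffle=false)
```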

BetaML.Utils.pcaMethod

pca(X;K,error)

Perform Principal Component Analysis returning the matrix reprojected among the dimensions of maximum variance.

Parameters:

  • X : The (N,D) data to reproject
  • K : The number of dimensions to maintain (with K<=D) [def: nothing]
  • error: The maximum approximation error that we are willing to accept [def: 0.05]

Return:

  • A named tuple with:
    • X: The reprojected (NxK) matrix with the column dimensions organized in descending order of the proportion of explained variance
    • K: The number of dimensions retained
    • error: The actual proportion of variance not explained in the reprojected dimensions
    • P: The (D,K) matrix of the eigenvectors associated to the K-largest eigenvalues used to reproject the data matrix
    • explVarByDim: An array of dimensions D with the share of the cumulative variance explained by dimensions (the last element being always 1.0)

Notes:

  • If K is provided, the parameter error has no effect.
  • If neither the acceptable error nor the desired number of dimensions is known a priori, run this pca function with out = pca(X,K=size(X,2)) (i.e. with K=D), analyse the cumulative proportions of explained variance by dimension in out.explVarByDim, choose the number of dimensions K accordingly, and finally keep only the needed dimensions of the reprojected matrix, i.e. out.X[:,1:K].
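
How K, error and explVarByDim relate to each other can be sketched with an eigendecomposition of the covariance matrix. The helper `pca_sketch` is hypothetical; in particular, projecting the data without centering it first is an assumption, and the signs and exact values of the reprojected matrix may differ from those returned by pca.

```julia
using LinearAlgebra, Statistics

# Hypothetical helper: PCA through the eigendecomposition of the covariance matrix
function pca_sketch(X; K=nothing, error=0.05)
    Σ = cov(X)                               # (D,D) covariance matrix
    E = eigen(Symmetric(Σ))
    order = sortperm(E.values, rev=true)     # largest eigenvalue first
    λ = E.values[order]
    explVarByDim = cumsum(λ) ./ sum(λ)       # cumulative share of explained variance
    if K === nothing
        # smallest K whose unexplained share is within the accepted error
        K = findfirst(v -> v >= 1 - error, explVarByDim)
    end
    P = E.vectors[:, order[1:K]]             # (D,K) projection matrix
    return (X = X * P, K = K, error = 1 - explVarByDim[K], P = P, explVarByDim = explVarByDim)
end

X = [1 10 100; 1.1 15 120; 0.95 23 90; 0.99 17 120; 1.05 8 90; 1.1 12 95]
out = pca_sketch(X, error=0.05)
```

On this data the first component alone explains less than 95% of the variance, so two dimensions are kept.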

Example:

julia> X = [1 10 100; 1.1 15 120; 0.95 23 90; 0.99 17 120; 1.05 8 90; 1.1 12 95]
6×3 Array{Float64,2}:
 1.0   10.0  100.0
 1.1   15.0  120.0
 0.95  23.0   90.0
 0.99  17.0  120.0
 1.05   8.0   90.0
 1.1   12.0   95.0
julia> X = pca(X,error=0.05).X
6×2 Array{Float64,2}:
  3.1783   100.449
  6.80764  120.743
 16.8275    91.3551
  8.80372  120.878
  1.86179   90.3363
  5.51254   95.5965
BetaML.Utils.pluMethod

plu(x;α=0.1,c=1)

Piecewise Linear Unit

https://arxiv.org/pdf/1809.09534.pdf
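
Following the definition in the linked paper, the function is the identity on [-c, c] and has slope α outside that interval. A minimal sketch, with hypothetical `_sketch` names, is:

```julia
# PLU as defined in the referenced paper, and its derivative
plu_sketch(x; α=0.1, c=1)  = max(α * (x + c) - c, min(α * (x - c) + c, x))
dplu_sketch(x; α=0.1, c=1) = (x >= c || x <= -c) ? α : 1

inside = plu_sketch(0.5)    # identity region
above  = plu_sketch(2.0)    # 0.1 * (2 - 1) + 1
below  = plu_sketch(-2.0)   # 0.1 * (-2 + 1) - 1
```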

BetaML.Utils.polynomialKernelMethod

Polynomial kernel parametrised with c=0 and d=2 (i.e. a quadratic kernel). For other cᵢ and dᵢ use K = (x,y) -> polynomialKernel(x,y,c=cᵢ,d=dᵢ) as kernel function in the supporting algorithms

BetaML.Utils.radialKernelMethod

Radial Kernel (aka RBF kernel) parametrised with γ=1/2. For other gammas γᵢ use K = (x,y) -> radialKernel(x,y,γ=γᵢ) as kernel function in the supporting algorithms
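
The two kernels and the closure pattern suggested above can be sketched as follows. The definitions are the usual textbook ones, $(x \cdot y + c)^d$ and $\exp(-\gamma \lVert x - y \rVert^2)$, and the `_sketch` names are hypothetical.

```julia
using LinearAlgebra

# Textbook kernel definitions with the default parameters mentioned above
poly_kernel_sketch(x, y; c=0, d=2) = (dot(x, y) + c)^d
rbf_kernel_sketch(x, y; γ=1/2)     = exp(-γ * norm(x - y)^2)

# Fixing a non-default parameter through an anonymous function
K = (x, y) -> rbf_kernel_sketch(x, y, γ=2.0)

p = poly_kernel_sketch([1, 2], [3, 4])   # (3 + 8)^2
r = K([0.0], [1.0])                      # exp(-2)
```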

BetaML.Utils.reluMethod

relu(x)

Rectified Linear Unit

https://www.cs.toronto.edu/~hinton/absps/reluICML.pdf

BetaML.Utils.scaleFunction
scale(x,scaleFactors;rev)

Perform a linear scaling of x using scaling factors scaleFactors.

Parameters

  • x: The (n × d) dimension matrix to scale on each dimension d
  • scaleFactors: A tuple of the constant and multiplicative scaling factor respectively [def: the scaling factors needed to scale x to mean 0 and variance 1]
  • rev: Whether to invert the scaling [def: false]

Return

  • The scaled matrix

Notes:

  • scale!(x,scaleFactors) is also available for in-place scaling.
  • Retrieve the scale factors with the getScaleFactors() function
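
The interplay between the scale factors and the rev option can be sketched as below. The helper names are hypothetical, and the convention scaled = (x .+ shift) .* mult, as well as the use of the uncorrected standard deviation, are assumptions made for the illustration.

```julia
using Statistics

# Assumed convention: scaled = (x .+ shift) .* mult
function scale_factors_sketch(x)
    μ = mean(x, dims=1)
    σ = std(x, dims=1, corrected=false)
    return (-μ, 1 ./ σ)
end

function scale_sketch(x, sf; rev=false)
    shift, mult = sf
    return rev ? x ./ mult .- shift : (x .+ shift) .* mult
end

x  = [1.0 10.0; 2.0 20.0; 3.0 60.0]
sf = scale_factors_sketch(x)
xs = scale_sketch(x, sf)             # each column now has mean 0 and variance 1
xb = scale_sketch(xs, sf, rev=true)  # back to the original values
```

Keeping the scale factors around makes it possible to apply the same transformation to new data and to map predictions back to the original units.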
BetaML.Utils.softplusMethod

softplus(x)

https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#Softplus

BetaML.Utils.squaredCostMethod

squaredCost(ŷ,y)

Compute the squared cost between a vector of predictions and one of observations as (1/2)*norm(y - ŷ)^2.

Aside from the 1/2 term, it corresponds to the squared l-2 norm distance, and when averaged over multiple datapoints it corresponds to the Mean Squared Error. It is mostly used for regression problems.
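
The formula above translates directly into a one-line sketch (the `_sketch` name is hypothetical):

```julia
using LinearAlgebra

# (1/2) * squared l-2 distance between observations and predictions
squared_cost_sketch(ŷ, y) = (1/2) * norm(y - ŷ)^2

cost = squared_cost_sketch([1.0, 2.0], [1.0, 4.0])   # (1/2) * (0^2 + 2^2)
```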