The BetaML.Perceptron Module

BetaML.Perceptron — Module

Perceptron module

Provide linear and kernel classifiers.

See a runnable example on myBinder

perceptron: Train data using the classical perceptron
kernelPerceptron: Train data using the kernel perceptron
pegasos: Train data using the pegasos algorithm
predict: Predict data using parameters from one of the above algorithms

All algorithms are multiclass: perceptron and pegasos employ a one-vs-all strategy, while kernelPerceptron employs a one-vs-one approach. For each record they return a "probability" for each class, in the form of a dictionary. Use mode(ŷ) to return a single class prediction per record.
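As an illustration of the dictionary-per-record output described above, the following minimal sketch collapses a vector of label=>score dictionaries into one label per record. The helper `pickLabel` and the sample scores are hypothetical and written only for this example; in BetaML itself the documented way to do this is mode(ŷ).

```julia
# Hypothetical helper (not part of BetaML): return the label with the highest score.
function pickLabel(scores::Dict)
    bestLabel, bestScore = first(scores)      # start from an arbitrary entry
    for (label, score) in scores
        if score > bestScore
            bestLabel, bestScore = label, score
        end
    end
    return bestLabel
end

# Made-up predictions for two records, in the label=>score format described above.
ŷ = [Dict("a" => 0.7, "b" => 0.2, "c" => 0.1),
     Dict("a" => 0.1, "b" => 0.3, "c" => 0.6)]

labels = [pickLabel(d) for d in ŷ]            # one label per record
```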

The binary equivalent algorithms, accepting only {-1,+1} labels, are available as perceptronBinary, kernelPerceptronBinary and pegasosBinary. They are slightly faster, as they don't need to be wrapped in the multi-class equivalent, and return a more informative output.

The multi-class versions are available in the MLJ framework as PerceptronClassifier, KernelPerceptronClassifier and PegasosClassifier respectively.

Detailed API

BetaML.Perceptron.kernelPerceptron — Method

kernelPerceptron(x,y;K,T,α,nMsgs,shuffle)

Train a multiclass kernel classifier "perceptron" algorithm based on x and y.

kernelPerceptron is a (potentially) non-linear perceptron-style classifier employing user-defined kernel functions. Multiclass is supported using a one-vs-one approach.

Parameters:

x: Feature matrix of the training data (n × d)
y: Associated labels of the training data, in the format of ±1
K: Kernel function to employ. See ?radialKernel or ?polynomialKernel for details, or check ?BetaML.Utils to verify whether other kernels are defined (you can always define your own kernel) [def: radialKernel]
T: Maximum number of iterations (aka "epochs") across the whole set (if the set is not fully classified earlier) [def: 100]
α: Initial distribution of the errors [def: zeros(length(y))]
nMsgs: Maximum number of messages to show if all iterations are done [def: 0]
shuffle: Whether to randomly shuffle the data at each iteration [def: false]

Return a named tuple with:

x: The x data (possibly shuffled if shuffle=true)
y: The labels
α: The errors associated to each record
classes: The label classes encountered in the training

Notes:

The trained model can then be used to make predictions using the function predict().
This model is available in the MLJ framework as the KernelPerceptronClassifier.

Example:

julia> xtrain = [1.1 2.1; 5.3 4.2; 1.8 1.7]
julia> ytrain = [-1,1,-1]
julia> model  = kernelPerceptron(xtrain, ytrain)
julia> ŷtrain = Perceptron.predict(xtrain,model.x,model.y,model.α, model.classes,K=model.K)
julia> ϵtrain = error(ytrain, mode(ŷtrain))

BetaML.Perceptron.kernelPerceptronBinary — Method

kernelPerceptronBinary(x,y;K,T,α,nMsgs,shuffle)

Train a binary kernel classifier "perceptron" algorithm based on x and y.

Parameters:

x: Feature matrix of the training data (n × d)
y: Associated labels of the training data, in the format of ±1
K: Kernel function to employ. See ?radialKernel or ?polynomialKernel for details, or check ?BetaML.Utils to verify whether other kernels are defined (you can always define your own kernel) [def: radialKernel]
T: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
α: Initial distribution of the errors [def: zeros(length(y))]
nMsgs: Maximum number of messages to show if all iterations are done
shuffle: Whether to randomly shuffle the data at each iteration [def: false]

Return a named tuple with:

x: the x data (possibly shuffled if shuffle=true)
y: the labels
α: the errors associated to each record
errors: the number of errors in the last iteration
besterrors: the minimum number of errors in classifying the data ever reached
iterations: the actual number of iterations performed
separated: whether the data has been successfully separated

Notes:

The trained model can then be used to make predictions using the function predict(). If the option shuffle has been used, it is important to pass to predict() the returned (x,y,α), as these will have been shuffled compared with the original (x,y).
See kernelPerceptron for a multi-class version.

Example:

julia> model = kernelPerceptronBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
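To make the roles of K, T and α concrete, here is a minimal, self-contained sketch of a textbook binary kernel perceptron. It is not BetaML's implementation: the names `rbf`, `kernelPerceptronSketch` and `kernelScore`, and the bandwidth `γ`, are assumptions introduced for this example. Each α[i] counts how many times record i was misclassified during training, which is what the "errors associated to each record" refers to.

```julia
# Gaussian (radial) kernel; the bandwidth γ is an assumption of this sketch.
rbf(a, b; γ=1.0) = exp(-γ * sum((a .- b) .^ 2))

# Textbook binary kernel perceptron: y must contain only -1 and +1.
function kernelPerceptronSketch(x, y; K=rbf, T=100)
    n = size(x, 1)
    α = zeros(Int, n)                      # mistake counter per record
    for epoch in 1:T
        errors = 0
        for i in 1:n
            score = sum(α[j] * y[j] * K(x[j, :], x[i, :]) for j in 1:n)
            if y[i] * score <= 0           # misclassified (or undecided)
                α[i] += 1
                errors += 1
            end
        end
        errors == 0 && break               # stop early once fully classified
    end
    return α
end

# Decision value for a new record: weighted kernel similarity to the training set.
kernelScore(xnew, x, y, α; K=rbf) =
    sum(α[j] * y[j] * K(x[j, :], xnew) for j in 1:length(y))

x = [1.1 2.1; 5.3 4.2; 1.8 1.7]
y = [-1, 1, -1]
α = kernelPerceptronSketch(x, y)
ŷ = [kernelScore(x[i, :], x, y, α) > 0 ? 1 : -1 for i in 1:size(x, 1)]
```

Because prediction needs the training records, their labels and α together, any shuffling applied during training has to be applied consistently to all three.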

BetaML.Perceptron.pegasos — Method

pegasos(x,y;θ,θ₀,λ,η,T,nMsgs,shuffle,forceOrigin,returnMeanHyperplane)

Train the multiclass classifier "pegasos" algorithm according to x (features) and y (labels)

Pegasos is a linear, gradient-based classifier. Multiclass is supported using a one-vs-all approach.

Parameters:

x: Feature matrix of the training data (n × d)
y: Associated labels of the training data, can be in any format (strings, integers, ...)
θ: Initial value of the weights (parameter) [def: zeros(d)]
θ₀: Initial value of the weight (parameter) associated to the constant term [def: 0]
λ: Multiplicative term of the learning rate
η: Learning rate [def: (t -> 1/sqrt(t))]
T: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
nMsgs: Maximum number of messages to show if all iterations are done
shuffle: Whether to randomly shuffle the data at each iteration [def: false]
forceOrigin: Whether to force θ₀ to remain zero [def: false]
returnMeanHyperplane: Whether to return the average hyperplane coefficients instead of the final ones [def: false]

Return a named tuple with:

θ: The weights of the classifier
θ₀: The weight of the classifier associated to the constant term
classes: The classes (unique values) of y

Notes:

The trained parameters can then be used to make predictions using the function predict().
This model is available in the MLJ framework as the PegasosClassifier.

Example:

julia> model = pegasos([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷ     = predict([2.1 3.1; 7.3 5.2], model.θ, model.θ₀, model.classes)

BetaML.Perceptron.pegasosBinary — Method

pegasosBinary(x,y;θ,θ₀,λ,η,T,nMsgs,shuffle,forceOrigin)

Train the binary classifier "pegasos" algorithm based on x and y (labels)

Parameters:

x: Feature matrix of the training data (n × d)
y: Associated labels of the training data, in the format of ±1
θ: Initial value of the weights (parameter) [def: zeros(d)]
θ₀: Initial value of the weight (parameter) associated to the constant term [def: 0]
λ: Multiplicative term of the learning rate
η: Learning rate [def: (t -> 1/sqrt(t))]
T: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
nMsgs: Maximum number of messages to show if all iterations are done
shuffle: Whether to randomly shuffle the data at each iteration [def: false]
forceOrigin: Whether to force θ₀ to remain zero [def: false]

Return a named tuple with:

θ: The final weights of the classifier
θ₀: The final weight of the classifier associated to the constant term
avgθ: The average weights of the classifier
avgθ₀: The average weight of the classifier associated to the constant term
errors: The number of errors in the last iteration
besterrors: The minimum number of errors in classifying the data ever reached
iterations: The actual number of iterations performed
separated: Whether the data has been successfully separated

Notes:

The trained parameters can then be used to make predictions using the function predict().

Example:

julia> model = pegasosBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
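The interplay of λ and η can be illustrated with a generic Pegasos-style update, sketched below. This is an assumption-laden illustration rather than BetaML's code: the function name `pegasosSketch`, the default `λ=0.5`, and the choice to advance the step counter `t` after every record are all choices made for this example.

```julia
using LinearAlgebra: dot

# Generic Pegasos-style training loop for labels in {-1, +1}.
function pegasosSketch(x, y; λ=0.5, η=t -> 1 / sqrt(t), T=1000)
    n, d = size(x)
    θ, θ₀ = zeros(d), 0.0
    t = 1
    for epoch in 1:T, i in 1:n
        ηt = η(t)                                    # current learning rate
        if y[i] * (dot(θ, x[i, :]) + θ₀) <= 1        # margin violated
            θ  = (1 - ηt * λ) .* θ .+ ηt * y[i] .* x[i, :]
            θ₀ += ηt * y[i]
        else
            θ  = (1 - ηt * λ) .* θ                   # only shrink the weights
        end
        t += 1
    end
    return (θ=θ, θ₀=θ₀)
end

model = pegasosSketch([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1, 1, -1])
```

Unlike the plain perceptron, the weights are shrunk by the factor (1 - ηt * λ) at every step, so λ acts as a regularisation strength scaled by the learning rate.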

BetaML.Perceptron.perceptron — Method

perceptron(x,y;θ,θ₀,T,nMsgs,shuffle,forceOrigin,returnMeanHyperplane)

Train the multiclass classifier "perceptron" algorithm based on x and y (labels).

The perceptron is a linear classifier. Multiclass is supported using a one-vs-all approach.

Parameters:

x: Feature matrix of the training data (n × d)
y: Associated labels of the training data, can be in any format (strings, integers, ...)
θ: Initial value of the weights (parameter) [def: zeros(d)]
θ₀: Initial value of the weight (parameter) associated to the constant term [def: 0]
T: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
nMsgs: Maximum number of messages to show if all iterations are done [def: 0]
shuffle: Whether to randomly shuffle the data at each iteration [def: false]
forceOrigin: Whether to force θ₀ to remain zero [def: false]
returnMeanHyperplane: Whether to return the average hyperplane coefficients instead of the final ones [def: false]

Return a named tuple with:

θ: The weights of the classifier
θ₀: The weight of the classifier associated to the constant term
classes: The classes (unique values) of y

Notes:

The trained parameters can then be used to make predictions using the function predict().
This model is available in the MLJ framework as the PerceptronClassifier.

Example:

julia> model = perceptron([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷ     = predict([2.1 3.1; 7.3 5.2], model.θ, model.θ₀, model.classes)
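The one-vs-all strategy mentioned above can be sketched as follows: each class has its own "class vs rest" hyperplane, and a record is scored against all of them. The function `oneVsAllScores`, the sample coefficients and the use of a softmax to turn raw scores into pseudo-probabilities are assumptions of this sketch, not a description of BetaML's internals.

```julia
# Score one record against one hyperplane per class and normalise the result.
function oneVsAllScores(xrow, θs, θ₀s, classes)
    raw = [sum(θs[k] .* xrow) + θ₀s[k] for k in 1:length(classes)]
    ex  = exp.(raw .- maximum(raw))        # softmax, shifted for numerical stability
    p   = ex ./ sum(ex)
    return Dict(classes[k] => p[k] for k in 1:length(classes))
end

# Made-up coefficients for three classes in a two-feature space.
θs  = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
θ₀s = [0.0, 0.0, 0.0]
scores = oneVsAllScores([2.0, 0.5], θs, θ₀s, ["a", "b", "c"])
```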

BetaML.Perceptron.perceptronBinary — Method

perceptronBinary(x,y;θ,θ₀,T,nMsgs,shuffle,forceOrigin)

Train the binary classifier "perceptron" algorithm based on x and y (labels)

Parameters:

x: Feature matrix of the training data (n × d)
y: Associated labels of the training data, in the format of ±1
θ: Initial value of the weights (parameter) [def: zeros(d)]
θ₀: Initial value of the weight (parameter) associated to the constant term [def: 0]
T: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
nMsgs: Maximum number of messages to show if all iterations are done
shuffle: Whether to randomly shuffle the data at each iteration [def: false]
forceOrigin: Whether to force θ₀ to remain zero [def: false]

Return a named tuple with:

θ: The final weights of the classifier
θ₀: The final weight of the classifier associated to the constant term
avgθ: The average weights of the classifier
avgθ₀: The average weight of the classifier associated to the constant term
errors: The number of errors in the last iteration
besterrors: The minimum number of errors in classifying the data ever reached
iterations: The actual number of iterations performed
separated: Whether the data has been successfully separated

Notes:

The trained parameters can then be used to make predictions using the function predict().

Example:

julia> model = perceptronBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
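For readers who want to see what the training loop does, here is a minimal sketch of the classical binary perceptron update. It is a textbook version written for this page, not BetaML's source: `perceptronSketch` is a hypothetical name, and only a subset of the outputs listed above (θ, θ₀, separated) is reproduced.

```julia
using LinearAlgebra: dot

# Classical perceptron for labels in {-1, +1}.
function perceptronSketch(x, y; T=1000)
    n, d = size(x)
    θ, θ₀ = zeros(d), 0.0
    for epoch in 1:T
        errors = 0
        for i in 1:n
            if y[i] * (dot(θ, x[i, :]) + θ₀) <= 0    # misclassified record
                θ  += y[i] .* x[i, :]                # move the hyperplane towards it
                θ₀ += y[i]
                errors += 1
            end
        end
        # A full pass without mistakes means the data is separated.
        errors == 0 && return (θ=θ, θ₀=θ₀, separated=true)
    end
    return (θ=θ, θ₀=θ₀, separated=false)
end

x = [1.1 2.1; 5.3 4.2; 1.8 1.7]
y = [-1, 1, -1]
model = perceptronSketch(x, y)
ŷ = sign.(x * model.θ .+ model.θ₀)
```

On linearly separable data such as this toy set the loop stops well before T epochs; otherwise it runs all T epochs and reports separated=false.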

BetaML.Perceptron.predict — Function

predict(x,θ,θ₀)

Predict a binary label {-1,1} given the feature vector and the linear coefficients

Parameters:

x: Feature matrix of the data to predict (n × d)
θ: The trained parameters
θ₀: The trained bias parameter [def: 0]

Return:

y: Vector of the predicted labels

Example:

julia> predict([1.1 2.1; 5.3 4.2; 1.8 1.7], [3.2,1.2])
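Conceptually, a linear binary prediction is the sign of θ·x + θ₀ for each row of x. The sketch below reproduces that rule in plain Julia; `linearPredictSketch` is a hypothetical name, and mapping a score of exactly zero to -1 is an assumption of this sketch.

```julia
# Sign of the linear score for each row of x; a zero score is mapped to -1 here.
linearPredictSketch(x, θ, θ₀=0.0) = [s > 0 ? 1 : -1 for s in x * θ .+ θ₀]

ŷ = linearPredictSketch([1.1 2.1; 5.3 4.2; 1.8 1.7], [3.2, 1.2])
```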

BetaML.Perceptron.predict — Method

predict(x,xtrain,ytrain,α;K)

Predict a binary label {-1,1} given the feature vector and the training data together with their errors (as trained by a kernel perceptron algorithm)

Parameters:

x: Feature matrix of the data to predict (n × d)
xtrain: The feature vectors used for the training
ytrain: The labels of the training set
α: The errors associated to each record
K: The kernel function used for the training and to be used for the prediction [def: radialKernel]

Return:

y: Vector of the predicted labels

Example:

julia> model = kernelPerceptronBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷ     = predict([2.1 3.1; 7.3 5.2], model.x, model.y, model.α)

BetaML.Perceptron.predict — Method

predict(x,xtrain,ytrain,α,classes;K)

Predict a multiclass label given the new feature vector and a trained kernel perceptron model.

Parameters:

x: Feature matrix of the data to predict (n × d)
xtrain: A vector of the feature matrix used for training each of the one-vs-one class matches (i.e. model.x)
ytrain: A vector of the label vector used for training each of the one-vs-one class matches (i.e. model.y)
α: A vector of the errors associated to each record (i.e. model.α)
classes: The overall classes encountered in training (i.e. model.classes)
K: The kernel function used for the training and to be used for the prediction [def: radialKernel]

Return:

ŷ: Vector of dictionaries label=>probability (warning: it isn't really a probability, it is just the standardized number of matches "won" by this class compared with the other classes)

Notes:

Use mode(ŷ) if you want a single predicted label per record

Example:

julia> model  = kernelPerceptron([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷtrain = Perceptron.predict([10 10; 2.2 2.5],model.x,model.y,model.α, model.classes,K=model.K)
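The "matches won" idea behind the one-vs-one scores can be illustrated with the sketch below. Every pair of classes plays one match, decided by a binary classifier trained on those two classes only, and the win counts are then normalised. The function `oneVsOneScores`, the stand-in `duel` and the normalisation by the total number of matches are assumptions made for this example; BetaML may standardise the counts differently.

```julia
# `duel(a, b)` is a hypothetical stand-in for a trained binary classifier
# deciding between classes a and b for one record; it returns the winner.
function oneVsOneScores(classes, duel)
    wins = Dict(c => 0 for c in classes)
    for i in 1:length(classes)-1, j in i+1:length(classes)
        wins[duel(classes[i], classes[j])] += 1
    end
    total = sum(values(wins))              # number of matches played
    return Dict(c => w / total for (c, w) in wins)
end

# Toy duel: the alphabetically smaller label always wins.
scores = oneVsOneScores(["a", "b", "c"], (p, q) -> min(p, q))
```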

BetaML.Perceptron.predict — Method

predict(x,θ,θ₀,classes)

Predict a multiclass label given the feature vector, the linear coefficients and the classes vector

Parameters:

x: Feature matrix of the data to predict (n × d)
θ: Vector of the trained parameters for each one-vs-all model (i.e. model.θ)
θ₀: Vector of the trained bias parameter for each one-vs-all model (i.e. model.θ₀)
classes: The overall classes encountered in training (i.e. model.classes)

Return:

ŷ: Vector of dictionaries label=>probability

Notes:

Use mode(ŷ) if you want a single predicted label per record

Example:

```julia
julia> model  = perceptron([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷtrain = predict([10 10; 2.5 2.5],model.θ,model.θ₀, model.classes)
```