The BetaML.Perceptron Module

BetaML.Perceptron - Module
Perceptron module

Provide linear and kernel classifiers.

See a runnable example on myBinder

  • perceptron: Train a classifier using the classical perceptron algorithm
  • kernelPerceptron: Train a classifier using the kernel perceptron algorithm
  • pegasos: Train a classifier using the pegasos algorithm
  • predict: Predict labels using the parameters returned by one of the above algorithms

All algorithms are multiclass: perceptron and pegasos employ a one-vs-all strategy, while kernelPerceptron employs a one-vs-one approach. For each record they return a "probability" for each class, in the form of a dictionary. Use mode(ŷ) to obtain a single class prediction per record.
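
To make the output format concrete, here is a minimal sketch in plain Julia rather than BetaML's own code. It builds a hand-written vector of label=>score dictionaries shaped like the documented output and keeps the highest-scoring label of each record. The helper name pick_label and the scores are hypothetical; in BetaML itself, mode(ŷ) plays this role.

```julia
# Hypothetical predictions for two records, shaped like the documented output:
# one Dict per record, mapping each class label to its "probability".
ŷ = [Dict("a" => 0.7, "b" => 0.2, "c" => 0.1),
     Dict("a" => 0.1, "b" => 0.3, "c" => 0.6)]

# Hypothetical stand-in for mode: findmax on a Dict returns (value, key),
# so the second element is the label with the highest score.
pick_label(d::Dict) = findmax(d)[2]

labels = [pick_label(d) for d in ŷ]
```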

The binary equivalents, which accept only {-1,+1} labels, are available as perceptronBinary, kernelPerceptronBinary and pegasosBinary. They are slightly faster, as they do not need to be wrapped in the multiclass equivalent, and they return a more informative output.

The multiclass versions are available in the MLJ framework as PerceptronClassifier, KernelPerceptronClassifier and PegasosClassifier respectively.

Detailed API

BetaML.Perceptron.kernelPerceptron - Method

kernelPerceptron(x,y;K,T,α,nMsgs,shuffle)

Train a multiclass kernel classifier "perceptron" algorithm based on x and y.

kernelPerceptron is a (potentially) non-linear perceptron-style classifier employing user-defined kernel functions. Multiclass is supported using a one-vs-one approach.

Parameters:

  • x: Feature matrix of the training data (n × d)
  • y: Associated labels of the training data, in the format of ±1
  • K: Kernel function to employ. See ?radialKernel or ?polynomialKernel for details, or check ?BetaML.Utils to verify if other kernels are defined (you can always define your own kernel) [def: radialKernel]
  • T: Maximum number of iterations (aka "epochs") across the whole set (if the set is not fully classified earlier) [def: 100]
  • α: Initial distribution of the errors [def: zeros(length(y))]
  • nMsgs: Maximum number of messages to show if all iterations are done [def: 0]
  • shuffle: Whether to randomly shuffle the data at each iteration [def: false]
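
As a sketch of the "define your own kernel" remark in the K parameter, the snippet below defines a quadratic kernel in plain Julia. It assumes, without confirmation from this page, that a kernel is any function taking two feature vectors and returning a scalar. The name quadratic_kernel is hypothetical, and the commented call only reuses the documented K keyword.

```julia
using LinearAlgebra: dot

# Hypothetical user-defined kernel (1 + a⋅b)^2.
# Assumption: a kernel maps two feature vectors to a scalar.
quadratic_kernel(a, b) = (1 + dot(a, b))^2

k = quadratic_kernel([1.0, 2.0], [3.0, 4.0])   # (1 + 3 + 8)^2

# With BetaML loaded, the kernel would be passed through the documented K keyword:
# model = kernelPerceptron([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1], K=quadratic_kernel)
```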

Return a named tuple with:

  • x: The x data (possibly shuffled if shuffle=true)
  • y: The labels
  • α: The errors associated to each record
  • classes: The label classes encountered in the training

Notes:

  • The trained model can then be used to make predictions using the function predict().
  • This model is available in the MLJ framework as the KernelPerceptronClassifier

Example:

julia> xtrain, ytrain = [1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1]
julia> model  = kernelPerceptron(xtrain, ytrain)
julia> ŷtrain = Perceptron.predict(xtrain,model.x,model.y,model.α, model.classes,K=model.K)
julia> ϵtrain = error(ytrain, mode(ŷtrain))

BetaML.Perceptron.kernelPerceptronBinary - Method

kernelPerceptronBinary(x,y;K,T,α,nMsgs,shuffle)

Train a binary kernel classifier "perceptron" algorithm based on x and y.

Parameters:

  • x: Feature matrix of the training data (n × d)
  • y: Associated labels of the training data, in the format of ±1
  • K: Kernel function to employ. See ?radialKernel or ?polynomialKernel for details, or check ?BetaML.Utils to verify if other kernels are defined (you can always define your own kernel) [def: radialKernel]
  • T: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
  • α: Initial distribution of the errors [def: zeros(length(y))]
  • nMsgs: Maximum number of messages to show if all iterations are done
  • shuffle: Whether to randomly shuffle the data at each iteration [def: false]

Return a named tuple with:

  • x: the x data (possibly shuffled if shuffle=true)
  • y: the labels
  • α: the errors associated to each record
  • errors: the number of errors in the last iteration
  • besterrors: the minimum number of errors in classifying the data ever reached
  • iterations: the actual number of iterations performed
  • separated: whether the data has been successfully separated

Notes:

  • The trained model can then be used to make predictions using the function predict(). If the shuffle option has been used, it is important to pass the returned (x,y,α) there, as these will have been shuffled relative to the original (x,y).
  • See kernelPerceptron for a multiclass version.
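
The role of α as a per-record error count can be illustrated with a self-contained sketch of the textbook kernel perceptron. This is not BetaML's implementation: the function names, the Gaussian kernel rbf and its γ parameter are assumptions made for the illustration, and BetaML's radialKernel may be parametrised differently.

```julia
using LinearAlgebra: norm

# Assumed Gaussian kernel, used only for this sketch.
rbf(a, b; γ=1.0) = exp(-γ * norm(a .- b)^2)

# Decision value of a point: kernel responses of the records that caused errors so far.
score(p, x, y, α, K) = sum(α[j] * y[j] * K(x[j, :], p) for j in 1:length(y))

# α[i] counts how many times record i was misclassified during training.
function kernel_perceptron_sketch(x, y; K=rbf, T=100)
    n = size(x, 1)
    α = zeros(Int, n)
    for t in 1:T
        errors = 0
        for i in 1:n
            if y[i] * score(x[i, :], x, y, α, K) <= 0   # wrong side, or on the boundary
                α[i] += 1
                errors += 1
            end
        end
        errors == 0 && break                            # data separated: stop early
    end
    return α
end

# Prediction reuses the training data, the labels and the error counts.
predict_sketch(xnew, x, y, α; K=rbf) =
    [sign(score(xnew[i, :], x, y, α, K)) for i in 1:size(xnew, 1)]

x = [1.1 2.1; 5.3 4.2; 1.8 1.7]
y = [-1, 1, -1]
α = kernel_perceptron_sketch(x, y)
ŷ = predict_sketch(x, x, y, α)
```

In this toy run only the first two records are ever misclassified, so the third keeps a zero count and contributes nothing to later predictions.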

Example:

julia> model = kernelPerceptronBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])

BetaML.Perceptron.pegasos - Method

pegasos(x,y;θ,θ₀,λ,η,T,nMsgs,shuffle,forceOrigin,returnMeanHyperplane)

Train the multiclass classifier "pegasos" algorithm according to x (features) and y (labels)

Pegasos is a linear, gradient-based classifier. Multiclass is supported using a one-vs-all approach.

Parameters:

  • x: Feature matrix of the training data (n × d)
  • y: Associated labels of the training data, can be in any format (strings, integers, ...)
  • θ: Initial value of the weights (parameter) [def: zeros(d)]
  • θ₀: Initial value of the weight (parameter) associated to the constant term [def: 0]
  • λ: Multiplicative term of the learning rate
  • η: Learning rate [def: (t -> 1/sqrt(t))]
  • T: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
  • nMsgs: Maximum number of messages to show if all iterations are done
  • shuffle: Whether to randomly shuffle the data at each iteration [def: false]
  • forceOrigin: Whether to force θ₀ to remain zero [def: false]
  • returnMeanHyperplane: Whether to return the average hyperplane coefficients instead of the final ones [def: false]

Return a named tuple with:

  • θ: The weights of the classifier
  • θ₀: The weight of the classifier associated to the constant term
  • classes: The classes (unique values) of y

Notes:

  • The trained parameters can then be used to make predictions using the function predict().
  • This model is available in the MLJ framework as the PegasosClassifier

Example:

julia> model = pegasos([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷ     = predict([2.1 3.1; 7.3 5.2], model.θ, model.θ₀, model.classes)

BetaML.Perceptron.pegasosBinary - Method

pegasosBinary(x,y;θ,θ₀,λ,η,T,nMsgs,shuffle,forceOrigin)

Train the binary classifier "pegasos" algorithm based on x and y (labels)

Parameters:

  • x: Feature matrix of the training data (n × d)
  • y: Associated labels of the training data, in the format of ±1
  • θ: Initial value of the weights (parameter) [def: zeros(d)]
  • θ₀: Initial value of the weight (parameter) associated to the constant term [def: 0]
  • λ: Multiplicative term of the learning rate
  • η: Learning rate [def: (t -> 1/sqrt(t))]
  • T: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
  • nMsgs: Maximum number of messages to show if all iterations are done
  • shuffle: Whether to randomly shuffle the data at each iteration [def: false]
  • forceOrigin: Whether to force θ₀ to remain zero [def: false]

Return a named tuple with:

  • θ: The final weights of the classifier
  • θ₀: The final weight of the classifier associated to the constant term
  • avgθ: The average weights of the classifier
  • avgθ₀: The average weight of the classifier associated to the constant term
  • errors: The number of errors in the last iteration
  • besterrors: The minimum number of errors in classifying the data ever reached
  • iterations: The actual number of iterations performed
  • separated: Whether the data has been successfully separated

Notes:

  • The trained parameters can then be used to make predictions using the function predict().
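
How λ and η interact can be sketched with the textbook Pegasos update, shown below in plain Julia. This is an illustration under assumptions, not BetaML's code: the function name is hypothetical, λ=0.5 is an assumed value (this page gives no default for λ), the bias update is one common variant, and the early stop and averaged weights described above are left out for brevity.

```julia
using LinearAlgebra: dot

# Textbook Pegasos-style binary trainer on ±1 labels (sketch).
# λ = 0.5 is an assumption; η follows the documented default t -> 1/sqrt(t).
function pegasos_sketch(x, y; λ=0.5, η=(t -> 1/sqrt(t)), T=1000)
    n, d = size(x)
    θ, θ₀ = zeros(d), 0.0
    t = 1
    for epoch in 1:T
        for i in 1:n
            ηt = η(t)
            xi = x[i, :]
            if y[i] * (dot(θ, xi) + θ₀) <= 1
                # Margin violated: shrink the weights and step towards the record
                θ  = (1 - ηt * λ) .* θ .+ ηt * y[i] .* xi
                θ₀ = θ₀ + ηt * y[i]
            else
                # Margin respected: only the regularisation shrinkage is applied
                θ  = (1 - ηt * λ) .* θ
            end
            t += 1
        end
    end
    return (θ=θ, θ₀=θ₀)
end

# Full run on the data used in the examples of this page
model = pegasos_sketch([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1, 1, -1])

# A single step on a single record, to make the update visible:
# η(1) = 1, so θ = 0.5 * [0, 0] + 1 * [2, 0] and θ₀ = 0 + 1
first_step = pegasos_sketch([2.0 0.0], [1]; T=1)
```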

Example:

julia> pegasosBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])

BetaML.Perceptron.perceptron - Method

perceptron(x,y;θ,θ₀,T,nMsgs,shuffle,forceOrigin,returnMeanHyperplane)

Train the multiclass classifier "perceptron" algorithm based on x and y (labels).

The perceptron is a linear classifier. Multiclass is supported using a one-vs-all approach.

Parameters:

  • x: Feature matrix of the training data (n × d)
  • y: Associated labels of the training data, can be in any format (strings, integers, ...)
  • θ: Initial value of the weights (parameter) [def: zeros(d)]
  • θ₀: Initial value of the weight (parameter) associated to the constant term [def: 0]
  • T: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
  • nMsgs: Maximum number of messages to show if all iterations are done [def: 0]
  • shuffle: Whether to randomly shuffle the data at each iteration [def: false]
  • forceOrigin: Whether to force θ₀ to remain zero [def: false]
  • returnMeanHyperplane: Whether to return the average hyperplane coefficients instead of the final ones [def: false]

Return a named tuple with:

  • θ: The weights of the classifier
  • θ₀: The weight of the classifier associated to the constant term
  • classes: The classes (unique values) of y

Notes:

  • The trained parameters can then be used to make predictions using the function predict().
  • This model is available in the MLJ framework as the PerceptronClassifier
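
The one-vs-all idea can be sketched in a few lines of plain Julia: one binary model is trained per class, with that class relabelled +1 and every other class -1. The code below is a hedged illustration, not BetaML's implementation. All function names and the toy data are hypothetical, and picking the class with the largest raw score is an assumption that stands in for the "probability" dictionaries returned by the library.

```julia
using LinearAlgebra: dot

# Plain binary perceptron on ±1 labels, used here only as a building block.
function binary_perceptron(x, y; T=1000)
    n, d = size(x)
    θ, θ₀ = zeros(d), 0.0
    for epoch in 1:T
        errors = 0
        for i in 1:n
            if y[i] * (dot(θ, x[i, :]) + θ₀) <= 0
                θ  = θ .+ y[i] .* x[i, :]
                θ₀ = θ₀ + y[i]
                errors += 1
            end
        end
        errors == 0 && break
    end
    return θ, θ₀
end

# One-vs-all: one binary model per class, "this class" (+1) against "the rest" (-1).
function one_vs_all(x, y; T=1000)
    classes = unique(y)
    models  = [binary_perceptron(x, [yi == c ? 1 : -1 for yi in y]; T=T) for c in classes]
    return (θ=[m[1] for m in models], θ₀=[m[2] for m in models], classes=classes)
end

# Score every class for every record and keep the best-scoring class.
predict_class(x, m) =
    [m.classes[argmax([dot(m.θ[k], x[i, :]) + m.θ₀[k] for k in eachindex(m.classes)])]
     for i in 1:size(x, 1)]

# Made-up toy data with three classes
x = [1.0 1.0; 1.2 0.8; 5.0 5.0; 5.2 4.9; 1.0 5.0; 0.8 5.3]
y = ["a", "a", "b", "b", "c", "c"]
m = one_vs_all(x, y)
ŷ = predict_class(x, m)
```

The returned tuple mirrors the documented layout: θ and θ₀ hold one entry per class, in the same order as classes.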

Example:

julia> model = perceptron([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷ     = predict([2.1 3.1; 7.3 5.2], model.θ, model.θ₀, model.classes)

BetaML.Perceptron.perceptronBinary - Method

perceptronBinary(x,y;θ,θ₀,T,nMsgs,shuffle,forceOrigin)

Train the binary classifier "perceptron" algorithm based on x and y (labels)

Parameters:

  • x: Feature matrix of the training data (n × d)
  • y: Associated labels of the training data, in the format of ±1
  • θ: Initial value of the weights (parameter) [def: zeros(d)]
  • θ₀: Initial value of the weight (parameter) associated to the constant term [def: 0]
  • T: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
  • nMsgs: Maximum number of messages to show if all iterations are done
  • shuffle: Whether to randomly shuffle the data at each iteration [def: false]
  • forceOrigin: Whether to force θ₀ to remain zero [def: false]

Return a named tuple with:

  • θ: The final weights of the classifier
  • θ₀: The final weight of the classifier associated to the constant term
  • avgθ: The average weights of the classifier
  • avgθ₀: The average weight of the classifier associated to the constant term
  • errors: The number of errors in the last iteration
  • besterrors: The minimum number of errors in classifying the data ever reached
  • iterations: The actual number of iterations performed
  • separated: Whether the data has been successfully separated

Notes:

  • The trained parameters can then be used to make predictions using the function predict().
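
The fields of the returned tuple are easier to read next to the loop that produces them. The following is a minimal sketch of a textbook binary perceptron that fills in the same field names. It is not BetaML's code: the function name is hypothetical, and averaging the weights over every record visit is an assumption about how avgθ and avgθ₀ could be computed.

```julia
using LinearAlgebra: dot

function perceptron_binary_sketch(x, y; T=1000)
    n, d = size(x)
    θ, θ₀       = zeros(d), 0.0
    sumθ, sumθ₀ = zeros(d), 0.0          # running sums used for the averages
    errors, besterrors, iterations = n, n, 0
    for t in 1:T
        iterations = t
        errors = 0
        for i in 1:n
            if y[i] * (dot(θ, x[i, :]) + θ₀) <= 0   # misclassified, or on the boundary
                θ  = θ .+ y[i] .* x[i, :]
                θ₀ = θ₀ + y[i]
                errors += 1
            end
            sumθ  = sumθ .+ θ
            sumθ₀ = sumθ₀ + θ₀
        end
        besterrors = min(besterrors, errors)
        errors == 0 && break                         # fully classified: stop early
    end
    steps = iterations * n
    return (θ=θ, θ₀=θ₀, avgθ=sumθ ./ steps, avgθ₀=sumθ₀ / steps,
            errors=errors, besterrors=besterrors, iterations=iterations,
            separated=(errors == 0))
end

# Two mirrored points: both are misclassified in the first pass, none in the second.
m = perceptron_binary_sketch([1.0 0.0; -1.0 0.0], [1, -1])
```

On this toy input the weights visited are [1,0], [2,0], [2,0], [2,0], which is why the average differs from the final θ.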

Example:

julia> model = perceptronBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])

BetaML.Perceptron.predict - Function

predict(x,θ,θ₀)

Predict a binary label {-1,1} given the feature vector and the linear coefficients

Parameters:

  • x: Feature matrix of the data to predict (n × d)
  • θ: The trained parameters
  • θ₀: The trained bias parameter [def: 0]

Return :

  • y: Vector of the predicted labels
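
For a linear model, the prediction amounts to the sign of the affine score x·θ + θ₀ of each row. A one-line sketch in plain Julia follows. The name predict_sketch is hypothetical, and how BetaML treats a score of exactly zero is not stated on this page (Julia's sign returns 0 in that case).

```julia
# Sign of the affine score for every row of x (sketch; θ₀ defaults to 0 as documented)
predict_sketch(x, θ, θ₀=0.0) = sign.(x * θ .+ θ₀)

# Same inputs as the example of this docstring: every score is positive
ŷ = predict_sketch([1.1 2.1; 5.3 4.2; 1.8 1.7], [3.2, 1.2])

# Two mirrored points fall on opposite sides of the hyperplane
ŷ2 = predict_sketch([1.0 0.0; -1.0 0.0], [2.0, 0.0])
```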

Example:

julia> predict([1.1 2.1; 5.3 4.2; 1.8 1.7], [3.2,1.2])

BetaML.Perceptron.predict - Method

predict(x,xtrain,ytrain,α;K)

Predict a binary label {-1,1} given the feature vector and the training data together with their errors (as trained by a kernel perceptron algorithm)

Parameters:

  • x: Feature matrix of the data to predict (n × d)
  • xtrain: The feature vectors used for the training
  • ytrain: The labels of the training set
  • α: The errors associated to each record
  • K: The kernel function used for the training and to be used for the prediction [def: radialKernel]

Return :

  • y: Vector of the predicted labels

Example:

julia> model = kernelPerceptronBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷ     = predict([2.1 3.1; 7.3 5.2], model.x, model.y, model.α)

BetaML.Perceptron.predict - Method

predict(x,xtrain,ytrain,α,classes;K)

Predict a multiclass label given the new feature vector and a trained kernel perceptron model.

Parameters:

  • x: Feature matrix of the data to predict (n × d)
  • xtrain: A vector of the feature matrices used for training each of the one-vs-one class matches (i.e. model.x)
  • ytrain: A vector of the label vectors used for training each of the one-vs-one class matches (i.e. model.y)
  • α: A vector of the errors associated to each record (i.e. model.α)
  • classes: The overall classes encountered in training (i.e. model.classes)
  • K: The kernel function used for the training and to be used for the prediction [def: radialKernel]

Return :

  • ŷ: Vector of dictionaries label=>probability (warning: it isn't really a probability, it is just the standardized number of matches "won" by this class compared with the other classes)

Notes:

  • Use mode(ŷ) if you want a single predicted label per record
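
The "matches won" score mentioned in the return value can be sketched as a simple vote count. The snippet below is an illustration only: the match outcomes are made up, and dividing the wins by the total number of matches is an assumed way of standardizing them, which may differ from what BetaML does.

```julia
# Hypothetical outcomes of the one-vs-one matches for a single record:
# each pair of classes maps to the class that won that match.
matches = Dict(("a", "b") => "a", ("a", "c") => "a", ("b", "c") => "c")
classes = ["a", "b", "c"]

# Count the matches won by each class, then divide by the number of matches
# so that the scores sum to one (assumed normalisation).
wins   = Dict(c => count(==(c), values(matches)) for c in classes)
scores = Dict(c => wins[c] / length(matches) for c in classes)
```

Class "a" wins two of the three matches, so it gets the highest score even though no probability model is involved.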

Example:

julia> model  = kernelPerceptron([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷtrain = Perceptron.predict([10 10; 2.2 2.5],model.x,model.y,model.α, model.classes,K=model.K)

BetaML.Perceptron.predict - Method

predict(x,θ,θ₀,classes)

Predict a multiclass label given the feature vector, the linear coefficients and the classes vector

Parameters:

  • x: Feature matrix of the data to predict (n × d)
  • θ: Vector of the trained parameters for each one-vs-all model (i.e. model.θ)
  • θ₀: Vector of the trained bias parameter for each one-vs-all model (i.e. model.θ₀)
  • classes: The overall classes encountered in training (i.e. model.classes)

Return :

  • ŷ: Vector of dictionaries label=>probability

Notes:

  • Use mode(ŷ) if you want a single predicted label per record

Example:

```julia
julia> model  = perceptron([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷtrain = predict([10 10; 2.5 2.5],model.θ,model.θ₀, model.classes)
```