The BetaML.Perceptron Module

BetaML.Perceptron — Module

Provide linear and kernel classifiers.

See a runnable example on myBinder
- `perceptron`: Train data using the classical perceptron
- `kernelPerceptron`: Train data using the kernel perceptron
- `pegasos`: Train data using the pegasos algorithm
- `predict`: Predict data using parameters from one of the above algorithms

All algorithms are multiclass, with `perceptron` and `pegasos` employing a one-vs-all strategy, while `kernelPerceptron` employs a one-vs-one approach. They all return a "probability" for each class, in the form of a dictionary for each record. Use `mode(ŷ)` to return a single class prediction per record.
The binary equivalents of these algorithms, accepting only {-1,+1} labels, are available as `perceptronBinary`, `kernelPerceptronBinary` and `pegasosBinary`. They are slightly faster, as they don't need to be wrapped in the multi-class machinery, and they return a more informative output.

The multi-class versions are available in the MLJ framework as `PerceptronClassifier`, `KernelPerceptronClassifier` and `PegasosClassifier` respectively.
Module Index

- `BetaML.Perceptron.kernelPerceptron`
- `BetaML.Perceptron.kernelPerceptronBinary`
- `BetaML.Perceptron.pegasos`
- `BetaML.Perceptron.pegasosBinary`
- `BetaML.Perceptron.perceptron`
- `BetaML.Perceptron.perceptronBinary`
- `BetaML.Perceptron.predict` (4 methods)

Detailed API
BetaML.Perceptron.kernelPerceptron — Method

`kernelPerceptron(x,y;K,T,α,nMsgs,shuffle)`

Train a multiclass kernel "perceptron" classifier based on x and y.

`kernelPerceptron` is a (potentially) non-linear, perceptron-style classifier employing user-defined kernel functions. Multiclass is supported using a one-vs-one approach.
Parameters:

- `x`: Feature matrix of the training data (n × d)
- `y`: Associated labels of the training data, in the format of ± 1
- `K`: Kernel function to employ. See `?radialKernel` or `?polynomialKernel` for details, or check `?BetaML.Utils` to verify if other kernels are defined (you can always define your own kernel) [def: `radialKernel`]
- `T`: Maximum number of iterations (aka "epochs") across the whole set (if the set is not fully classified earlier) [def: 100]
- `α`: Initial distribution of the errors [def: `zeros(length(y))`]
- `nMsgs`: Maximum number of messages to show if all iterations are done [def: `0`]
- `shuffle`: Whether to randomly shuffle the data at each iteration [def: `false`]
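The `K` parameter expects a function of two feature vectors returning a scalar similarity. As a plain-Julia sketch of what such a kernel can look like, here is an RBF ("radial") kernel; the name `radial` and the `γ` bandwidth default are my own assumptions for illustration — see `?radialKernel` for the actual BetaML definition.

```julia
# Illustrative RBF kernel: similarity decays with squared Euclidean distance.
# γ (assumed default) controls how quickly the similarity falls off.
radial(x, z; γ=1/length(x)) = exp(-γ * sum((x .- z) .^ 2))

radial([1.0, 2.0], [1.0, 2.0])  # identical points → 1.0
```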
Return a named tuple with:

- `x`: The x data (eventually shuffled if `shuffle=true`)
- `y`: The labels
- `α`: The errors associated to each record
- `classes`: The label classes encountered in the training
Notes:

- The trained model can then be used to make predictions using the function `predict()`.
- This model is available in the MLJ framework as the `KernelPerceptronClassifier`.
Example:

```julia
julia> xtrain = [1.1 2.1; 5.3 4.2; 1.8 1.7]; ytrain = [-1,1,-1];
julia> model  = kernelPerceptron(xtrain, ytrain)
julia> ŷtrain = Perceptron.predict(xtrain, model.x, model.y, model.α, model.classes, K=model.K)
julia> ϵtrain = error(ytrain, mode(ŷtrain))
```
BetaML.Perceptron.kernelPerceptronBinary — Method

`kernelPerceptronBinary(x,y;K,T,α,nMsgs,shuffle)`

Train a binary kernel "perceptron" classifier based on x and y.
Parameters:

- `x`: Feature matrix of the training data (n × d)
- `y`: Associated labels of the training data, in the format of ± 1
- `K`: Kernel function to employ. See `?radialKernel` or `?polynomialKernel` for details, or check `?BetaML.Utils` to verify if other kernels are defined (you can always define your own kernel) [def: `radialKernel`]
- `T`: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
- `α`: Initial distribution of the errors [def: `zeros(length(y))`]
- `nMsgs`: Maximum number of messages to show if all iterations are done
- `shuffle`: Whether to randomly shuffle the data at each iteration [def: `false`]
Return a named tuple with:

- `x`: the x data (eventually shuffled if `shuffle=true`)
- `y`: the labels
- `α`: the errors associated to each record
- `errors`: the number of errors in the last iteration
- `besterrors`: the minimum number of errors in classifying the data ever reached
- `iterations`: the actual number of iterations performed
- `separated`: a flag indicating whether the data has been successfully separated
Notes:

- The trained model can then be used to make predictions using the function `predict()`. If the option `shuffle` has been used, it is important to use there the returned (x,y,α), as these have been shuffled compared with the original (x,y).
- Please see `kernelPerceptron` for a multi-class version.
Example:

```julia
julia> model = kernelPerceptronBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
```
BetaML.Perceptron.pegasos — Method

`pegasos(x,y;θ,θ₀,λ,η,T,nMsgs,shuffle,forceOrigin,returnMeanHyperplane)`

Train the multiclass "pegasos" classifier according to x (features) and y (labels).

Pegasos is a linear, gradient-based classifier. Multiclass is supported using a one-vs-all approach.
Parameters:

- `x`: Feature matrix of the training data (n × d)
- `y`: Associated labels of the training data, can be in any format (string, integers...)
- `θ`: Initial value of the weights (parameter) [def: `zeros(d)`]
- `θ₀`: Initial value of the weight (parameter) associated to the constant term [def: `0`]
- `λ`: Multiplicative term of the learning rate
- `η`: Learning rate [def: `(t -> 1/sqrt(t))`]
- `T`: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
- `nMsgs`: Maximum number of messages to show if all iterations are done
- `shuffle`: Whether to randomly shuffle the data at each iteration [def: `false`]
- `forceOrigin`: Whether to force `θ₀` to remain zero [def: `false`]
- `returnMeanHyperplane`: Whether to return the average hyperplane coefficients instead of the final ones [def: `false`]
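To show how `λ`, `η`, and `forceOrigin` interact, here is a sketch of a single pegasos-style update step. The function name and the exact margin test are my own illustration of the general pegasos scheme (shrink by the regularizer, then step toward misclassified records); BetaML's internal implementation may differ in such details.

```julia
# One pegasos-style update at iteration t:
#   - always shrink θ by the regularization factor (1 - η(t)·λ)
#   - on a misclassified record, also step toward y·x
function pegasos_step(θ, θ₀, x, y, λ, η, t; forceOrigin=false)
    ηₜ = η(t)                                     # step size at iteration t
    if y * (θ' * x + θ₀) <= 0                     # misclassified record
        θ  = (1 - ηₜ * λ) .* θ .+ ηₜ * y .* x
        θ₀ = forceOrigin ? θ₀ : θ₀ + ηₜ * y       # bias frozen if forceOrigin
    else
        θ  = (1 - ηₜ * λ) .* θ                    # regularization shrinkage only
    end
    return θ, θ₀
end

θ, θ₀ = pegasos_step(zeros(2), 0.0, [1.0, 2.0], 1, 0.5, t -> 1/sqrt(t), 1)
# θ == [1.0, 2.0], θ₀ == 1.0
```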
Return a named tuple with:

- `θ`: The weights of the classifier
- `θ₀`: The weight of the classifier associated to the constant term
- `classes`: The classes (unique values) of y
Notes:

- The trained parameters can then be used to make predictions using the function `predict()`.
- This model is available in the MLJ framework as the `PegasosClassifier`.
Example:

```julia
julia> model = pegasos([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷ = predict([2.1 3.1; 7.3 5.2], model.θ, model.θ₀, model.classes)
```
BetaML.Perceptron.pegasosBinary — Method

`pegasosBinary(x,y;θ,θ₀,λ,η,T,nMsgs,shuffle,forceOrigin)`

Train the binary pegasos algorithm based on x and y (labels).
Parameters:

- `x`: Feature matrix of the training data (n × d)
- `y`: Associated labels of the training data, in the format of ± 1
- `θ`: Initial value of the weights (parameter) [def: `zeros(d)`]
- `θ₀`: Initial value of the weight (parameter) associated to the constant term [def: `0`]
- `λ`: Multiplicative term of the learning rate
- `η`: Learning rate [def: `(t -> 1/sqrt(t))`]
- `T`: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
- `nMsgs`: Maximum number of messages to show if all iterations are done
- `shuffle`: Whether to randomly shuffle the data at each iteration [def: `false`]
- `forceOrigin`: Whether to force `θ₀` to remain zero [def: `false`]
Return a named tuple with:

- `θ`: The final weights of the classifier
- `θ₀`: The final weight of the classifier associated to the constant term
- `avgθ`: The average weights of the classifier
- `avgθ₀`: The average weight of the classifier associated to the constant term
- `errors`: The number of errors in the last iteration
- `besterrors`: The minimum number of errors in classifying the data ever reached
- `iterations`: The actual number of iterations performed
- `separated`: Whether the data has been successfully separated
Notes:

- The trained parameters can then be used to make predictions using the function `predict()`.
Example:

```julia
julia> model = pegasosBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
```
BetaML.Perceptron.perceptron — Method

`perceptron(x,y;θ,θ₀,T,nMsgs,shuffle,forceOrigin,returnMeanHyperplane)`

Train the multiclass "perceptron" classifier based on x and y (labels).

The perceptron is a linear classifier. Multiclass is supported using a one-vs-all approach.
Parameters:

- `x`: Feature matrix of the training data (n × d)
- `y`: Associated labels of the training data, can be in any format (string, integers...)
- `θ`: Initial value of the weights (parameter) [def: `zeros(d)`]
- `θ₀`: Initial value of the weight (parameter) associated to the constant term [def: `0`]
- `T`: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
- `nMsgs`: Maximum number of messages to show if all iterations are done [def: `0`]
- `shuffle`: Whether to randomly shuffle the data at each iteration [def: `false`]
- `forceOrigin`: Whether to force `θ₀` to remain zero [def: `false`]
- `returnMeanHyperplane`: Whether to return the average hyperplane coefficients instead of the final ones [def: `false`]
Return a named tuple with:

- `θ`: The weights of the classifier
- `θ₀`: The weight of the classifier associated to the constant term
- `classes`: The classes (unique values) of y
Notes:

- The trained parameters can then be used to make predictions using the function `predict()`.
- This model is available in the MLJ framework as the `PerceptronClassifier`.
Example:

```julia
julia> model = perceptron([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷ = predict([2.1 3.1; 7.3 5.2], model.θ, model.θ₀, model.classes)
```
BetaML.Perceptron.perceptronBinary — Method

`perceptronBinary(x,y;θ,θ₀,T,nMsgs,shuffle,forceOrigin)`

Train the binary "perceptron" classifier based on x and y (labels).
Parameters:

- `x`: Feature matrix of the training data (n × d)
- `y`: Associated labels of the training data, in the format of ± 1
- `θ`: Initial value of the weights (parameter) [def: `zeros(d)`]
- `θ₀`: Initial value of the weight (parameter) associated to the constant term [def: `0`]
- `T`: Maximum number of iterations across the whole set (if the set is not fully classified earlier) [def: 1000]
- `nMsgs`: Maximum number of messages to show if all iterations are done
- `shuffle`: Whether to randomly shuffle the data at each iteration [def: `false`]
- `forceOrigin`: Whether to force `θ₀` to remain zero [def: `false`]
Return a named tuple with:

- `θ`: The final weights of the classifier
- `θ₀`: The final weight of the classifier associated to the constant term
- `avgθ`: The average weights of the classifier
- `avgθ₀`: The average weight of the classifier associated to the constant term
- `errors`: The number of errors in the last iteration
- `besterrors`: The minimum number of errors in classifying the data ever reached
- `iterations`: The actual number of iterations performed
- `separated`: Whether the data has been successfully separated
Notes:

- The trained parameters can then be used to make predictions using the function `predict()`.
Example:

```julia
julia> model = perceptronBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
```
BetaML.Perceptron.predict — Function

`predict(x,θ,θ₀)`

Predict a binary label {-1,1} given the feature vector and the linear coefficients.
Parameters:

- `x`: Feature matrix of the data to predict (n × d)
- `θ`: The trained parameters
- `θ₀`: The trained bias parameter [def: `0`]
Return:

- `y`: Vector of the predicted labels

Example:

```julia
julia> predict([1.1 2.1; 5.3 4.2; 1.8 1.7], [3.2,1.2])
```
BetaML.Perceptron.predict — Method

`predict(x,xtrain,ytrain,α;K)`

Predict a binary label {-1,1} given the feature vector and the training data together with their errors (as trained by a kernel perceptron algorithm).
Parameters:

- `x`: Feature matrix of the data to predict (n × d)
- `xtrain`: The feature vectors used for the training
- `ytrain`: The labels of the training set
- `α`: The errors associated to each record
- `K`: The kernel function used for the training and to be used for the prediction [def: `radialKernel`]
Return:

- `y`: Vector of the predicted labels

Example:

```julia
julia> model = kernelPerceptronBinary([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷ = predict([2.1 3.1; 7.3 5.2], model.x, model.y, model.α)
```
BetaML.Perceptron.predict — Method

`predict(x,xtrain,ytrain,α,classes;K)`

Predict a multiclass label given the new feature vector and a trained kernel perceptron model.
Parameters:

- `x`: Feature matrix of the data to predict (n × d)
- `xtrain`: A vector of the feature matrices used for training each of the one-vs-one class matches (i.e. `model.x`)
- `ytrain`: A vector of the label vectors used for training each of the one-vs-one class matches (i.e. `model.y`)
- `α`: A vector of the errors associated to each record (i.e. `model.α`)
- `classes`: The overall classes encountered in training (i.e. `model.classes`)
- `K`: The kernel function used for the training and to be used for the prediction [def: `radialKernel`]
Return:

- `ŷ`: Vector of dictionaries `label=>probability` (warning: it isn't really a probability, it is just the standardized number of matches "won" by this class compared with the other classes)

Notes:

- Use `mode(ŷ)` if you want a single predicted label per record.
Example:

```julia
julia> model = kernelPerceptron([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷ = Perceptron.predict([10 10; 2.2 2.5], model.x, model.y, model.α, model.classes, K=model.K)
```
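The `mode(ŷ)` reduction mentioned in the notes (provided by `BetaML.Utils`) picks, for each record, the label with the highest score. A plain-Julia sketch of this behaviour, under an assumed helper name, might look like:

```julia
# Reduce each label=>score dictionary to the single best-scoring label.
# `argmax` over a Dict returns the key of the maximum value (Julia ≥ 1.7).
mode_sketch(ŷ::Vector{<:Dict}) = [argmax(d) for d in ŷ]

mode_sketch([Dict("a"=>0.7, "b"=>0.3), Dict("a"=>0.2, "b"=>0.8)])  # → ["a", "b"]
```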
BetaML.Perceptron.predict — Method

`predict(x,θ,θ₀,classes)`

Predict a multiclass label given the feature vector, the linear coefficients and the classes vector.
Parameters:

- `x`: Feature matrix of the data to predict (n × d)
- `θ`: Vector of the trained parameters for each one-vs-all model (i.e. `model.θ`)
- `θ₀`: Vector of the trained bias parameters for each one-vs-all model (i.e. `model.θ₀`)
- `classes`: The overall classes encountered in training (i.e. `model.classes`)
Return:

- `ŷ`: Vector of dictionaries `label=>probability`

Notes:

- Use `mode(ŷ)` if you want a single predicted label per record.

Example:

```julia
julia> model = perceptron([1.1 2.1; 5.3 4.2; 1.8 1.7], [-1,1,-1])
julia> ŷtrain = predict([10 10; 2.5 2.5], model.θ, model.θ₀, model.classes)
```