CW(model, x, t; dist = euclidean, c = 0.1)

Carlini & Wagner's (CW) method for generating adversarial examples by optimising a loss function against a target class. Here we consider the f6 variant of the loss function.


  • model: The model to attack.
  • x: The original input data.
  • t: Index label corresponding to the target class.
  • dist: The distance measure to use, e.g. L0, L2 or L∞. Assumed to come from the Distances.jl library or to be some other callable function.
  • c: Weight of the misclassification term in the error function.
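
As a rough illustration of how the distance term and the misclassification term might be combined, the sketch below assumes the f6 formulation from the Carlini & Wagner paper, i.e. the positive part of (largest non-target logit minus target logit). The names `f6_loss` and `cw_objective` and the toy `logits` function are hypothetical and are not part of this package; the optimisation over `x_adv` itself is not shown.

```julia
using LinearAlgebra  # for norm

# Hypothetical helper: f6-style term on the logits z for target class t.
# It is zero once the target logit is the largest one.
function f6_loss(z::AbstractVector, t::Integer)
    best_other = maximum(z[i] for i in eachindex(z) if i != t)
    return max(best_other - z[t], zero(eltype(z)))
end

# Hypothetical objective: distance to the original input plus c times the
# misclassification term. `logits` stands in for a model without softmax.
cw_objective(logits, x, x_adv, t; dist = (a, b) -> norm(a - b), c = 0.1) =
    dist(x, x_adv) + c * f6_loss(logits(x_adv), t)

# Toy linear "model" with three classes and two input features.
W = [1.0 0.0; 0.0 1.0; -1.0 -1.0]
logits(x) = W * x

x = [2.0, 0.5]       # largest logit is class 1
x_adv = [0.5, 2.0]   # largest logit is class 2

cw_objective(logits, x, x, 2)      # distance is 0, but the f6 term is positive
cw_objective(logits, x, x_adv, 2)  # f6 term is 0, only the distance remains
```

A larger `c` puts more weight on reaching the target class and less on staying close to `x`.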
DeepFool(model, x, overshoot = 0.02, max_iter = 50)

Moosavi-Dezfooli et al.'s DeepFool method.

An algorithm to determine the minimum perturbation needed to change the class assignment of an image. This makes it useful for computing a robustness metric for classifiers, whereas other algorithms (such as FGSM) may return sub-optimal solutions when generating an adversarial example.

The algorithm operates greedily, so it is not guaranteed to converge to the smallest possible perturbation that results in an adversarial example. Despite this shortcoming, it can often yield a close approximation.

The Python/MATLAB implementations mentioned in the paper can be found at:


  • model: The Flux model to attack, taken before the softmax function.
  • x: An array of input images to create adversarial examples for (size in WHC order).
  • overshoot: The halting criterion used to prevent vanishing gradients.
  • max_iter: The maximum iterations for the algorithm.
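
To make the idea of a minimum perturbation concrete, the sketch below covers only the simplest case from the DeepFool paper: a binary affine classifier, where the smallest L2 perturbation has a closed form. The general algorithm repeats a step of this kind on a local linearisation of the model for up to `max_iter` iterations. The helper name `minimal_perturbation` is hypothetical, and scaling the result by `(1 + overshoot)` follows the paper; whether this package applies `overshoot` in exactly this way is an assumption.

```julia
using LinearAlgebra  # for dot

# Hypothetical helper: smallest L2 perturbation that moves x onto the
# decision boundary of the affine classifier f(x) = dot(w, x) + b.
minimal_perturbation(w, b, x) = -(dot(w, x) + b) / dot(w, w) .* w

w, b = [3.0, 4.0], -5.0
x = [3.0, 4.0]                      # f(x) = 20, so the positive class

r = minimal_perturbation(w, b, x)   # projection step onto the boundary
overshoot = 0.02
x_adv = x .+ (1 + overshoot) .* r   # scaled slightly to cross the boundary

dot(w, x_adv) + b                   # slightly negative: the class has flipped
```

Without the small extra scaling, `x .+ r` would sit exactly on the boundary rather than on the other side of it.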
FGSM(model, loss, x, y; ϵ = 0.1, clamp_range = (0, 1))

Fast Gradient Sign Method (FGSM) creates adversarial examples by pushing the input in the direction of the sign of the loss gradient, with the size of the perturbation bounded by the ϵ parameter.

This method was proposed by Goodfellow et al. (2014).


  • model: The model to base the attack upon.
  • loss: The loss function to use. This assumes that the loss function includes the predict function, i.e. loss(x, y) = crossentropy(model(x), y).
  • x: The input to be perturbed by the FGSM algorithm.
  • y: The 'true' label of the input.
  • ϵ: The amount of perturbation to apply.
  • clamp_range: Tuple consisting of the lower and upper values to clamp the input.
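
The update described above can be sketched without any deep learning library. The example below is a minimal illustration, not this package's implementation: it uses a toy logistic-regression "model" whose input gradient is derived by hand, whereas a real attack would presumably obtain the gradient of `loss` by automatic differentiation. The names `predict`, `input_gradient` and `fgsm_step` are hypothetical.

```julia
using LinearAlgebra  # for dot

sigmoid(z) = 1 / (1 + exp(-z))

# Toy stand-in for a model: logistic regression with fixed weights.
w, b = [2.0, -3.0, 1.0], 0.5
predict(x) = sigmoid(dot(w, x) + b)

# Binary cross-entropy and its gradient w.r.t. the input x (derived by hand).
loss(x, y) = -(y * log(predict(x)) + (1 - y) * log(1 - predict(x)))
input_gradient(x, y) = (predict(x) - y) .* w

# Hypothetical helper: one signed-gradient step, then clamp to the valid range.
function fgsm_step(x, y; ϵ = 0.1, clamp_range = (0, 1))
    x_adv = x .+ ϵ .* sign.(input_gradient(x, y))
    return clamp.(x_adv, clamp_range...)
end

x, y = [0.9, 0.2, 0.5], 1.0
x_adv = fgsm_step(x, y)

loss(x_adv, y) > loss(x, y)   # the perturbation increases the loss
```

Because only the sign of the gradient is used, every feature moves by at most ϵ, which is what bounds the perturbation in the l∞ sense.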
JSMA(model, x, t; Υ, θ)

Jacobian Saliency Map Algorithm (JSMA) crafts adversarial examples by modifying a very small number of pixels. These pixels are selected via the Jacobian matrix of the network's output w.r.t. its input.


  • model: The model to create adversarial examples for.
  • x: The original input data.
  • t: Index corresponding to the target class (this is a targeted attack).
  • Υ: The maximum amount of distortion.
  • θ: The amount by which each feature is perturbed.
PGD(model, loss, x, y; ϵ = 10, step_size = 0.1, iters = 100, clamp_range = (0, 1))

Projected Gradient Descent (PGD) is an iterative variant of FGSM that starts from a random point. At every step, an FGSM update moves the input in the direction of the gradient, and the result is kept within an l∞ bound around the original input.


  • model: The model to base the attack upon.
  • loss: The loss function to use, assuming that it includes the prediction function, i.e. loss(x, y) = crossentropy(model(x), y).
  • x: The input to be perturbed.
  • y: The ground truth for x.
  • ϵ: The bound around x.
  • step_size: The ϵ value in the FGSM step.
  • iters: The maximum number of iterations to run the algorithm for.
  • clamp_range: The lower and upper values to clamp the input to.
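
The loop described above (random start, signed-gradient step, projection back into the ϵ-ball) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than this package's code: the toy logistic-regression model and its hand-derived gradient stand in for a real model and automatic differentiation, the name `pgd_sketch` and the `rng` keyword are hypothetical, and the default values are chosen for the toy data rather than taken from the signature above.

```julia
using LinearAlgebra  # for dot
using Random         # for a reproducible random start

sigmoid(z) = 1 / (1 + exp(-z))

# Toy logistic-regression stand-in for a model, with a hand-derived gradient
# of the binary cross-entropy loss w.r.t. the input.
w, b = [2.0, -3.0, 1.0], 0.5
predict(x) = sigmoid(dot(w, x) + b)
input_gradient(x, y) = (predict(x) - y) .* w

# Hypothetical helper: repeated signed-gradient steps, each followed by a
# projection back into the l∞ ball of radius ϵ around x and the clamp range.
function pgd_sketch(x, y; ϵ = 0.2, step_size = 0.05, iters = 20,
                    clamp_range = (0, 1), rng = Random.default_rng())
    # Random starting point inside the ϵ-ball.
    x_adv = x .+ ϵ .* (2 .* rand(rng, length(x)) .- 1)
    for _ in 1:iters
        x_adv = x_adv .+ step_size .* sign.(input_gradient(x_adv, y))
        x_adv = clamp.(x_adv, x .- ϵ, x .+ ϵ)    # project onto the ϵ-ball
        x_adv = clamp.(x_adv, clamp_range...)     # keep the input valid
    end
    return x_adv
end

x, y = [0.9, 0.2, 0.5], 1.0
x_adv = pgd_sketch(x, y; rng = MersenneTwister(1))

maximum(abs.(x_adv .- x))   # stays within ϵ, up to rounding
```

The elementwise `clamp` against `x .- ϵ` and `x .+ ϵ` is what makes this a projection in the l∞ norm; a different norm would need a different projection step.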



  • model: The Flux model to attack.
  • x: Original input data to perturb.
  • y: The ground truth index for x.
  • ϵ: The maximum amount of perturbation per dimension.
saliency_map(j, Γ, t)

Determine the optimal pixels to change based on their saliency, computed from the Jacobian. This method is used as part of the JSMA algorithm. It returns the Cartesian index of the best pixels to modify.


  • j: The Jacobian matrix of outputs w.r.t. inputs.
  • Γ: The matrix of pixels where 0 denotes that the pixel has yet to be modified, i.e. the search space.
  • t: Target class index.
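
For intuition, a simplified saliency score can be sketched as below. It assumes the single-feature rule from the JSMA paper: a feature scores above zero only if increasing it raises the target output and lowers the sum of the other outputs. The function name `saliency_sketch`, the flattened vector form of `Γ`, and the plain integer index it returns are assumptions made for illustration; they are not this package's `saliency_map`, which returns a Cartesian index.

```julia
# Hypothetical single-feature variant of the saliency score from the JSMA
# paper. j[c, i] is the derivative of output c w.r.t. input feature i, and
# Γ[i] == 0 marks features that are still in the search space.
function saliency_sketch(j::AbstractMatrix, Γ::AbstractVector, t::Integer)
    scores = zeros(size(j, 2))
    for i in axes(j, 2)
        Γ[i] == 0 || continue                 # skip features already modified
        target = j[t, i]
        others = sum(j[:, i]) - target        # summed effect on other classes
        # Only features that raise the target and lower the rest score > 0.
        if target > 0 && others < 0
            scores[i] = target * abs(others)
        end
    end
    # Note: if no feature qualifies, all scores are 0 and index 1 is returned.
    return argmax(scores)
end

# Rows are classes (row 1 is the target), columns are input features.
j = [ 0.5  0.9 -0.2  0.8;
     -0.1 -0.6  0.3 -0.7;
      0.2 -0.2  0.1 -0.4]
Γ = [0, 0, 0, 1]             # feature 4 has already been modified

saliency_sketch(j, Γ, 1)     # index of the most salient remaining feature
```

Feature 4 would score highest here, but it is excluded by `Γ`, so the search falls back to the best feature still in the search space.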