# Algorithms

This section contains the main algorithms of the package, the ones typical users will call.

The typical signature for these algorithms is:

`my_algorithm(f, grad!, lmo, x0)`

## Standard algorithms

`FrankWolfe.frank_wolfe`

— Method

`frank_wolfe(f, grad!, lmo, x0; ...)`

Simplest form of the Frank-Wolfe algorithm. Returns a tuple `(x, v, primal, dual_gap, traj_data)` with:

- `x`: final iterate
- `v`: last vertex from the LMO
- `primal`: primal value `f(x)`
- `dual_gap`: final Frank-Wolfe gap
- `traj_data`: vector of trajectory information
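A minimal usage sketch following the signature above. The oracle constructor `ProbabilitySimplexOracle`, the `compute_extreme_point` helper, and the `max_iteration` keyword are assumed from the package's standard API; the objective is purely illustrative.

```julia
using FrankWolfe
using LinearAlgebra

n = 100
xref = fill(1 / n, n)
f(x) = 0.5 * norm(x - xref)^2                 # smooth convex objective
grad!(storage, x) = (storage .= x .- xref)    # in-place gradient

lmo = FrankWolfe.ProbabilitySimplexOracle(1.0)        # feasible set: unit simplex
x0 = FrankWolfe.compute_extreme_point(lmo, zeros(n))  # start from a vertex

x, v, primal, dual_gap, traj_data = FrankWolfe.frank_wolfe(
    f, grad!, lmo, x0;
    max_iteration=1_000,
)
```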

`FrankWolfe.lazified_conditional_gradient`

— Method

`lazified_conditional_gradient(f, grad!, lmo_base, x0; ...)`

Similar to `FrankWolfe.frank_wolfe`, but lazifying the LMO: each call is stored in a cache, which is looked up first for a good-enough direction. The cache used is a `FrankWolfe.MultiCacheLMO` or a `FrankWolfe.VectorCacheLMO`, depending on whether the provided `cache_size` option is finite.
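A hedged sketch of enabling the cache. The `cache_size` keyword comes from the docstring above; the problem setup and the `max_iteration` keyword are illustrative assumptions.

```julia
using FrankWolfe
using LinearAlgebra

f(x) = 0.5 * norm(x)^2
grad!(storage, x) = (storage .= x)
lmo = FrankWolfe.ProbabilitySimplexOracle(1.0)
x0 = FrankWolfe.compute_extreme_point(lmo, randn(50))

# A finite cache_size selects a MultiCacheLMO; an infinite one a VectorCacheLMO.
result = FrankWolfe.lazified_conditional_gradient(
    f, grad!, lmo, x0;
    cache_size=500,
    max_iteration=1_000,
)
```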

`FrankWolfe.stochastic_frank_wolfe`

— Method

`stochastic_frank_wolfe(f::StochasticObjective, lmo, x0; ...)`

Stochastic version of Frank-Wolfe: the objective and gradient are evaluated stochastically, implemented through the `FrankWolfe.StochasticObjective` interface.

Keyword arguments include `batch_size` to pass a fixed batch size, or a `batch_iterator` implementing `batch_size = FrankWolfe.batchsize_iterate(batch_iterator)` for algorithms like those of "Variance-reduced and projection-free stochastic optimization" (E. Hazan and H. Luo, 2016).

Similarly, a constant `momentum` can be passed, or replaced by a `momentum_iterator` implementing `momentum = FrankWolfe.momentum_iterate(momentum_iterator)`.

`FrankWolfe.block_coordinate_frank_wolfe`

— Function

`block_coordinate_frank_wolfe(f, grad!, lmo::ProductLMO{N}, x0; ...) where {N}`

Block-coordinate version of the Frank-Wolfe algorithm. Minimizes the objective `f` over the product of feasible domains specified by the `lmo`. The optional argument `update_order` is of type `FrankWolfe.BlockCoordinateUpdateOrder` and controls the order in which the blocks are updated. The argument `update_step` is a single instance or a tuple of `FrankWolfe.UpdateStep` and defines which FW algorithms to use to update the iterates in the different blocks.

The method returns a tuple `(x, v, primal, dual_gap, traj_data)` with:

- `x`: cartesian product of final iterates
- `v`: cartesian product of last vertices of the LMOs
- `primal`: primal value `f(x)`
- `dual_gap`: final Frank-Wolfe gap
- `traj_data`: vector of trajectory information

See S. Lacoste-Julien, M. Jaggi, M. Schmidt, and P. Pletscher (2013) and A. Beck, E. Pauwels, and S. Sabach (2015) for more details about block-coordinate Frank-Wolfe.

## Active-set based methods

The following algorithms maintain the representation of the iterates as a convex combination of vertices.

### Away-step

`FrankWolfe.away_frank_wolfe`

— Method

`away_frank_wolfe(f, grad!, lmo, x0; ...)`

Frank-Wolfe with away steps. The algorithm maintains the current iterate as a convex combination of vertices in the `FrankWolfe.ActiveSet` data structure. See M. Besançon, A. Carderera, and S. Pokutta (2021) for illustrations of away steps.

### Pairwise Frank-Wolfe

`FrankWolfe.blended_pairwise_conditional_gradient`

— Method

`blended_pairwise_conditional_gradient(f, grad!, lmo, x0; kwargs...)`

Implements the BPCG algorithm from Tsuji, Tanaka, and Pokutta (2021). The method uses an active set of current vertices. Unlike away-step, it transfers weight from an away vertex to another vertex of the active set.

`FrankWolfe.blended_pairwise_conditional_gradient`

— Method

`blended_pairwise_conditional_gradient(f, grad!, lmo, active_set::AbstractActiveSet; kwargs...)`

Warm-starts BPCG with a pre-defined `active_set`.
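A warm-start sketch. The `ActiveSet` constructor taking `(weight, atom)` pairs is assumed here, as are the oracle constructor and the `max_iteration` keyword; the objective is illustrative.

```julia
using FrankWolfe
using LinearAlgebra

f(x) = 0.5 * norm(x .- 0.1)^2
grad!(storage, x) = (storage .= x .- 0.1)
lmo = FrankWolfe.ProbabilitySimplexOracle(1.0)

# Seed the active set with a single vertex carrying all the weight.
v0 = FrankWolfe.compute_extreme_point(lmo, zeros(10))
active_set = FrankWolfe.ActiveSet([(1.0, v0)])

result = FrankWolfe.blended_pairwise_conditional_gradient(
    f, grad!, lmo, active_set;
    max_iteration=500,
)
```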

`FrankWolfe.pairwise_frank_wolfe`

— Method

`pairwise_frank_wolfe(f, grad!, lmo, x0; ...)`

Frank-Wolfe with pairwise steps. The algorithm maintains the current iterate as a convex combination of vertices in the `FrankWolfe.ActiveSet` data structure. See M. Besançon, A. Carderera, and S. Pokutta (2021) for illustrations of away steps. Unlike away-step, it transfers weight from an away vertex to another vertex.

### Blended Conditional Gradient

`FrankWolfe.accelerated_simplex_gradient_descent_over_probability_simplex`

— Method

`accelerated_simplex_gradient_descent_over_probability_simplex`

Minimizes an objective function over the unit probability simplex until the strong-Wolfe gap is below tolerance, using Nesterov's accelerated gradient descent.

`FrankWolfe.blended_conditional_gradient`

— Method

`blended_conditional_gradient(f, grad!, lmo, x0)`

Entry point for the Blended Conditional Gradient algorithm. See Braun, Gábor, et al., "Blended conditional gradients" (ICML 2019). The method works on an active set like `FrankWolfe.away_frank_wolfe`, performing gradient descent over the convex hull of active vertices, removing vertices when their weight drops to 0, and adding new vertices by calling the linear oracle in a lazy fashion.

`FrankWolfe.build_reduced_problem`

— Method

`build_reduced_problem(atoms::AbstractVector{<:AbstractVector}, hessian, weights, gradient, tolerance)`

Given an active set formed by vectors, a (constant) Hessian, and a gradient, constructs a quadratic problem over the unit probability simplex that is equivalent to minimizing the original function over the convex hull of the active set. If `λ` are the barycentric coordinates, of dimension equal to the cardinality of the active set, the objective function is:

`f(λ) = reduced_linear^T λ + 0.5 * λ^T reduced_hessian λ`

If the current iterate has a strong-Wolfe gap over the convex hull of the active set that is below the tolerance, the method returns `nothing` (as there is nothing to do).

`FrankWolfe.lp_separation_oracle`

— Method

Returns either a tuple `(y, val)` with `y` an atom from the active set satisfying the progress criterion and `val` the corresponding gap `dot(y, direction)`, or the same tuple with `y` from the LMO.

`inplace_loop` controls whether the iterate type allows in-place writes. `kwargs` are passed on to the LMO oracle.

`FrankWolfe.minimize_over_convex_hull!`

— Method

`minimize_over_convex_hull!`

Given a function `f` with gradient `grad!` and an active set `active_set`, minimizes the function over the convex hull of the active set until the strong-Wolfe gap over the active set is below tolerance.

It either minimizes directly over the convex hull using simplex gradient descent, or transforms the problem to barycentric coordinates and minimizes over the unit probability simplex using gradient descent or Nesterov's accelerated gradient descent.

`FrankWolfe.projection_simplex_sort`

— Method

`projection_simplex_sort(x; s=1.0)`

Perform a projection onto the probability simplex of radius `s` using a sorting algorithm.
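The sort-based projection is a classical algorithm (Held, Wolfe, and Crowder 1974; see also Condat 2016). The following self-contained sketch illustrates the idea; it is not the package's implementation, and the function name is made up for this example.

```julia
# Project x onto the simplex {w : w ≥ 0, sum(w) == s} via one sort.
function simplex_projection_sketch(x::AbstractVector{T}; s::T=one(T)) where {T<:Real}
    u = sort(x; rev=true)                 # coordinates in decreasing order
    cssv = cumsum(u) .- s                 # shifted cumulative sums
    # Largest index whose coordinate stays positive after the shift.
    rho = findlast(i -> u[i] > cssv[i] / i, eachindex(u))
    lambda = cssv[rho] / rho              # optimal threshold
    return max.(x .- lambda, zero(T))
end
```

For example, projecting `[2.0, 0.0]` with `s = 1.0` yields `[1.0, 0.0]`: the pivot search keeps only the first coordinate, so the threshold is `1.0`.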

`FrankWolfe.simplex_gradient_descent_over_convex_hull`

— Method

`simplex_gradient_descent_over_convex_hull(f, grad!, gradient, active_set, tolerance, t, time_start, non_simplex_iter)`

Minimizes an objective function over the convex hull of the active set until the strong-Wolfe gap is below tolerance, using simplex gradient descent.

`FrankWolfe.simplex_gradient_descent_over_probability_simplex`

— Method

`simplex_gradient_descent_over_probability_simplex`

Minimizes an objective function over the unit probability simplex until the strong-Wolfe gap is below tolerance, using gradient descent.

`FrankWolfe.strong_frankwolfe_gap`

— Method

Checks the strong Frank-Wolfe gap for the reduced problem.

`FrankWolfe.strong_frankwolfe_gap_probability_simplex`

— Method

`strong_frankwolfe_gap_probability_simplex`

Compute the strong-Wolfe gap over the unit probability simplex given a gradient.


## Alternating Methods

Problems over intersections of convex sets, i.e.,

\[\min_{x \in \bigcap_{i=1}^n P_i} f(x),\]

pose a challenge as one has to combine the information of two or more LMOs.

`FrankWolfe.alternating_linear_minimization` converts the problem into a series of subproblems over single sets. To find a point within the intersection, one minimizes both the distance to the iterates of the other subproblems and the original objective function.

`FrankWolfe.alternating_projections` solves feasibility problems over intersections of feasible regions.

`FrankWolfe.alternating_linear_minimization`

— Method

`alternating_linear_minimization(bc_algo::BlockCoordinateMethod, f, grad!, lmos::NTuple{N,LinearMinimizationOracle}, x0; ...) where {N}`

Alternating Linear Minimization minimizes the objective `f` over the intersection of the feasible domains specified by `lmos`. The tuple `x0` defines the initial points for each domain. Returns a tuple `(x, v, primal, dual_gap, infeas, traj_data)` with:

- `x`: cartesian product of final iterates
- `v`: cartesian product of last vertices of the LMOs
- `primal`: primal value `f(x)`
- `dual_gap`: final Frank-Wolfe gap
- `infeas`: sum of squared, pairwise distances between iterates
- `traj_data`: vector of trajectory information
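A feasibility-flavored sketch following the signature above. The oracle constructors, the tuple form of `x0`, and the `max_iteration` keyword are assumptions in this sketch; with a zero objective, only the built-in distance penalty drives the blocks together.

```julia
using FrankWolfe
using LinearAlgebra

# Zero objective: pure feasibility over the intersection.
f(x) = 0.0
grad!(storage, x) = (storage .= 0)

lmos = (
    FrankWolfe.ProbabilitySimplexOracle(1.0),  # block 1: unit simplex
    FrankWolfe.LpNormLMO{2}(1.0),              # block 2: Euclidean unit ball
)
x0 = (fill(0.2, 5), zeros(5))  # one initial point per domain

x, v, primal, dual_gap, infeas, traj_data = FrankWolfe.alternating_linear_minimization(
    FrankWolfe.block_coordinate_frank_wolfe,
    f, grad!, lmos, x0;
    max_iteration=1_000,
)
```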

`FrankWolfe.alternating_projections`

— Method

`alternating_projections(lmos::NTuple{N,LinearMinimizationOracle}, x0; ...) where {N}`

Computes a point in the intersection of the feasible domains specified by `lmos`. Returns a tuple `(x, v, dual_gap, infeas, traj_data)` with:

- `x`: cartesian product of final iterates
- `v`: cartesian product of last vertices of the LMOs
- `dual_gap`: final Frank-Wolfe gap
- `infeas`: sum of squared, pairwise distances between iterates
- `traj_data`: vector of trajectory information
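A sketch of the feasibility call, with the same hedges as elsewhere on this page: the oracle constructors and the `max_iteration` keyword are assumptions, only the signature above is quoted from the docstring.

```julia
using FrankWolfe

# Find a point in the intersection of the unit simplex and the Euclidean unit ball.
lmos = (
    FrankWolfe.ProbabilitySimplexOracle(1.0),
    FrankWolfe.LpNormLMO{2}(1.0),
)
x0 = rand(5)

x, v, dual_gap, infeas, traj_data = FrankWolfe.alternating_projections(
    lmos, x0;
    max_iteration=500,
)
```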