Spectrahedron

This example shows an optimization problem over the spectraplex:

\[S = \{X \in \mathbb{S}_+^n : \operatorname{Tr}(X) = 1\}\]

with $\mathbb{S}_+^n$ the set of symmetric positive semidefinite matrices. Linear optimization with a symmetric objective $D$ over the spectraplex amounts to computing the leading eigenvector of $D$.
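As a minimal sketch of this fact (an illustration, not the package implementation): minimizing $\operatorname{Tr}(DX)$ over the spectraplex is solved by the rank-one matrix $v v^\top$, where $v$ is a unit eigenvector for the smallest eigenvalue of $D$ (equivalently, the leading eigenvector of $-D$):

```julia
using LinearAlgebra

# Minimizing Tr(D * X) over the spectraplex: the optimum is v * v',
# with v a unit eigenvector for the smallest eigenvalue of D.
D = Symmetric(randn(4, 4))
λ, V = eigen(D)      # eigenvalues in ascending order
v = V[:, 1]          # unit eigenvector for the smallest eigenvalue
Xstar = v * v'       # rank-one optimum: Tr(Xstar) == 1
```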

The package also exposes UnitSpectrahedronLMO, which corresponds to the feasible set:

\[S_u = \{X \in \mathbb{S}_+^n : \operatorname{Tr}(X) \leq 1\}\]
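The linear minimization oracle over $S_u$ differs from the spectraplex one in a single case. A sketch (again an illustration, not the package code): when the smallest eigenvalue of $D$ is nonnegative, the zero matrix is optimal; otherwise the optimum is the same rank-one matrix as in the spectraplex case.

```julia
using LinearAlgebra

# Linear minimization over S_u = {X PSD, Tr(X) <= 1}:
# zero matrix if the smallest eigenvalue of D is nonnegative,
# otherwise the rank-one spectraplex solution.
D = Symmetric(randn(4, 4))
λ, V = eigen(D)
Xstar = λ[1] >= 0 ? zeros(4, 4) : V[:, 1] * V[:, 1]'
```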

using FrankWolfe
using LinearAlgebra
using Random
using SparseArrays

The objective function will be the symmetrized squared distance to a set of known or observed entries $Y_{ij}$ of the matrix, averaged over the number of entries:

\[f(X) = \frac{1}{2 |L|} \sum_{(i,j) \in L} \left( (X_{ij} - Y_{ij})^2 + (X_{ji} - Y_{ij})^2 \right)\]

Setting up the input data, objective, and gradient

Dimension, number of iterations and number of known entries:

n = 1500
k = 5000
n_entries = 1000

Random.seed!(41)

const entry_indices = unique!([minmax(rand(1:n, 2)...) for _ in 1:n_entries])
const entry_values = randn(length(entry_indices))

function f(X)
    r = zero(eltype(X))
    for (idx, (i, j)) in enumerate(entry_indices)
        r += 1 / 2 * (X[i, j] - entry_values[idx])^2
        r += 1 / 2 * (X[j, i] - entry_values[idx])^2
    end
    return r / length(entry_values)
end

function grad!(storage, X)
    storage .= 0
    for (idx, (i, j)) in enumerate(entry_indices)
        storage[i, j] += (X[i, j] - entry_values[idx])
        storage[j, i] += (X[j, i] - entry_values[idx])
    end
    return storage ./= length(entry_values)
end
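Gradients of this kind can be sanity-checked against finite differences. Below is a self-contained sketch on a tiny instance; the names n_test, idxs, vals, f_test, and grad_test! are illustrative stand-ins for the example's definitions, not part of the original code.

```julia
using LinearAlgebra

# Tiny instance mirroring the structure of f and grad! above.
n_test = 5
idxs = [(1, 2), (2, 3), (4, 4)]
vals = [0.5, -1.0, 2.0]

f_test(X) = sum(1 / 2 * ((X[i, j] - vals[k])^2 + (X[j, i] - vals[k])^2)
                for (k, (i, j)) in enumerate(idxs)) / length(vals)

function grad_test!(storage, X)
    storage .= 0
    for (k, (i, j)) in enumerate(idxs)
        storage[i, j] += X[i, j] - vals[k]
        storage[j, i] += X[j, i] - vals[k]
    end
    return storage ./= length(vals)
end

X = randn(n_test, n_test)
G = zeros(n_test, n_test)
grad_test!(G, X)

# Compare each partial derivative against a central finite difference.
h = 1e-6
function fd_partial(i, j)
    E = zeros(n_test, n_test)
    E[i, j] = 1.0
    return (f_test(X + h * E) - f_test(X - h * E)) / (2h)
end
err = maximum(abs(G[i, j] - fd_partial(i, j)) for i in 1:n_test, j in 1:n_test)
```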

Note the ensure_symmetry = false argument to SpectraplexLMO: it skips an additional step that symmetrizes the direction passed to the oracle. This step is not necessary when the gradient is a LinearAlgebra.Symmetric (or, more rarely, a LinearAlgebra.Diagonal or LinearAlgebra.UniformScaling). Here the gradient matrix is symmetric by construction, since grad! updates the (i, j) and (j, i) entries together.

const lmo = FrankWolfe.SpectraplexLMO(1.0, n, false)
const x0 = FrankWolfe.compute_extreme_point(lmo, spzeros(n, n))

target_tolerance = 1e-8;

Running standard and lazified Frank-Wolfe

Xfinal, Vfinal, primal, dual_gap, trajectory = FrankWolfe.frank_wolfe(
    f,
    grad!,
    lmo,
    x0,
    max_iteration=k,
    line_search=FrankWolfe.MonotonicStepSize(),
    print_iter=k / 10,
    memory_mode=FrankWolfe.InplaceEmphasis(),
    verbose=true,
    trajectory=true,
    epsilon=target_tolerance,
)

Xfinal, Vfinal, primal, dual_gap, trajectory_lazy = FrankWolfe.lazified_conditional_gradient(
    f,
    grad!,
    lmo,
    x0,
    max_iteration=k,
    line_search=FrankWolfe.MonotonicStepSize(),
    print_iter=k / 10,
    memory_mode=FrankWolfe.InplaceEmphasis(),
    verbose=true,
    trajectory=true,
    epsilon=target_tolerance,
);

Vanilla Frank-Wolfe Algorithm.
MEMORY_MODE: FrankWolfe.InplaceEmphasis() STEPSIZE: MonotonicStepSize EPSILON: 1.0e-8 MAXITERATION: 5000 TYPE: Float64
MOMENTUM: nothing GRADIENTTYPE: Nothing
LMO: FrankWolfe.SpectraplexLMO{Float64, Matrix{Float64}}
[ Info: In memory_mode memory iterates are written back into x0!

-------------------------------------------------------------------------------------------------
  Type     Iteration         Primal           Dual       Dual Gap           Time         It/sec
-------------------------------------------------------------------------------------------------
     I             1   1.011661e+00   1.007083e+00   4.578027e-03   0.000000e+00            Inf
  Last            25   1.007310e+00   1.007310e+00   8.841097e-09   4.436958e+00   5.634491e+00
-------------------------------------------------------------------------------------------------

Lazified Conditional Gradient (Frank-Wolfe + Lazification).
MEMORY_MODE: FrankWolfe.InplaceEmphasis() STEPSIZE: MonotonicStepSize EPSILON: 1.0e-8 MAXITERATION: 5000 lazy_tolerance: 2.0 TYPE: Float64
GRADIENTTYPE: Matrix{Float64} CACHESIZE Inf GREEDYCACHE: false
LMO: FrankWolfe.VectorCacheLMO{FrankWolfe.SpectraplexLMO{Float64, Matrix{Float64}}, FrankWolfe.RankOneMatrix{Float64, Vector{Float64}, Vector{Float64}}}
[ Info: In memory_mode memory iterates are written back into x0!

----------------------------------------------------------------------------------------------------------------
  Type     Iteration         Primal           Dual       Dual Gap           Time         It/sec     Cache Size
----------------------------------------------------------------------------------------------------------------
     I             1   1.011661e+00   1.007083e+00   4.578027e-03   0.000000e+00            Inf              1
    LD             2   1.007313e+00   1.007309e+00   3.640886e-06   2.665229e-01   7.504045e+00              2
    LD             3   1.007311e+00   1.007310e+00   9.829895e-07   4.604273e-01   6.515687e+00              3
    LD             4   1.007310e+00   1.007310e+00   4.829083e-07   6.781315e-01   5.898561e+00              4
    LD             6   1.007310e+00   1.007310e+00   1.919785e-07   1.089287e+00   5.508189e+00              5
    LD             9   1.007310e+00   1.007310e+00   7.989407e-08   1.552019e+00   5.798898e+00              6
    LD            13   1.007310e+00   1.007310e+00   3.686371e-08   2.178834e+00   5.966494e+00              7
    LD            19   1.007310e+00   1.007310e+00   1.681344e-08   3.205609e+00   5.927110e+00              8
    LD            27   1.007310e+00   1.007310e+00   8.190915e-09   4.630769e+00   5.830566e+00              9
  Last            27   1.007310e+00   1.007310e+00   7.605836e-09   5.037828e+00   5.359453e+00             10
----------------------------------------------------------------------------------------------------------------

Plotting the resulting trajectories

data = [trajectory, trajectory_lazy]
label = ["FW", "LCG"]
plot_trajectories(data, label, xscalelog=true)

This page was generated using Literate.jl.