Tutorial

In this tutorial you will learn

  • to build a Problem, which includes
    • creating the objective
    • creating a constraint
    • combining them into a Problem
  • to solve a Problem
  • to find an exact optimal solution

For explanation on more intricate parts of the interface, see Advanced modeling; for additional examples see the Examples section.

Running example

We will explain the interface using an example from polynomial optimization. Suppose we want to find the global minimum (or a lower bound on this) of the polynomial

\[f(u,v,t,w) = -ut^3+4vt^2w + 4 utw^2 + 2vw^3 + 4 ut+ 4t^2 -10 vw - 10 w^2 +2\]

in the domain $[-1/2, 1/2]^4$. We can relax the problem using a sum-of-squares characterization:

\[\begin{aligned} \text{maximize} \quad & M & \\ \text{subject to} \quad & f-M & = s_0 + \sum_i w_i s_i. \\ \end{aligned}\]

where the $w_i$ are polynomial weights describing the domain and $s_i$ are sum-of-squares polynomials. In this example we use

\[\begin{aligned} [-1/2, 1/2]^4 = \{(u,v,t,w): p(u), p(v), p(t), p(w) \geq 0\} \end{aligned}\]

with $p(x) = 1/4 - x^2$. Given a vector $m$ whose entries form a basis of the space of polynomials up to degree $d$, we can parametrize a sum-of-squares polynomial $s$ of degree $2d$ by a positive semidefinite matrix $Y$ with

\[ s = ⟨ Y, mm^{\sf T} ⟩.\]

The problem parameters are now

  • the polynomial $f$
  • the weight polynomials $w_i$
  • the degree of the relaxation $d$

We will start with building a function that takes these parameters, and constructs the Problem. Since the user will give the polynomial f and the weights, we need to extract the polynomial ring and the polynomial variables.

using ClusteredLowRankSolver, Nemo
function min_f(f, ws, d; basis_change=true)
    # extract the polynomial ring
    R = parent(f)
    x = gens(R)
    n = nvars(R)

Defining the objective

First we will define the Objective. This consists of a constant offset, a dictionary with the coefficient matrices for the positive semidefinite matrix variables appearing in the objective, and a dictionary with the scalar coefficients for the free variables used in the objective. In this example,

  • the constant offset is 0,
  • the dictionary of matrix coefficients is empty because we do not use the matrix variables in the objective,
  • and the dictionary of scalar coefficients has one entry corresponding to $M$

This gives the first part of our function:

    # Define the objective
    obj = Objective(0, Dict(), Dict(:M => 1))
ClusteredLowRankSolver.ObjectiveType
Objective(constant, matrixcoeff::Dict, freecoeff::Dict)

The objective for the Problem.

Arguments:

  • constant: A constant offset of the objective value.
  • matrixcoeff: A Dict with positive semidefinite matrix variable names as keys and the objective matrices as values.
  • freecoeff: A Dict with free variable names as keys and the coefficients as values.

Defining the constraints

Now we will define the constraints. In ClusteredLowRankSolver a Constraint is of the form

\[ \sum_i \langle A_i, Y_i \rangle + \sum_j b_j y_j = c. \]

To define it, we need the right-hand side $c$, the matrix coefficients $A_i$ for the positive semidefinite matrix variables, and the coefficients $b_j$ for the free variables. Here the scalars $c$ and $b_j$ and the entries of $A_i$ can either be constants or polynomials. When these are polynomials, we also need a unisolvent set of samples. In this example,

  • the right-hand side $c$ is the polynomial $f$,
  • the matrix coefficients $A_i$ are the weight polynomials times a suitable low rank matrix of the form $mm^{\sf T}$,
  • and we have one free variable $M$, with coefficient $1$.

Defining the polynomial basis and the samples

We will first define the vector $m$ of basis polynomials, and the samples. For simplicity we will use the monomial basis, and the samples defined by the rational points in the simplex with denominator $2d$.

    basis = basis_monomial(2d, x...)
    samples = sample_points_simplex(n, 2d; T=Rational{BigInt})

See Sampling for more explanation on sampling and unisolvence. In more complicated situations, we can improve the basis with approximatefekete. This orthogonalizes the basis with respect to the sample points, and if samples contains more samples than needed, this selects a good subset of the samples. We include this in the function using a keyword argument basis_change=true and

    if basis_change
        basis, samples = approximatefekete(basis, samples)
    end

This returns a basis of SampledMPolyRingElem's, which we can only evaluate at samples from samples. Common operations such as multiplications and additions work with these sampled polynomials, but operations such as extracting the degree is not possible since that requires expressing the polynomials in a graded basis. However, if the initial basis is ordered on degree, the final basis will have the same ordering, so we store the degrees using the code

    degrees = total_degree.(basis_monomial(2d, x...))

Defining the coefficients for the constraint

Now we are ready to define the low-rank matrix coefficients for the sum of squares polynomials. For each weight polynomial, we need to add the matrix coefficient corresponding to that weight to the dictionary of matrix coefficients. Since we want polynomials of degree 2d, we first need to select the part of the basis that we use in the sum-of-squares polynomials.

    psd_dict = Dict()
    for (i, w) in enumerate(ws)
        basispart = [basis[j] for j in eachindex(basis) if 2degrees[j] + total_degree(w) <= 2d]
        psd_dict[(:sos, i)] = LowRankMatPol([w], [basispart])
    end

The size of the matrices is implicitely given by the size of the LowRankMatPol, which is defined by the prefactors and the rank one terms.

Similarly, we can create the dictionary with the free variables using

    free_dict = Dict(:M => 1)

Defining the constraint

Now we can construct the Constraint with

    con = Constraint(f, psd_dict, free_dict, samples)

From the constraint, the dictionaries and coefficients can be retrieved using matrixcoeff(s) and freecoeffs.

Specifying non-polynomial constraints works similarly, in which case no samples should be supplied to the Constraint struct.

ClusteredLowRankSolver.ConstraintType
Constraint(constant, matrixcoeff, freecoeff[, samples, scalings])

Models a polynomial constaint of the form

\[ ∑_i ⟨ A_i(x), Y_i ⟩ + ∑_j b_j(x) y_j = c(x)\]

using sampling on samples. Here the samples are only required if $A_i$, $b_j$, and $c$ are polynomials. When the coefficient matrix $A_i$ has block structure with equal sized blocks, the Block struct can be used as key to indicate to which subblock the given matrix corresponds.

Arguments:

  • constant: The right hand side $c(x)$
  • matrixcoeff::Dict: The coefficient matrices for the positive semidefinite matrix variables.
  • freecoeff::Dict: The coefficients for the free variables.
  • samples::Vector: The sample set on which the constraint should be satisfied. Required for polynomial constraints.
  • scalings::Vector: Optional; scale the constraint with a factor depending on the sample index.
ClusteredLowRankSolver.LowRankMatPolType
LowRankMatPol(lambda::Vector, vs::Vector{Vector}[, ws::Vector{Vector}])

The matrix $∑_i λ_i v_i w_i^{\sf T}$ where $v_i$ are the entries of vs and $w_i$ the entries of ws

If ws is not specified, use ws = vs.

ClusteredLowRankSolver.approximatefeketeFunction
approximatefekete(basis, samples; base_ring=BigFloat) -> basis, samples

Compute approximate fekete points based on samples and a corresponding orthogonal basis. The basis consists of sampled polynomials, sampled at samples.

This preserves a degree ordering of basis if present.

Defining the problem

Now that we have the objective and the constraint, we can create the Problem with

    problem = Problem(Maximize(obj), [con])

Here the first argument gives the objective and the optimization sense (maximization or minimization). The second argument is a vector of the constraints defining the feasible region (in this example, only the constraint con). The objective and constraints can be retrieved from the problem using objective and constraints. Instead of first constructing all constraints and then defining the Problem, it is also possible to first define the problem using the objective, and then add the constraints with the function addconstraint!

ClusteredLowRankSolver.ProblemType
    Problem(maximize::Bool, objective::Objective, constraints::Vector{Constraint})
    Problem(obj::Union{Maximize, Minimize}, constraints::Vector)

Combine the objective and constraints into a low-rank polynomial problem.

Checking for obvious mistakes

Some basic checks on the problem and/or semidefinite program can be done using check_problem and check_sdp!. The check_problem function checks that

  • the sizes of the vectors in the low-rank constraint matrices are the same,
  • all constraints use at least one positive semidefinite matrix variable,
  • all variables in the objective are actually used in the constraints.

The check_sdp! function

  • checks that the constraint matrices are symmetric,
  • removes empty matrices and zero matrices.
    @assert check_problem(problem)
ClusteredLowRankSolver.check_sdp!Function
check_sdp!(sdp::ClusteredLowRankSDP)

Check whether the constraint matrices are symmetric, and remove empty constraint matrices.

Solving the problem

We can solve the Problem with the function solvesdp:

    status, primalsol, dualsol, time, errorcode = solvesdp(problem; prec=512, duality_gap_threshold=1e-60)

This function has multiple options including for example the number of bits used for solving the semidefinite program (prec) and how close the solution should be to optimality (duality_gap_threshold); see the page about the solver for more information.

Alternatively, it is possible to explicitely convert the problem into a semidefinite program using

    sdp = ClusteredLowRankSDP(problem)

which can also be solved with

    status, sol, time, errorcode = solvesdp(sdp)

This is for example useful when you wish to save the semidefinite program (e.g., using Serialization), or when you want to perform the extra checks provided by check_sdp!.

ClusteredLowRankSolver.solvesdpFunction
	solvesdp(problem::Problem; kwargs...)
    solvesdp(sdp::ClusteredLowRankSDP; kwargs...)

Solve the semidefinite program generated from problem or sdp.

Keyword arguments:

  • prec (default: precision(BigFloat)): the precision used
  • gamma (default: 0.9): the step length reduction; a maximum step length of α reduces to a step length of max(gamma*α,1)
  • beta_(in)feasible (default: 0.1 (0.3)): the amount mu is tried to be reduced by in each iteration, for (in)feasible solutions
  • omega_p/d (default: 10^10): the starting matrix variable for the primal/dual is omega_p/d*I
  • maxiterations (default: 500): the maximum number of iterations
  • duality_gap_threshold (default: 10^-15): how near to optimal the solution needs to be
  • primal/dual_error_threshold (default:10^-30): how feasible the primal/dual solution needs to be
  • max_complementary_gap (default: 10^100): the maximum of dot(X,Y)/nrows(X) allowed
  • need_primal_feasible/need_dual_feasible (default: false): terminate when the solution is primal/dual feasible
  • verbose (default: true): print information after every iteration if true
  • step_length_threshold (default: 10^-7): the minimum step length allowed
  • primalsol (default: nothing): start from the solution (primalsol, dualsol) if both primalsol and dualsol are given
  • dualsol (default: nothing): start from the solution (primalsol, dualsol) if both primalsol and dualsol are given
ClusteredLowRankSolver.ClusteredLowRankSDPType
ClusteredLowRankSDP(problem::Problem; prec=precision(BigFloat), verbose=false)

Define a ClusteredLowRankSDP based on problem.

Keyword arguments:

  • prec (default: precision(BigFloat)): the precision of the result
  • verbose (default: false): print progress to the standard output

Retrieving variables from the solution

The solver returns, among others, the primal and dual solutions as PrimalSolution{BigFloat} and DualSolution{BigFloat}. To retrieve the variables, it is possible to use

    matrixvar(dualsol, (:sos, 1))

and freevar(dualsol, :M) for free variables. Similarly, use matrixvars and freevars to iterate over all variables:

    for (k, m) in matrixvars(dualsol)
        # do stuff with the matrix m or the name k
    end 

To compute the objective for this solution, we can use

    objective = objvalue(problem, dualsol)

We might also want to return the problem and the solutions:

    return objective, problem, primalsol, dualsol
end
ClusteredLowRankSolver.DualSolutionType
DualSolution{T}

A dual solution to the semidefinite program, with fields

  • base_ring
  • matrixvars::Dict{Any, Matrix{T}}
  • freevars::Dict{Any, T}
ClusteredLowRankSolver.objvalueFunction
objvalue(objective::Objective, sol::DualSolution)
objvalue(problem::Problem, sol::DualSolution)

Return the objective value of the dual solution with respect to the given objective or problem.

Running the example

Now we can define the problem parameters and run the function

d = 2
R, (u,v,t,w) = polynomial_ring(QQ, 4)
f = -u*t^3 + 4v*t^2*w + 4u*t*w^2 + 2v*w^3 + 4u*t+ 4t^2 - 10v*w - 10w^2 +2
# the function for the weights
p(x) = 1//4 - x^2
ws = [R(1), p(u), p(v), p(t), p(w)]
# call the function
obj, problem, primalsol, dualsol = min_f(f, ws, d)
obj
-3.180096625844998335319568973990471892050902347714187740434770044414862333910399

Rounding the solution

Here we explain how we can heuristically extract an exact optimal solution from the numerical solution. This approach has been developed in the paper [1], and we refer to this paper for more information on the assumptions and the method. See Rounding for more information on how to use the rounding implementation.

Since we expect the solution to be relatively nice in terms of polynomials, we avoid doing the basis change generated by approximatefekete, and try to find the field using the following code.

obj, problem, primalsol, dualsol = min_f(f, ws, 2; basis_change=false)
N, gapprox = find_field(primalsol, dualsol)
N
Number field with defining polynomial -3*x^2 + 40*x + 20
  over rational field

We find a field of degree 2. Trying to round the objective to this field gives

to_field(obj, N, gapprox)
115//18*z - 7//72

Since this is a small expression, this gives some indication that N is the correct field. Now we will try to round the numerical solution to this field. First we convert the problem to the field N. This can be done using the function generic_embedding.

problem = map(x->generic_embedding(x, gen(N), base_ring=N), problem)

In general it is advisable to use coefficient matching to define the semidefinite program for the rounding procedure. For this we need to build a monomial basis for each constraint.

R, x = polynomial_ring(N, 4)
monbasis = basis_monomial(4, x...)

Then we are ready to try the rounding procedure. This will not always succeed without tuning the settings, but in this example it does. See the Rounding page for information on the settings.

success, exactdualsol = exact_solution(problem, primalsol, dualsol;
        FF=N, g=gapprox, monomial_bases=[monbasis])
** Starting computation of basis transformations **
  Block (:sos, 2) of size 5 x 5
  Block (:sos, 3) of size 5 x 5
  Block (:sos, 5) of size 5 x 5
  Block (:sos, 4) of size 5 x 5
    Block (:sos, 4) has 3 kernel vectors. The maximum numerator and denominator are 1 and 1
    After reduction, the maximum number of the basis transformation matrix is 20
  Block (:sos, 1) of size 15 x 15
    Block (:sos, 1) has 4 kernel vectors. The maximum numerator and denominator are 10 and 4
    After reduction, the maximum number of the basis transformation matrix is 1245
** Finished computation of basis transformations (0.514630266s) **
** Transforming the problem and the solution ** (0.198117764s)
** Projection the solution into the affine space **
  Reducing the system from 115 columns to 115 columns
  Constructing the linear system... (0.076150931s)
  Computing an approximate solution in the extension field... (0.080550456s)
  Preprocessing to get an integer system... (0.014183408s)
  Finding the pivots of A using RREF mod p... (0.02300532 0.010886574 s)
  We did not find enough pivots (126 instead of 140)
  Solving the system of size 126 x 147 using the pseudoinverse... 0.019168369s
** Finished projection into affine space (0.44442366s) **
** Checking feasibility **
The slacks are satisfied (checked or ensured by solving the system)
Checking sdp constraints
 done (0.018376967)

If success is true, the solution exactdualsol is guaranteed to be feasible. The affine constraints have been checked in exact arithmetic, and positive semidefiniteness has been checked in ball arithmetic. We can check that we get the same objective as directly rounding the objective to the field N.

objvalue(problem, exactdualsol)
115//18*z - 7//72

The output of the rounding procedure to the terminal contains some information about the number of kernel vectors and the largest numbers in absolute value occuring in those kernel vectors, some progress messages, and some messages about the system that is solved.

ClusteredLowRankSolver.find_fieldFunction
find_field(primalsol, dualsol, max_degree=4; valbound=1e-15, errbound=1e-15, bits=100, max_coeff=1000)

Heuristically find a field over which the kernel can probably be defined.

Only consider values at least valbound in absolute value. Find minimal polynomials such that the chosen entries are approximately generators with an error bound of errbound. Use bits number of bits and reject minimal polynomials with a maximum coefficient of more than max_coeff.

ClusteredLowRankSolver.exact_solutionFunction
exact_solution(problem::Problem, primalsol::PrimalSolution, dualsol::DualSolution; transformed=false, FF = QQ, g=1, settings::RoundingSettings=RoundingSettings(), monomial_bases=nothing)

Compute and return an exact solution to the problem, given a primal solution, dual solution and a field FF with approximate generator g. Return (success, exactdualsol) if transformed=false, and (success, pd_transformed_exactsolution, transformations) if transformed=true.

ClusteredLowRankSolver.generic_embeddingFunction
generic_embedding(exactvalue, g; base_ring=BigFloat)

Convert the exact numbers from a number field to floating point approximations, using a floating point approximation of a generator g of the field.

Convert rationals and integers to the same numbers in base_ring.

ClusteredLowRankSolver.to_fieldFunction
to_field(v, N, g; bits=100, errbound=1e-15)

Find an approximation of v in the number field N, using the approximate generator g of N.