Tutorial
In this tutorial you will learn
- to build a
Problem
, which includes- creating the objective
- creating a constraint
- combining them into a
Problem
- to solve a
Problem
- to find an exact optimal solution
For explanation on more intricate parts of the interface, see Advanced modeling; for additional examples see the Examples section.
Running example
We will explain the interface using an example from polynomial optimization. Suppose we want to find the global minimum (or a lower bound on this) of the polynomial
\[f(u,v,t,w) = -ut^3+4vt^2w + 4 utw^2 + 2vw^3 + 4 ut+ 4t^2 -10 vw - 10 w^2 +2\]
in the domain $[-1/2, 1/2]^4$. We can relax the problem using a sum-of-squares characterization:
\[\begin{aligned} \text{maximize} \quad & M & \\ \text{subject to} \quad & f-M & = s_0 + \sum_i w_i s_i. \\ \end{aligned}\]
where the $w_i$ are polynomial weights describing the domain and $s_i$ are sum-of-squares polynomials. In this example we use
\[\begin{aligned} [-1/2, 1/2]^4 = \{(u,v,t,w): p(u), p(v), p(t), p(w) \geq 0\} \end{aligned}\]
with $p(x) = 1/4 - x^2$. Given a vector $m$ whose entries form a basis of the space of polynomials up to degree $d$, we can parametrize a sum-of-squares polynomial $s$ of degree $2d$ by a positive semidefinite matrix $Y$ with
\[ s = ⟨ Y, mm^{\sf T} ⟩.\]
The problem parameters are now
- the polynomial $f$
- the weight polynomials $w_i$
- the degree of the relaxation $d$
We will start with building a function that takes these parameters, and constructs the Problem
. Since the user will give the polynomial f
and the weights, we need to extract the polynomial ring and the polynomial variables.
using ClusteredLowRankSolver, Nemo
function min_f(f, ws, d; basis_change=true)
# extract the polynomial ring
R = parent(f)
x = gens(R)
n = nvars(R)
Defining the objective
First we will define the Objective
. This consists of a constant offset, a dictionary with the coefficient matrices for the positive semidefinite matrix variables appearing in the objective, and a dictionary with the scalar coefficients for the free variables used in the objective. In this example,
- the constant offset is
0
, - the dictionary of matrix coefficients is empty because we do not use the matrix variables in the objective,
- and the dictionary of scalar coefficients has one entry corresponding to $M$
This gives the first part of our function:
# Define the objective
obj = Objective(0, Dict(), Dict(:M => 1))
ClusteredLowRankSolver.Objective
— TypeObjective(constant, matrixcoeff::Dict, freecoeff::Dict)
The objective for the Problem.
Arguments:
constant
: A constant offset of the objective value.matrixcoeff
: ADict
with positive semidefinite matrix variable names as keys and the objective matrices as values.freecoeff
: ADict
with free variable names as keys and the coefficients as values.
Defining the constraints
Now we will define the constraints. In ClusteredLowRankSolver
a Constraint
is of the form
\[ \sum_i \langle A_i, Y_i \rangle + \sum_j b_j y_j = c. \]
To define it, we need the right-hand side $c$, the matrix coefficients $A_i$ for the positive semidefinite matrix variables, and the coefficients $b_j$ for the free variables. Here the scalars $c$ and $b_j$ and the entries of $A_i$ can either be constants or polynomials. When these are polynomials, we also need a unisolvent set of samples. In this example,
- the right-hand side $c$ is the polynomial $f$,
- the matrix coefficients $A_i$ are the weight polynomials times a suitable low rank matrix of the form $mm^{\sf T}$,
- and we have one free variable $M$, with coefficient $1$.
Defining the polynomial basis and the samples
We will first define the vector $m$ of basis polynomials, and the samples. For simplicity we will use the monomial basis, and the samples defined by the rational points in the simplex with denominator $2d$.
basis = basis_monomial(2d, x...)
samples = sample_points_simplex(n, 2d; T=Rational{BigInt})
See Sampling for more explanation on sampling and unisolvence. In more complicated situations, we can improve the basis with approximatefekete
. This orthogonalizes the basis with respect to the sample points, and if samples
contains more samples than needed, this selects a good subset of the samples. We include this in the function using a keyword argument basis_change=true
and
if basis_change
basis, samples = approximatefekete(basis, samples)
end
This returns a basis of SampledMPolyRingElem
's, which we can only evaluate at samples from samples
. Common operations such as multiplications and additions work with these sampled polynomials, but operations such as extracting the degree is not possible since that requires expressing the polynomials in a graded basis. However, if the initial basis
is ordered on degree, the final basis
will have the same ordering, so we store the degrees using the code
degrees = total_degree.(basis_monomial(2d, x...))
Defining the coefficients for the constraint
Now we are ready to define the low-rank matrix coefficients for the sum of squares polynomials. For each weight polynomial, we need to add the matrix coefficient corresponding to that weight to the dictionary of matrix coefficients. Since we want polynomials of degree 2d
, we first need to select the part of the basis that we use in the sum-of-squares polynomials.
psd_dict = Dict()
for (i, w) in enumerate(ws)
basispart = [basis[j] for j in eachindex(basis) if 2degrees[j] + total_degree(w) <= 2d]
psd_dict[(:sos, i)] = LowRankMatPol([w], [basispart])
end
The size of the matrices is implicitely given by the size of the LowRankMatPol
, which is defined by the prefactors and the rank one terms.
Similarly, we can create the dictionary with the free variables using
free_dict = Dict(:M => 1)
Defining the constraint
Now we can construct the Constraint
with
con = Constraint(f, psd_dict, free_dict, samples)
From the constraint, the dictionaries and coefficients can be retrieved using matrixcoeff
(s) and freecoeff
s.
Specifying non-polynomial constraints works similarly, in which case no samples should be supplied to the Constraint
struct.
ClusteredLowRankSolver.Constraint
— TypeConstraint(constant, matrixcoeff, freecoeff[, samples, scalings])
Models a polynomial constaint of the form
\[ ∑_i ⟨ A_i(x), Y_i ⟩ + ∑_j b_j(x) y_j = c(x)\]
using sampling on samples
. Here the samples are only required if $A_i$, $b_j$, and $c$ are polynomials. When the coefficient matrix $A_i$ has block structure with equal sized blocks, the Block
struct can be used as key to indicate to which subblock the given matrix corresponds.
Arguments:
constant
: The right hand side $c(x)$matrixcoeff::Dict
: The coefficient matrices for the positive semidefinite matrix variables.freecoeff::Dict
: The coefficients for the free variables.samples::Vector
: The sample set on which the constraint should be satisfied. Required for polynomial constraints.scalings::Vector
: Optional; scale the constraint with a factor depending on the sample index.
ClusteredLowRankSolver.LowRankMatPol
— TypeLowRankMatPol(lambda::Vector, vs::Vector{Vector}[, ws::Vector{Vector}])
The matrix $∑_i λ_i v_i w_i^{\sf T}$ where $v_i$ are the entries of vs
and $w_i$ the entries of ws
If ws
is not specified, use ws = vs
.
ClusteredLowRankSolver.approximatefekete
— Functionapproximatefekete(basis, samples; base_ring=BigFloat) -> basis, samples
Compute approximate fekete points based on samples and a corresponding orthogonal basis. The basis consists of sampled polynomials, sampled at samples
.
This preserves a degree ordering of basis
if present.
ClusteredLowRankSolver.matrixcoeff
— Functionmatrixcoeff(x::Union{Constraint, Objective}, name)
Return the matrix coefficient corresponding to name
ClusteredLowRankSolver.matrixcoeffs
— Functionmatrixcoeffs(x::Union{Constraint, Objective})
Return the dictionary of matrix coefficients
ClusteredLowRankSolver.freecoeff
— Functionfreecoeff(x::Union{Constraint, Objective}, name)
Return the coefficient of the free variable corresponding to name
ClusteredLowRankSolver.freecoeffs
— Functionfreecoeffs(x::Union{Constraint, Objective})
Return the dictionary of coefficients for the free variables
Defining the problem
Now that we have the objective and the constraint, we can create the Problem
with
problem = Problem(Maximize(obj), [con])
Here the first argument gives the objective and the optimization sense (maximization or minimization). The second argument is a vector of the constraints defining the feasible region (in this example, only the constraint con
). The objective and constraints can be retrieved from the problem using objective
and constraints
. Instead of first constructing all constraints and then defining the Problem
, it is also possible to first define the problem using the objective, and then add the constraints with the function addconstraint!
ClusteredLowRankSolver.Problem
— Type Problem(maximize::Bool, objective::Objective, constraints::Vector{Constraint})
Problem(obj::Union{Maximize, Minimize}, constraints::Vector)
Combine the objective and constraints into a low-rank polynomial problem.
ClusteredLowRankSolver.Maximize
— TypeMaximize(obj)
Maximize the objective
ClusteredLowRankSolver.Minimize
— TypeMinimize(obj)
Minimize the objective
ClusteredLowRankSolver.objective
— Functionobjective(x::Union{Maximize, Minimize, Problem})
Return the objective.
ClusteredLowRankSolver.constraints
— Functionconstraints(problem::Problem)
Return the constraints of problem
.
ClusteredLowRankSolver.addconstraint!
— Functionaddconstraint!(problem, constraint)
Add constraint
to the constraints of problem
.
Checking for obvious mistakes
Some basic checks on the problem and/or semidefinite program can be done using check_problem
and check_sdp!
. The check_problem
function checks that
- the sizes of the vectors in the low-rank constraint matrices are the same,
- all constraints use at least one positive semidefinite matrix variable,
- all variables in the objective are actually used in the constraints.
The check_sdp!
function
- checks that the constraint matrices are symmetric,
- removes empty matrices and zero matrices.
@assert check_problem(problem)
ClusteredLowRankSolver.check_problem
— Functioncheck_problem(problem::LowRankPolProblem)
Check for obvious mistakes in the constraints and objective
ClusteredLowRankSolver.check_sdp!
— Functioncheck_sdp!(sdp::ClusteredLowRankSDP)
Check whether the constraint matrices are symmetric, and remove empty constraint matrices.
Solving the problem
We can solve the Problem
with the function solvesdp
:
status, primalsol, dualsol, time, errorcode = solvesdp(problem; prec=512, duality_gap_threshold=1e-60)
This function has multiple options including for example the number of bits used for solving the semidefinite program (prec
) and how close the solution should be to optimality (duality_gap_threshold
); see the page about the solver for more information.
Alternatively, it is possible to explicitely convert the problem into a semidefinite program using
sdp = ClusteredLowRankSDP(problem)
which can also be solved with
status, sol, time, errorcode = solvesdp(sdp)
This is for example useful when you wish to save the semidefinite program (e.g., using Serialization
), or when you want to perform the extra checks provided by check_sdp!
.
ClusteredLowRankSolver.solvesdp
— Function solvesdp(problem::Problem; kwargs...)
solvesdp(sdp::ClusteredLowRankSDP; kwargs...)
Solve the semidefinite program generated from problem
or sdp
.
Keyword arguments:
prec
(default:precision(BigFloat)
): the precision usedgamma
(default:0.9
): the step length reduction; a maximum step length of α reduces to a step length ofmax(gamma*α,1)
beta_(in)feasible
(default:0.1
(0.3
)): the amount mu is tried to be reduced by in each iteration, for (in)feasible solutionsomega_p/d
(default:10^10
): the starting matrix variable for the primal/dual isomega_p/d*I
maxiterations
(default:500
): the maximum number of iterationsduality_gap_threshold
(default:10^-15
): how near to optimal the solution needs to beprimal/dual_error_threshold
(default:10^-30
): how feasible the primal/dual solution needs to bemax_complementary_gap
(default:10^100
): the maximum ofdot(X,Y)/nrows(X)
allowedneed_primal_feasible/need_dual_feasible
(default:false
): terminate when the solution is primal/dual feasibleverbose
(default:true
): print information after every iteration if truestep_length_threshold
(default:10^-7
): the minimum step length allowedprimalsol
(default:nothing
): start from the solution(primalsol, dualsol)
if bothprimalsol
anddualsol
are givendualsol
(default:nothing
): start from the solution(primalsol, dualsol)
if bothprimalsol
anddualsol
are given
ClusteredLowRankSolver.ClusteredLowRankSDP
— TypeClusteredLowRankSDP(problem::Problem; prec=precision(BigFloat), verbose=false)
Define a ClusteredLowRankSDP
based on problem
.
Keyword arguments:
prec
(default:precision(BigFloat)
): the precision of the resultverbose
(default: false): print progress to the standard output
Retrieving variables from the solution
The solver returns, among others, the primal and dual solutions as PrimalSolution{BigFloat}
and DualSolution{BigFloat}
. To retrieve the variables, it is possible to use
matrixvar(dualsol, (:sos, 1))
and freevar(dualsol, :M)
for free variables. Similarly, use matrixvars
and freevars
to iterate over all variables:
for (k, m) in matrixvars(dualsol)
# do stuff with the matrix m or the name k
end
To compute the objective for this solution, we can use
objective = objvalue(problem, dualsol)
We might also want to return the problem and the solutions:
return objective, problem, primalsol, dualsol
end
ClusteredLowRankSolver.PrimalSolution
— TypePrimalSolution{T}
A primal solution to the semidefinite program, with fields
base_ring
x::Vector{T}
matrixvars::Dict{Any, Matrix{T}}
ClusteredLowRankSolver.DualSolution
— TypeDualSolution{T}
A dual solution to the semidefinite program, with fields
base_ring
matrixvars::Dict{Any, Matrix{T}}
freevars::Dict{Any, T}
ClusteredLowRankSolver.matrixvar
— Functionmatrixvar(sol, name)
Return the matrix variable corresponding to name
ClusteredLowRankSolver.matrixvars
— Functionmatrixvars(sol)
Return the dictionary of matrix variables
ClusteredLowRankSolver.freevar
— Functionfreevar(sol, name)
Return the free variable corresponding to name
ClusteredLowRankSolver.freevars
— Functionfreevars(sol)
Return the dictionary of the free variables
ClusteredLowRankSolver.objvalue
— Functionobjvalue(objective::Objective, sol::DualSolution)
objvalue(problem::Problem, sol::DualSolution)
Return the objective value of the dual solution with respect to the given objective or problem.
Running the example
Now we can define the problem parameters and run the function
d = 2
R, (u,v,t,w) = polynomial_ring(QQ, 4)
f = -u*t^3 + 4v*t^2*w + 4u*t*w^2 + 2v*w^3 + 4u*t+ 4t^2 - 10v*w - 10w^2 +2
# the function for the weights
p(x) = 1//4 - x^2
ws = [R(1), p(u), p(v), p(t), p(w)]
# call the function
obj, problem, primalsol, dualsol = min_f(f, ws, d)
obj
-3.180096625844998335319568973990471892050902347714187740434770044414862333910399
Rounding the solution
Here we explain how we can heuristically extract an exact optimal solution from the numerical solution. This approach has been developed in the paper [1], and we refer to this paper for more information on the assumptions and the method. See Rounding for more information on how to use the rounding implementation.
Since we expect the solution to be relatively nice in terms of polynomials, we avoid doing the basis change generated by approximatefekete
, and try to find the field using the following code.
obj, problem, primalsol, dualsol = min_f(f, ws, 2; basis_change=false)
N, gapprox = find_field(primalsol, dualsol)
N
Number field with defining polynomial -3*x^2 + 40*x + 20
over rational field
We find a field of degree 2. Trying to round the objective to this field gives
to_field(obj, N, gapprox)
115//18*z - 7//72
Since this is a small expression, this gives some indication that N
is the correct field. Now we will try to round the numerical solution to this field. First we convert the problem to the field N
. This can be done using the function generic_embedding
.
problem = map(x->generic_embedding(x, gen(N), base_ring=N), problem)
In general it is advisable to use coefficient matching to define the semidefinite program for the rounding procedure. For this we need to build a monomial basis for each constraint.
R, x = polynomial_ring(N, 4)
monbasis = basis_monomial(4, x...)
Then we are ready to try the rounding procedure. This will not always succeed without tuning the settings, but in this example it does. See the Rounding page for information on the settings.
success, exactdualsol = exact_solution(problem, primalsol, dualsol;
FF=N, g=gapprox, monomial_bases=[monbasis])
** Starting computation of basis transformations **
Block (:sos, 2) of size 5 x 5
Block (:sos, 3) of size 5 x 5
Block (:sos, 5) of size 5 x 5
Block (:sos, 4) of size 5 x 5
Block (:sos, 4) has 3 kernel vectors. The maximum numerator and denominator are 1 and 1
After reduction, the maximum number of the basis transformation matrix is 20
Block (:sos, 1) of size 15 x 15
Block (:sos, 1) has 4 kernel vectors. The maximum numerator and denominator are 10 and 4
After reduction, the maximum number of the basis transformation matrix is 1245
** Finished computation of basis transformations (0.514630266s) **
** Transforming the problem and the solution ** (0.198117764s)
** Projection the solution into the affine space **
Reducing the system from 115 columns to 115 columns
Constructing the linear system... (0.076150931s)
Computing an approximate solution in the extension field... (0.080550456s)
Preprocessing to get an integer system... (0.014183408s)
Finding the pivots of A using RREF mod p... (0.02300532 0.010886574 s)
We did not find enough pivots (126 instead of 140)
Solving the system of size 126 x 147 using the pseudoinverse... 0.019168369s
** Finished projection into affine space (0.44442366s) **
** Checking feasibility **
The slacks are satisfied (checked or ensured by solving the system)
Checking sdp constraints
done (0.018376967)
If success
is true
, the solution exactdualsol
is guaranteed to be feasible. The affine constraints have been checked in exact arithmetic, and positive semidefiniteness has been checked in ball arithmetic. We can check that we get the same objective as directly rounding the objective to the field N
.
objvalue(problem, exactdualsol)
115//18*z - 7//72
The output of the rounding procedure to the terminal contains some information about the number of kernel vectors and the largest numbers in absolute value occuring in those kernel vectors, some progress messages, and some messages about the system that is solved.
ClusteredLowRankSolver.find_field
— Functionfind_field(primalsol, dualsol, max_degree=4; valbound=1e-15, errbound=1e-15, bits=100, max_coeff=1000)
Heuristically find a field over which the kernel can probably be defined.
Only consider values at least valbound
in absolute value. Find minimal polynomials such that the chosen entries are approximately generators with an error bound of errbound
. Use bits
number of bits and reject minimal polynomials with a maximum coefficient of more than max_coeff
.
ClusteredLowRankSolver.exact_solution
— Functionexact_solution(problem::Problem, primalsol::PrimalSolution, dualsol::DualSolution; transformed=false, FF = QQ, g=1, settings::RoundingSettings=RoundingSettings(), monomial_bases=nothing)
Compute and return an exact solution to the problem, given a primal solution, dual solution and a field FF
with approximate generator g
. Return (success, exactdualsol)
if transformed=false
, and (success, pd_transformed_exactsolution, transformations)
if transformed=true
.
ClusteredLowRankSolver.generic_embedding
— Functiongeneric_embedding(exactvalue, g; base_ring=BigFloat)
Convert the exact numbers from a number field to floating point approximations, using a floating point approximation of a generator g of the field.
Convert rationals and integers to the same numbers in base_ring.
ClusteredLowRankSolver.to_field
— Functionto_field(v, N, g; bits=100, errbound=1e-15)
Find an approximation of v in the number field N, using the approximate generator g of N.