# Optimization Problem Formulation

First, we describe how to load provided or your own non-time-series dependent data as OptDataCEP. Second, we describe the data types within the OptDataCEP and how to access it.

## General

The capacity expansion problem (CEP) is designed as a linear optimization model. It is implemented in the algebraic modelling language JuMP. The implementation within JuMP makes it possible to optimize multiple models in parallel and to handle all steps from data input to result analysis and diagram export in one open-source programming language. The implementation provides scalability based on the provided input data, configuration of the model setup with a single command, collection of results and configuration for further analysis, and the option to run design and operation in separate optimizations.

The basic idea for the energy system is to have a spatial resolution of the energy system in discrete nodes. Each node has demand, non-dispatchable generation, dispatchable generation and storage capacities of varying technologies connected to itself. The different energy system nodes are interconnected with each other by transmission lines. The model is designed to minimize social costs by minimizing the following objective function:

$\min \sum_{account,tech}COST_{account,'EUR/USD',tech} + \sum LL \cdot cost_{LL} + LE \cdot cost_{LE}$

## Sets

The model's scalability relies on the use of sets. The elements of the sets are extracted from the input data and scale the different variables. An overview of the sets is provided in the table below. Depending on the model's configuration, only the necessary sets are initialized.

The sets are set up as a dictionary and organized as set[tech_name][tech_group]=[elements...], where:

• tech_name is the name of the dimension, e.g. tech or node
• tech_group is the name of a group of elements within each dimension, e.g. "all" or "generation". The group "all" always contains all elements of the dimension
• [elements...] is the Array with the different elements, e.g. ["pv", "wind", "gas"]

| name | description |
| --- | --- |
| lines | transmission lines connecting the nodes |
| nodes | spatial energy system nodes |
| tech | generation, conversion, storage, and transmission technologies |
| carrier | carrier that an energy balance is calculated for, e.g. electricity, hydrogen, ... |
| impact | impact categories like EUR or USD, CO2-eq., ... |
| account | fixed costs for installation and yearly expenses, variable costs |
| infrastruct | infrastructure status being either new or existing |
| time K | numeration of the representative periods |
| time T period | numeration of the time intervals within a period |
| time T point | numeration of the time points within a period |
| time I period | numeration of the time intervals of the full input data periods |
| time I point | numeration of the time points of the full input data periods |
| dir transmission | direction of the flow uniform with or opposite to the direction of the line |
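As a rough illustration of this layout, the sketch below builds a small set dictionary by hand and reads one group from it. The element names "pv", "wind" and "gas" come from the list above. The node names and the helper `elements` are assumptions made for this example and are not part of the package's API.

```julia
# Hand-built sets dictionary following set[tech_name][tech_group] = [elements...].
# The contents are illustrative; the package fills the dictionary from the input data.
set = Dict{String,Dict{String,Vector{String}}}(
    "tech" => Dict(
        "all" => ["pv", "wind", "gas"],
        "generation" => ["pv", "wind", "gas"],
    ),
    "nodes" => Dict("all" => ["node_a", "node_b"]),
)

# Hypothetical helper: return the elements of a group, or an empty vector
# if the dimension or group was not initialized for the current configuration.
elements(set, name, group) =
    get(get(set, name, Dict{String,Vector{String}}()), group, String[])

for tech in elements(set, "tech", "generation")
    println("generation technology: ", tech)
end
```

Returning an empty vector for a missing group mirrors the statement that only the sets needed by the configuration are initialized, so loops over an absent group simply do nothing.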

## Variables

The variables can have different types:

• cv: cost variable - information of the costs
• dv: design variable - information of the energy system design
• ov: operation variable - information of the energy system operation
• sv: slack variable - information of unmet demands or exceeded emission limits

An overview of the variables used in the CEP is provided in the table:

| name | type | dimensions | unit | description |
| --- | --- | --- | --- | --- |
| COST | cv | [account,impact,tech] | EUR or USD, kg-LCA-categories | Costs |
| CAP | dv | [tech,infrastruct,node] | MW | Capacity |
| GEN | ov | [tech,carrier,t,k,node] | MW | Generation |
| SLACK | sv | [carrier,t,k,node] | MW | Power gap, not provided by installed CAP |
| LL | sv | [carrier] | MWh | LostLoad: generation gap, not provided by installed CAP |
| LE | sv | [impact] | LCA-categories | LostEmission: amount of emissions by which the installed CAP exceeds the emission constraint |
| INTRASTOR | ov | [tech,carrier,t,k,node] | MWh | Storage level within a period |
| INTERSTOR | ov | [tech,carrier,i,node] | MWh | Storage level between periods of the full time series |
| FLOW | ov | [tech,carrier,dir,t,k,line] | MW | Flow over transmission line |
| TRANS | ov | [tech,infrastruct,lines] | MW | Maximum capacity of transmission lines |

## Mathematical formulation

Note

The mathematical formulation depends on the specific model configuration. The different configurations are introduced in Running the Capacity Expansion Problem. The specific equations that are applied are tracked by the model itself and can be viewed as explained in Equations.

We explain the equations used for a simple optimization model with dispatchable generation, non-dispatchable generation, and a given demand:

$$
\begin{aligned}
\text{min }&\sum_{acc,tech}COST_{acc,imp_{money},tech} + \sum_{n} \left( LL_{n} \cdot c_{ll} \right) + \sum_{imp} \left(LE_{imp} \cdot c_{le,imp}\right)\\
\text{s.t. }&\\
COST_{acc,imp,tech} &= \sum_{t,k,n}GEN_{tech,car(tech),t,k,n}\cdot w_{k} \cdot \Delta t_{t,k} \cdot c_{acc,tech,imp} \quad\forall\ acc \in \{var\}\\
COST_{acc,imp,tech} &= yf \cdot \sum_{n}CAP_{tech,'new',n} \cdot c_{acc,tech,imp} \quad\forall\ acc \in \{fix\}\\
yf &= \frac{\sum_{t,k}\Delta t_{t,k}\cdot w_{k}}{8760\,\text{h}}\\
CAP_{tech,'ex',n} &= existing_{tech,n}\\
0 &\leq GEN_{tech,car(tech),t,k,n} \leq \sum_{infr} CAP_{tech,infr,n} \quad\forall\ tech \in \mathbf{tech}_{disp}\\
0 &\leq GEN_{tech,car(tech),t,k,n} \leq \sum_{infr} CAP_{tech,infr,n} \cdot z_{tech,n,t,k} \quad\forall\ tech \in \mathbf{tech}_{nondisp}\\
GEN_{tech,car(tech),t,k,n} &= - \sum_{infr} CAP_{tech,infr,n} \cdot z_{demand,n,t,k} \quad\forall\ tech \in \mathbf{tech}_{demand}\\
\sum_{acc,tech} COST_{acc,imp,tech} &\leq LE_{imp} + lim_{imp}\cdot\sum_{n,t,k}\left(w_{k}\cdot \Delta t_{t,k} \cdot z_{demand,n,t,k}\right) \quad\forall\ imp \in \mathbf{imp}_{lca}\\
LL_{n} &= \sum_{t,k} \left( SLACK_{t,k,n}\cdot w_{k} \cdot \Delta t_{t,k}\right)\\
0 &= \sum_{tech,n}GEN_{tech,t,k,n} + SLACK_{t,k,n}
\end{aligned}
$$

The objective function minimizes total system costs, where $COST$ is the cost of the different technologies, $LL$ is the lost load, $c_{ll}$ is the variable cost of lost load, $LE$ is the lost emissions, and $c_{le}$ is the variable cost of lost emissions. The variable costs are calculated from the generation $GEN$, the weight $w_{k}$ of representative period $k$, the time step length $\Delta t$, and the variable cost per unit of electric energy $c_{acc,tech,imp}$. The fixed costs are calculated from the installed capacity $CAP$ and the year factor $yf$, which states how many years are represented by the original time series. The generation of dispatchable and non-dispatchable technologies is limited by the installed capacities, and for non-dispatchable generation additionally by an availability factor $z$. The existing capacity is fixed to the provided input values. The demand is multiplied by the installed demand capacity and fixed as a negative generation. The emissions are limited by the emission constraints, which can be exceeded by the lost emissions. The sum of generation and slack is fixed to zero. The slack is positive if the dispatchable and non-dispatchable generation cannot meet the demand.
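The year factor and the two cost equations can be checked by hand on a small case. The following sketch uses plain Julia and made-up numbers (two representative days with hourly steps, one technology at one node); none of the values or variable names come from the package's data.

```julia
# Worked example of the year factor and the cost equations.
# All numbers are made up for illustration.
Δt = fill(1.0, 24, 2)      # time step length in h, indexed [t, k]
w = [200.0, 165.0]         # weight of each representative period k

# yf = sum_{t,k} Δt[t,k] * w[k] / 8760 h
yf = sum(Δt[t, k] * w[k] for t in 1:24, k in 1:2) / 8760

gen = fill(50.0, 24, 2)    # GEN in MW for one technology at one node
c_var = 30.0               # variable cost in EUR/MWh
c_fix = 40_000.0           # fixed cost in EUR/MW per year
cap_new = 80.0             # new capacity in MW

# Variable costs: generation weighted by period weight and time step length.
cost_var = sum(gen[t, k] * w[k] * Δt[t, k] * c_var for t in 1:24, k in 1:2)

# Fixed costs: new capacity times yearly fixed cost, scaled by the year factor.
cost_fix = yf * cap_new * c_fix

println("yf = ", yf, ", variable cost = ", cost_var, " EUR, fixed cost = ", cost_fix, " EUR")
```

Because the weights sum to 365 days of 24 hours each, the two representative days stand for exactly one year in this example, so the fixed costs are not rescaled.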

## Running the Capacity Expansion Problem

Note

The CEP model can be run with many configurations. The configurations do not interfere with each other, but the provided input data must support the chosen configuration, e.g. contain lines for transmission to work.

An overview is provided in the following table:

| description | unit | configuration | values | type | default value |
| --- | --- | --- | --- | --- | --- |
| enforce an emission-limit | kg-impact/MWh-carrier | limit_emission | Dict{String,Number}(impact/carrier=>value) | ::Dict{String,Number} | Dict{String,Number}() |
| including existing infrastructure (no extra costs) and limit infrastructure | - | infrastructure | Dict{String,Array}("existing"=>[tech-groups...], "limit"=>[tech-groups...]) | ::Dict{String,Array} | Dict{String,Array}("existing"=>["demand"]) |
| type of storage implementation | - | storage_type | "none", "simple" or "seasonal" | ::String | "none" |
| allowing conversion (necessary for storage) | - | conversion | true or false | ::Bool | false |
| allowing demand | - | demand | true or false | ::Bool | true |
| allowing dispatchable generation | - | dispatchable_generation | true or false | ::Bool | true |
| allowing non dispatchable generation | - | non_dispatchable_generation | true or false | ::Bool | true |
| allowing transmission | - | transmission | true or false | ::Bool | false |
| fix. installed capacities to dispatch problem | - | fixed_design_variables | design variables from design run or nothing | ::OptVariables | nothing |
| allowing lost load (necessary for dispatch) | price/MWh-carrier | lost_load_cost | Dict{String,Number}(carrier=>value) | ::Dict{String,Number} | Dict{String,Number}() |
| allowing lost emission (necessary for dispatch) | price/kg-impact | lost_emission_cost | Dict{String,Number}(impact=>value) | ::Dict{String,Number} | Dict{String,Number}() |
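To make the table more concrete, the sketch below collects a few of these settings in a named tuple and runs a small consistency check before they would be handed to run_opt. The keyword names and types are taken from the table. The key "CO2/electricity", the numeric values, and the function `check_config` are assumptions for illustration only and are not part of the package.

```julia
# Keyword configuration using the names and types from the table above.
config = (
    storage_type = "simple",
    conversion = true,   # the table marks conversion as necessary for storage
    transmission = false,
    limit_emission = Dict{String,Number}("CO2/electricity" => 100),
    lost_load_cost = Dict{String,Number}("electricity" => 1e6),
)

# Hypothetical sanity check (not part of the package).
function check_config(cfg)
    cfg.storage_type in ("none", "simple", "seasonal") ||
        error("unknown storage_type: $(cfg.storage_type)")
    cfg.storage_type == "none" || cfg.conversion ||
        error("storage requires conversion=true")
    return true
end

check_config(config)

# With loaded data, the configuration would be passed on as keywords, e.g.
# result = run_opt(ts_clust_data, cep_data, optimizer; config...)
```

Keeping the settings in one object makes it easier to reuse the same configuration for a later operational run.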

They can be applied in the following way:

CapacityExpansion.run_opt - Function
run_opt(ts_data::ClustData,opt_data::OptDataCEP,config::Dict{String,Any},optimizer::DataType)

Organizes the actual setup and run of the CEP problem. This function should not be called by a user directly, but from within the other run_opt functions. Required elements are:

• ts_data: The time-series data.
• opt_data: In this case the OptDataCEP that contains information on costs, nodes, techs and for transmission also on lines.
• config: This includes all the settings for the design optimization problem formulation.
• optimizer: The optimizer to use, e.g. Clp (using Clp; optimizer=Clp.Optimizer) or Gurobi (using Gurobi; optimizer=Gurobi.Optimizer).

run_opt(ts_data::ClustData,opt_data::OptDataCEP,config::Dict{String,Any},fixed_design_variables::Dict{String,Any},optimizer::DataType;lost_el_load_cost::Number=Inf,lost_CO2_emission_cost::Number)

This function runs the operational optimization problem only, with fixed design variables. Provide the fixed design variables and the config of the previous step (design run or another operational run). Required elements are:

• ts_data: The time-series data, which should be the original time-series data for this operational run. The keys(ts_data.data) need to match the [time_series_name]-[node].
• opt_data: In this case the OptDataCEP that contains information on costs, nodes, techs and for transmission also on lines. - Should be the same as in the design run.
• config: This includes all the previous settings for the design optimization problem formulation and ensures that the configuration is the same.
• fixed_design_variables: All the design variables that are determined by the previous design run.
• optimizer: The optimizer to use, e.g. Clp (using Clp; optimizer=Clp.Optimizer) or Gurobi (using Gurobi; optimizer=Gurobi.Optimizer).

What you can change in the config:

• lost_load_cost: Dictionary with numbers indicating the lost load price per carrier (e.g. electricity in price/MWh should be greater than 1e6), give Inf for no SLACK and LL (Lost Load - a variable for unmet demand by the installed capacities)
• lost_emission_cost: Dictionary with numbers indicating the emission price/kg-emission (Suggestion: around 700), give Inf for no LE (Lost Emissions - a variable for emissions that will exceed the limit in order to provide the demand with the installed capacities)
run_opt(ts_data::ClustData,
opt_data::OptDataCEP,
optimizer::DataType;
descriptor::String="",
storage_type::String="none",
demand::Bool=true,
dispatchable_generation::Bool=true,
non_dispatchable_generation::Bool=true,
conversion::Bool=false,
transmission::Bool=false,
lost_emission_cost::Dict{String,Number}=Dict{String,Number}(),
limit_emission::Dict{String,Number}=Dict{String,Number}(),
infrastructure::Dict{String,Array}=Dict{String,Array}("existing"=>["demand"],"limit"=>Array{String,1}()),
scale::Dict{Symbol,Int}=Dict{Symbol,Int}(:COST => 1e9, :CAP => 1e3, :GEN => 1e3, :SLACK => 1e3, :INTRASTOR => 1e3, :INTERSTOR => 1e6, :FLOW => 1e3, :TRANS =>1e3, :LL => 1e6, :LE => 1e9),
print_flag::Bool=true,
optimizer_config::Dict{Symbol,Any}=Dict{Symbol,Any}(),
round_sigdigits::Int=9)

Wrapper function for the CEP problem. The type of opt_data selects the optimization problem: passing an OptDataCEP identifies it as a CEP problem. Required elements are:

• ts_data: The time-series data, which could either be the original input data or some aggregated time-series data. The keys(ts_data.data) need to match the [time_series_name]-[node].
• opt_data: The OptDataCEP that contains information on costs, nodes, techs and for transmission also on lines.
• optimizer: The optimizer to use, e.g. Clp (using Clp; optimizer=Clp.Optimizer) or Gurobi (using Gurobi; optimizer=Gurobi.Optimizer).

Options to tweak the model are:

• descriptor: String with the name of this particular model, e.g. "kmeans-10-co2-500"
• storage_type: String "none" for no storage, "simple" to include simple storage (intra-day only), or "seasonal" to include seasonal storage (inter-day)
• demand: Bool true or false to include or exclude this technology group
• dispatchable_generation: Bool true or false to include or exclude this technology group
• non_dispatchable_generation: Bool true or false to include or exclude this technology group
• conversion: Bool true or false to include or exclude this technology group
• transmission: Bool true or false to include or exclude this technology group. If no transmission is modeled, a 'copperplate' is assumed with no transmission restrictions between the nodes
• limit_emission: Dictionary with numbers limiting the kg-emission-eq./MWh (e.g. CO2 normally in a range from 5-1250 kg-CO2-eq/MWh), give Inf or omit the keyword if unlimited
• lost_load_cost: Dictionary with numbers indicating the lost load price per carrier (e.g. electricity in price/MWh should be greater than 1e6), give Inf for no SLACK and LL (Lost Load - a variable for unmet demand by the installed capacities). Example: lost_load_cost=Dict{String,Number}("electricity"=>1e6)
• lost_emission_cost: Dictionary with numbers indicating the emission price/kg-emission (Suggestion: around 700), give Inf for no LE (Lost Emissions - a variable for emissions that will exceed the limit in order to provide the demand with the installed capacities). Example: lost_emission_cost=Dict{String,Number}("CO2"=>700)
• infrastructure: Dictionary with Arrays indicating which technology groups should have existing infrastructure ("existing" => ["demand","dispatchable_generation"]) and which technology groups should have infrastructure limited ("limit" => ["non_dispatchable_generation"])
• scale: Dict{Symbol,Int} with a number for each variable (like :COST) to scale the variables and equations to similar quantities. Try to achieve that the numerical model only has to solve numerical variables in a range between 0.01 and 100. The following equation is used as a relationship between the real value, which is provided in the solution (real-VAR), and the numerical variable, which is used within the model formulation (VAR): real-VAR [EUR, MW or MWh] = scale[:VAR] ⋅ VAR.
• print_flag: Bool to decide if a summary of the optimization result should be printed.
• optimizer_config: Each Symbol and the corresponding value in the Dictionary is passed on to the with_optimizer function in addition to the optimizer. For Gurobi an example Dictionary could look like Dict{Symbol,Any}(:Method => 2, :OutputFlag => 0, :Threads => 2). More information can be found in the optimizer-specific documentation.
• round_sigdigits: Can be used to round the values of the result to a certain number of significant digits.

## Transmission

A CapacityExpansion model can be run with or without technology transmission.

Note

If the technology transmission is not modelled (transmission=false), the transmission between nodes is not restricted, which is equivalent to a copperplate assumption.

Note

Include transmission=true and infrastructure = Dict{String,Array}("existing"=>[...,"transmission"], "limit"=>[...,"transmission"]) to model existing transmission. This sets the existing transmission TRANS to the values defined in the lines.csv file in the column power_ex, and limits the transmission by the values defined in lines.csv in the column power_lim. If no new transmission should be set up, use the same values for the existing transmission (column power_ex) and the limit (column power_lim).
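A small sketch may help to see how the two columns interact. The infrastructure dictionary follows the note above. The two example lines and the reading that power_lim caps the total capacity of a line (so that the room for new transmission is power_lim minus power_ex) are assumptions for illustration, inferred from the remark about using equal values.

```julia
# Settings for modelling existing and limited transmission, as described in the note.
transmission = true
infrastructure = Dict{String,Array}(
    "existing" => ["demand", "transmission"],
    "limit" => ["transmission"],
)

# Hypothetical rows of lines.csv; only the two columns named above are used.
lines = [
    (name = "line_1", power_ex = 500.0, power_lim = 800.0),
    (name = "line_2", power_ex = 300.0, power_lim = 300.0),
]

# Assumed reading: the total capacity per line is capped at power_lim,
# so the headroom for new transmission is power_lim - power_ex.
headroom = Dict(l.name => l.power_lim - l.power_ex for l in lines)

for l in lines
    println(l.name, ": up to ", headroom[l.name], " MW of new transmission")
end
```

Under this reading, line_2 cannot be expanded because its limit equals its existing capacity.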

## Solver

The package provides no optimizer, and a solver has to be added separately. For the linear optimization problem suggestions are:

• Clp as an open-source solver
• Gurobi as a proprietary solver with free academic licenses. Gurobi is faster than Clp, and we prefer it in the academic setting.
• CPLEX as an alternative proprietary solver

Install the corresponding Julia package for the solver and pass its optimizer, e.g.:

using Pkg
Pkg.add("Clp")  # install the solver package once
using Clp
optimizer=Clp.Optimizer

## Solver Configuration

Depending on the Solver, different solver configurations are possible. The information is always provided as Dict{Symbol,Any}. The keys of the dictionary are the parameters and the values of the dictionary are the values passed to the solver.

For example, the Gurobi solver can be configured to have no OutputFlag and run on two threads (per julia thread) the following way:

optimizer_config=Dict{Symbol,Any}(:OutputFlag => 0, :Threads => 2)

Further information on possible keys for Gurobi can be found at Gurobi parameter description.
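The reason for the Dict{Symbol,Any} type is that Julia can splat such a dictionary into keyword arguments. The sketch below demonstrates the mechanism with a stand-in function. `fake_optimizer` is hypothetical and only echoes the keywords it receives; it is not how the package or any solver is implemented.

```julia
optimizer_config = Dict{Symbol,Any}(:OutputFlag => 0, :Threads => 2)

# Hypothetical stand-in for a solver constructor that accepts keyword parameters.
fake_optimizer(; kwargs...) = Dict(kwargs)

# Splatting a Dict with Symbol keys after the semicolon turns each key into a keyword.
received = fake_optimizer(; optimizer_config...)

println(received)
```

Keys that are not Symbols cannot be splatted this way, which is why the configuration must use Symbol keys such as :Threads rather than strings.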

## Scaling

The package features the scaling of variables and equations. Scaling variables, which are used in the numerical model, to 0.01 ≤ x ≤ 100 and scaling equations to 3⋅x = 1 instead of 3000⋅x = 1000 improves the shape of the optimization space and significantly reduces the computational time used to solve the numerical model.

The values are only scaled within the numerical model formulation, where we call the variable VAR, but the values are unscaled in the solution, which we call real-VAR. The following logic is used to scale the variables:

real-VAR [EUR, USD, MW, or MWh] = scale[:VAR] ⋅ VAR

0.01 ≤ VAR ≤ 100 ⇔ 0.01 ≤ real-VAR / scale[:VAR] ≤ 100

The equations are scaled with the scaling parameter of the first variable, which is scale[:COST] in the following example:

scale[:COST]⋅COST = 10⋅scale[:CAP]⋅CAP

⇔ COST = 10⋅(scale[:CAP]/scale[:COST])⋅CAP
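The relationship between scaled and real values can be verified numerically. The following sketch uses the example equation with the coefficient 10 and the default scale values for :COST and :CAP. The capacity value is made up, and Float64 is used for the dictionary instead of Int purely for convenience in this example.

```julia
scale = Dict{Symbol,Float64}(:COST => 1e9, :CAP => 1e3)

real_cap = 25_000.0                  # real capacity in MW (made-up value)
cap = real_cap / scale[:CAP]         # numerical variable used inside the model

# Scaled form of the equation: COST = 10 * (scale[:CAP] / scale[:COST]) * CAP
coeff = 10 * (scale[:CAP] / scale[:COST])
cost = coeff * cap

# Converting back to the real value reproduces real-COST = 10 * real-CAP.
real_cost = scale[:COST] * cost

println("CAP = ", cap, ", COST = ", cost, ", real-COST = ", real_cost)
```

Only the coefficients of the equation change through scaling, so the unscaled solution is the same as that of the original problem.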

### Change scaling parameters

Changing the scaling parameters is useful if the data you use represents a much smaller or bigger energy system than the ones representing Germany and California provided in this package. Determine the right scaling parameters by checking the real values of COST, CAP, GEN, ... (real-VAR) in a solution using your data. Select the scaling parameters to match the following: 0.01 ≤ real-VAR / scale[:VAR] ≤ 100. Create a dictionary with the new scaling parameters for EACH variable and include it as the optional scale input to overwrite the default scale in run_opt:

scale=Dict{Symbol,Int}(:COST => 1e9, :CAP => 1e3, :GEN => 1e3, :SLACK => 1e3, :INTRASTOR => 1e3, :INTERSTOR => 1e6, :FLOW => 1e3, :TRANS =>1e3, :LL => 1e6, :LE => 1e9)
scale_result = run_opt(ts_clust_data,cep_data,optimizer;scale=scale)
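One possible way to pick a scaling parameter from the real values of an earlier solution is sketched below. The helpers `suggest_scale` and `in_range` are hypothetical and not part of the package, and the capacity values are made up. The heuristic centres a power of ten between the smallest and largest non-zero magnitude, which is an assumption of this example rather than a recommendation from the package.

```julia
# Hypothetical helper: pick a power-of-ten scale centred on the magnitudes
# of the non-zero real values.
function suggest_scale(real_values)
    mags = [log10(abs(v)) for v in real_values if v != 0]
    isempty(mags) && return 1.0
    return 10.0^round(Int, (minimum(mags) + maximum(mags)) / 2)
end

# Check the condition 0.01 ≤ real-VAR / scale[:VAR] ≤ 100 for all non-zero values.
in_range(real_values, s) =
    all(v == 0 || 0.01 <= abs(v) / s <= 100 for v in real_values)

real_cap = [0.0, 120.0, 4_500.0, 38_000.0]   # made-up CAP values in MW
s = suggest_scale(real_cap)

println("suggested scale[:CAP] = ", s, ", all values in range: ", in_range(real_cap, s))
```

If the values of one variable span more than four orders of magnitude, no single scaling parameter can satisfy the range for all of them, and `in_range` returns false.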

When adding a new variable to the model, two further steps are needed:

• Extend the default scale dictionary in the src/optim_problems/run_opt file to include the new variable as well.
• Include the new variable in the problem formulation in the src/optim_problems/opt_cep-file. Reformulate the equations by dividing them by the scaling parameter of the first variable, which is scale[:COST] in the following example:
• scale[:COST]⋅COST = 10⋅scale[:CAP]⋅CAP + 100
• ⇔ COST = 10⋅(scale[:CAP]/scale[:COST])⋅CAP + 100/scale[:COST]