# Introduction to Catalyst and Julia for New Julia users

The Catalyst tool for the modelling of chemical reaction networks is based in the Julia programming language. While experience in Julia programming is advantageous for using Catalyst, it is not necessary for accessing some of its basic features. This tutorial serves as an introduction to Catalyst for those unfamiliar with Julia, also introducing some basic Julia concepts. Anyone who plans on using Catalyst extensively is recommended to familiarise oneself more thoroughly with the Julia programming language. A collection of resources for learning Julia can be found here, and a full documentation is available here.

Julia can be downloaded here.

*Users who are already familiar with Julia can skip to the Introduction to Catalyst tutorial.*

## Basic Julia usage

On the surface, Julia has many similarities to languages like MATLAB, Python, and R.

*Values* can be assigned to *variables* through the use of a `=`

sign. Values (possibly stored in variables) can be used for most basic computations.

```
length = 2.0
width = 4.0
area = length*width
```

`8.0`

*Functions* take one or more inputs (enclosed by `()`

) and return some output. E.g. the `min`

function returns the minimum of two values

`min(1.0, 3.0)`

`1.0`

A line of Julia code is not required to end with `;`

, however, if it does, the output of that line is not displayed.

`min(1.0, 3.0);`

Each Julia variable has a specific *type*, designating what type of value it is. While not directly required to use Catalyst, this is useful to be aware of. To learn the type of a specific variable, use the `typeof`

function. More information about types can be found here.

`typeof(1.0)`

`Float64`

Here, `Float64`

denotes decimal-valued numbers. Integer-valued numbers instead are of the `Int64`

type.

`typeof(1)`

`Int64`

Finally, we note that the first time some code is run in Julia, it has to be *compiled*. However, this is only required once per Julia session. Hence, the second time the same code is run, it runs much faster. E.g. try running this line of code first one time, and then one additional time. You will note that the second run is much faster.

`rand(100, 100)^3.5;`

```
100×100 Matrix{ComplexF64}:
8421.21-0.00537535im 7073.34+0.0137052im … 7784.07+0.0365634im
9244.55-0.00145854im 7765.04+0.00371819im 8550.42+0.00991687im
8700.99-0.00314425im 7309.28+0.00801346im 8050.43+0.0213625im
9024.09+0.00492705im 7576.51-0.0125549im 8339.45-0.0334581im
8968.59-0.00105853im 7534.03+0.00269874im 8299.17+0.00719931im
8709.33+0.00131373im 7314.98-0.00334946im … 8048.94-0.00893554im
8944.35-0.00855466im 7523.63+0.0218041im 8277.94+0.0581341im
8660.65+0.0012359im 7284.76-0.00314839im 8012.04-0.00838569im
8311.15+0.00718863im 6997.81-0.0183253im 7691.81-0.0488742im
9134.31+0.00222324im 7670.76-0.00566934im 8442.11-0.0151296im
⋮ ⋱
9057.59-6.20057e-5im 7621.23+0.000155366im 8377.66+0.000400678im
8807.85+0.00358462im 7398.78-0.0091363im 8140.62-0.0243582im
9684.34+0.00186518im 8139.03-0.00475464im 8959.45-0.0126802im
9070.44+0.00367478im 7617.82-0.00936383im 8383.85-0.0249534im
8654.18-0.00254924im 7273.38+0.00649814im … 8011.01+0.0173286im
9219.43-5.89495e-5im 7742.64+0.000149425im 8520.61+0.000394214im
9312.43+0.00746522im 7821.27-0.0190278im 8604.31-0.0507347im
8833.06-0.00138809im 7431.75+0.00353678im 8168.2+0.00942383im
9178.5-0.00501376im 7711.46+0.0127758im 8485.11+0.0340467im
```

(This code creates a random 100x100 matrix, and takes it to the power of 3.5)

This is useful to know when you e.g. declare, simulate, or plot, a Catalyst model. The first time you run a command there might be a slight delay. However, subsequent runs will execute much quicker. This holds even if you do minor adjustments before the second run (such as changing simulation initial conditions).

## Installing and activating packages

Except for some base Julia packages (such as Pkg, the package manager) that are available by default, Julia packages must be installed locally before they can be used. Most packages are registered with Julia, and can be added through the `Pkg.add("DesiredPackage")`

command (where `DesiredPackage`

is the name of the package you wish to install). We can thus install Catalyst:

```
using Pkg
Pkg.add("Catalyst")
```

Here, the command `using Pkg`

is required to activate the Pkg package manager.

Next, we also wish to add the `DifferentialEquations`

and `Plots`

packages (for numeric simulation of models, and plotting, respectively).

```
Pkg.add("DifferentialEquations")
Pkg.add("Plots")
```

Once a package has been installed through the `Pkg.add`

command, this command does not have to be repeated in further Julia sessions on the same machine.

Installing a Julia package is, however, not enough to use it. Before a package's features are used in a Julia session, it has to be loaded through the `using DesiredPackage`

command (where `DesiredPackage`

is the name of the package you wish to activate). This command has to be repeated whenever a Julia session is restarted.

We thus activate our three desired packages:

```
using Catalyst
using DifferentialEquations
using Plots
```

For a more detailed introduction to Julia packages, please read the Pkg documentation.

## Simulating a basic Catalyst model

Now that we have some basic familiarity with Julia, and have installed and activated the required packages, we will create and simulate a basic chemical reaction network model through Catalyst.

Catalyst models are created through the `@reaction_network`

*macro*. For more information on macros, please read the Julia documentation on macros. This documentation is, however, rather advanced (and not required to use Catalyst). We instead recommend that you simply familiarise yourself with the Catalyst syntax, without studying in detail how macros work and what they are.

The `@reaction_network`

command is followed by the `begin`

keyword, which is followed by one line for each *reaction* of the model. Each reaction consists of a *reaction rate*, followed by the reaction itself. The reaction itself contains a set of *substrates* and a set of *products* (what is consumed and produced by the reaction, respectively). These are separated by a `-->`

arrow. Finally, the model ends with the `end`

keyword.

Here, we create a simple *birth-death* model, where a single species (*X*) is created at rate *b*, and degraded at rate *d*. The model is stored in the variable `rn`

.

```
rn = @reaction_network begin
b, 0 --> X
d, X --> 0
end
```

\[ \begin{align*} \varnothing &\xrightleftharpoons[d]{b} \mathrm{X} \end{align*} \]

For more information on how to use the Catalyst model creator (also known as the Catalyst DSL), please read the corresponding documentation.

Next, we wish to simulate our model. To do this, we need to provide some additional information to the simulator. This is

- The initial condition. That is, the concentration or numbers of each species at the start of the simulation.
- The timespan. That is, the timeframe over which we wish to run the simulation.
- The parameter values. That is, the values of the model's parameters for this simulation.

The initial condition is given as a *Vector*. This is a type which collects several different values. To declare a vector, the values are specific within brackets, `[]`

, and separated by `,`

. Since we only have one species, the vector holds a single element. In this element, we set the value of *X* using the `:X => 1.0`

syntax. Here, we first denote the name of the species (with a `:`

pre-appended), next follows a `=>`

and then the value of *X*. Since we wish to simulate the *concentration* of X over time, we will let the initial condition be decimal valued.

`u0 = [:X => 1.0]`

```
1-element Vector{Pair{Symbol, Float64}}:
:X => 1.0
```

The timespan sets the time point at which we start the simulation (typically `0.0`

is used) and the final time point of the simulation. These are combined into a two-valued *Tuple*. Tuples are similar to vectors, but are enclosed by `()`

and not `[]`

. Again, we will let both time points be decimal valued.

`tspan = (0.0, 10.0)`

`(0.0, 10.0)`

Finally, the parameter values are, like the initial conditions, given as a vector. Since we have two parameters (*b* and *d*), the parameter vector has two values. We use a similar notation for setting the parameter values as the initial condition (first the parameter, then an arrow, then the value).

`params = [:b => 1.0, :d => 0.2]`

```
2-element Vector{Pair{Symbol, Float64}}:
:b => 1.0
:d => 0.2
```

Please read here for more information on Vectors and Tuples.

Next, before we can simulate our model, we bundle all the required information together in a so-called `ODEProblem`

. Note that the order in which the input (the model, the initial condition, the timespan, and the parameter values) is provided to the ODEProblem matters. E.g. the parameter values cannot be provided as the first argument, but have to be the fourth argument. Here, we save our `ODEProblem`

in the `oprob`

variable.

`oprob = ODEProblem(rn, u0, tspan, params)`

```
ODEProblem with uType Vector{Float64} and tType Float64. In-place: true
timespan: (0.0, 10.0)
u0: 1-element Vector{Float64}:
1.0
```

We can now simulate our model. We do this by providing the `ODEProblem`

to the `solve`

function. We save the output to the `sol`

variable.

`sol = solve(oprob)`

Finally, we can plot the solution through the `plot`

function.

`plot(sol)`

Here, the plot shows the time evolution of the concentration of the species *X* from its initial condition.

For more information about the numerical simulation package, please see the DifferentialEquations documentation. For more information about the plotting package, please see the Plots documentation.

## Additional modelling example

To make this introduction more comprehensive, we here provide another example, using a more complicated model. In addition, instead of simulating our model as concentrations evolve over time, we will simulate the individual reaction events through the Gillespie algorithm. This is a way to add *noise* to our model.

Remember, unless we have restarted Julia, we do not need to activate our packages (through the `using`

command) again.

This time, we will declare the so-called SIR model for an infectious disease. Note that even if this model does not describe a set of chemical reactions, it can be modelled using the same dynamics. The model consists of 3 species:

*S*, the amount of*susceptible*individuals.*I*, the amount of*infected*individuals.*R*, the amount of*recovered*(or*removed*) individuals.

It also has 2 reaction events:

- Infection, where a susceptible individual meets an infected individual and also becomes infected.
- Recovery, where an infected individual recovers.

Each reaction is also associated with a specific rate (corresponding to a parameter).

*b*, the infection rate.*k*, the recovery rate.

We declare the model using the `@reaction_network`

macro, and store it in the `sir_model`

variable.

```
sir_model = @reaction_network begin
b, S + I --> 2I
k, I --> R
end
```

\[ \begin{align*} \mathrm{S} + \mathrm{I} &\xrightarrow{b} 2 \mathrm{I} \\ \mathrm{I} &\xrightarrow{k} \mathrm{R} \end{align*} \]

Note that the first reaction contains two different substrates (separated by a `+`

sign). While there is only a single product (*I*), two copies of *I* are produced. The *2* in front of the product *I* denotes this.

Next, we declare our initial condition, time span, and parameter values. Since we want to simulate the individual reaction events, that discretely change the state of our model, we want our initial conditions to be integer-valued. We will start with a mostly susceptible population, but where a single individual has been infected through some means.

```
u0 = [:S => 50, :I => 1, :R => 0.0]
tspan = (0.0, 10.0)
params = [:b => 0.2, :k => 1.0]
```

```
2-element Vector{Pair{Symbol, Float64}}:
:b => 0.2
:k => 1.0
```

Previously we have bundled this information into an `ODEProblem`

(denoting a deterministic *ordinary differential equation*). Now we wish to simulate our model as a jump process (where each reaction event denotes a single jump in the state of the system). We do this by first creating a `DiscreteProblem`

, and then using this as an input to a `JumpProblem`

.

```
dprob = DiscreteProblem(sir_model, u0, tspan, params)
jprob = JumpProblem(sir_model, dprob, Direct())
```

```
JumpProblem with problem DiscreteProblem with aggregator JumpProcesses.Direct
Number of jumps with discrete aggregation: 0
Number of jumps with continuous aggregation: 0
Number of mass action jumps: 2
```

Again, the order in which the inputs are given to the `DiscreteProblem`

and the `JumpProblem`

is important. The last argument to the `JumpProblem`

(`Direct()`

) denotes which simulation method we wish to use. For now, we recommend the user simply use the `Direct()`

option, and then consider alternative ones (see the JumpProcesses.jl docs) when they are more familiar with modelling in Catalyst and Julia.

Finally, we can simulate our model using the `solve`

function, and plot the solution using the `plot`

function. Here, the `solve`

function also has a second argument (`SSAStepper()`

). This is a time stepping algorithm that calls the `Direct`

solver to advance a simulation. Again, we recommend at this stage you simply use this option, and then explore exactly what this means at a later stage.

```
sol = solve(jprob, SSAStepper())
plot(sol)
```

**Exercise:** Try simulating the model several times. Note that the epidemic doesn't always take off, but sometimes dies out without spreading through the population. Try changing the infection rate (*b*), determining how this value affects the probability that the epidemic goes through the population.

## Feedback

If you are a new Julia user who has used this tutorial, and there was something you struggled with or would have liked to have explained better, please raise an issue. That way, we can continue improving this tutorial.

## References

- 1Jeff Bezanson, Alan Edelman, Stefan Karpinski, Viral B. Shah,
*Julia: A Fresh Approach to Numerical Computing*, SIAM Review (2017). - 2Torkel E. Loman, Yingbo Ma, Vasily Ilin, Shashi Gowda, Niklas Korsbo, Nikhil Yewale, Chris Rackauckas, Samuel A. Isaacson,
*Catalyst: Fast and flexible modeling of reaction networks*, PLOS Computational Biology (2023).