# Samplers

The choice of sampler determines specific algorithms that are used to sample, update, and disable clocks. Helpers also exist that are useful for logging, utilizing common random numbers, and hierarchical sampling.

## Sampler Supertype

`CompetingClocks.SSA`

— Type`SSA{KeyType,TimeType}`

This abstract type represents a stochastic simulation algorithm (SSA). It is parametrized by the clock ID, or key, and the type used for the time, which is typically a Float64. The type of the key can be anything you would use as a dictionary key. This excludes mutable values but includes a wide range of identifiers useful for simulation. For instance, it could be a `String`

, but it could be a `Tuple{Int64,Int64,Int64}`

, so that it indexes into a complicated simulation state.

## Sampler Types

`CompetingClocks.FirstReaction`

— Type`FirstReaction{KeyType,TimeType}()`

This is the classic first reaction method for general distributions. Every time you sample, this goes to each distribution and asks when it would fire. Then it takes the soonest and throws out the rest of the sampled times until the next sample. It can also be very fast when there are only a few clocks to sample.

One interesting property of this sampler is that you can call `next()`

multiple times in order to get a distribution of next firing clocks and their times to fire.

`CompetingClocks.FirstToFire`

— Type`FirstToFire{KeyType,TimeType}()`

This sampler is often the fastest for non-exponential distributions. When a clock is first enabled, this sampler asks the clock when it would fire and saves that time in a sorted heap of future times. Then it works through the heap, one by one. When a clock is disabled, its future firing time is removed from the list. There is no memory of previous firing times.

`CompetingClocks.DirectCall`

— Type`DirectCall{KeyType,TimeType,TreeType}()`

DirectCall is responsible for sampling among Exponential distributions. It samples using the Direct method. In this case, there is no optimization to that Direct method, so we call it DirectCall because it recalculates everything every time you call it.

The algorithm for the Direct Method relies heavily on what data structure it uses to maintain a list of hazard rates, such that it can know the sum of those hazards and index into them using a random value. This struct has a default constructor that chooses a data structure for you, but there are several options.

**Example**

If we know that our simulation will only use a small number of different clock keys, then it would make sense to use a data structure that disables clocks by zeroing them out, instead of removing them from the list. This will greatly reduce memory churn. We can do that by changing the underlying data structure.

```
prefix_tree = BinaryTreePrefixSearch{T}()
keyed_prefix_tree = KeyedKeepPrefixSearch{K,typeof(prefix_tree)}(prefix_tree)
sampler_noremove = DirectCall{K,T,typeof(keyed_prefix_tree)}(keyed_prefix_tree)
```

`CompetingClocks.CombinedNextReaction`

— Type`CombinedNextReaction{KeyType,TimeType}()`

This combines Next Reaction Method and Modified Next Reaction Method. The Next Reaction Method is from Gibson and Bruck in their 2000 paper called "Efficient Exact Stochastic Simulation of Chemical Systems with Many Species and Many Channels". The Modified Next Reaction Method is from David F. Anderson's 2007 paper, "A modified Next Reaction Method for simulating chemical systems with time dependent propensities and delays". Both methods reuse draws of random numbers. The former works by accumulating survival of a distribution in a linear space and the latter works by accumulating survival of a distribution in a log space.

Each enabled clock specifies a univariate distribution from the `Distributions`

package. Every distribution is more precise being sampled in the manner of the Next Reaction method (linear space) or the manner of the Modified Next Reaction method (log space). This sampler chooses which space to use depending on the type of the `UnivariateDistribution`

and based on performance timings that are done during package testing. Defaults are set for those distributions included in the `Distributions.jl`

package. If you want to add a distribution, then define:

`sampling_space(::MyDistribution) = LogSampling`

If you want to override a choice in the library, then create a sub-type of the given distribution, and specify its sampling space.

```
struct LinearGamma <: Distributions.Gamma end
sampling_space(::LinearGamma) = LinearSampling
```

If you want to test a distribution, look at `tests/nrmetric.jl`

to see how distributions are timed.

## Sampling Helpers

`CompetingClocks.CommonRandomRecorder`

— TypeCommon random variates, also called common random numbers (CRN), are a technique for reducing variance when repeating runs of simulations with different parameters. The idea is to record the randomness of one simulation and replay the same choices in subsequent runs. This particular implementation does this by saving the state of the random number generator every time it's used by a sampler.

The Xoshiro sampler has a relatively small state (32 bytes), which is saved every time the sampler uses random numbers. This CRN recorder saves data in memory, but we could save that to a memory-mapped file so that the operating system will optimize transfer of that memory to disk.

What happens when replays of simulation runs use more draws than the first, recorded simulation? Those simulations draw from a fresh random number generator. This is not an exact approach.

**Example**

The goal is to run the simulation with ten different parameter sets and measure how much different parameters change the mean of some quantity determined by the trajectories.

```
using Random: Xoshiro
using CompetingClocks
example_clock = (3, 7) # We will use clock IDs that are a tuple of 2 integers.
sampler = FirstToFire{typeof(example_clock)}()
crn_sampler = CommonRandomRecorder(sampler, typeof(example_clock), Xoshiro)
for trial_idx in 1:100
run_simulation(model, crn_sampler)
reset!(crn_sampler)
end
for param_idx in 1:10
each_model = modify_model!(model, param_idx)
run_simulation(each_model, crn_sampler)
reset!(crn_sampler)
end
```

`CompetingClocks.freeze`

— Function`freeze(cr::CommonRandomRecorder)::FrozenCommonRandomRecorder`

The CommonRandomRecorder records every time it sees a clock request random number generation. It continues to do that every time it runs, which is a problem if you run simulations for comparison on multiple threads. If you want to use CRN and to use multiple threads for subsequent simulation runs, then first run the simulation a bunch of times on one thread. Then freeze the simulation, and then the frozen version will stop remembering new threads.

There is one part of the frozen recorder that will be mutable because it's useful for debugging, the record of missed clocks. Freeze a recorder for each thread, and each thread will track its own misses. They will all work from the same copy of the recorded random number generator states.

`CompetingClocks.misscount`

— Function`misscount(recorder::CommonRandomRecorder)`

The common random recorder watches a simulation and replays the states of the random number generator on subsequent runs. This counts the number of times during the most recent run that a clock event happened that could not be replayed.

`CompetingClocks.misses`

— Function`misses(recorder::CommonRandomRecorder)`

This iterates over pairs of misses in the common random recorder during the most recent simulation run, where the start of a simulation run was marked by calling `reset!`

.

`CompetingClocks.MultiSampler`

— Type`MultiSampler{SamplerKey,Key,Time}(which_sampler::Function)`

This makes a sampler that uses multiple stochastic sampling algorithms (SSA) to determine the next transition to fire. It returns the soonest transition of all of the algorithms. The `which_sampler`

function looks at the clock ID, or key, and chooses which sampler should sample this clock. Add algorithms to this sampler like you would add them to a dictionary.

Once a clock is first enabled, it will always go to the same sampler. This sampler remembers the associations, which could increase memory for simulations with semi-infinite clocks.

**Examples**

Let's make one sampler for exponential distributions, one for a few clocks we know will be fast and one for slower clocks. We can name them with symbols. The trick is that we need to direct each kind of distribution to the correct sampler. Use a Float64 for time and each clock can be identified with an Int64.

```
using CompetingClocks
using Distributions: Exponential, UnivariateDistribution
struct ByDistribution <: SamplerChoice{Int64,Symbol} end
function CompetingClocks.choose_sampler(
chooser::ByDistribution, clock::Int64, distribution::Exponential
)::Symbol
return :direct
end
function CompetingClocks.choose_sampler(
chooser::ByDistribution, clock::Int64, distribution::UnivariateDistribution
)::Symbol
if clock < 100
return :fast
else
return :slow
end
end
sampler = MultiSampler{Symbol,Int64,Float64}(ByDistribution())
sampler[:direct] = OptimizedDirect{Int64,Float64}()
sampler[:fast] = FirstToFire{Int64,Float64}()
sampler[:slow] = FirstToFire{Int64,Float64}()
```

`CompetingClocks.SingleSampler`

— Type`SingleSampler{SSA,Time}(propagator::SSA)`

This makes a sampler from a single stochastic simulation algorithm. It combines the core algorithm with the rest of the state of the system, which is just the time.

`CompetingClocks.ChatReaction`

— TypeThis sampler can help if it's the first time you're trying a model. It checks all of the things and uses Julia's logger to communicate them. It samples using the first reaction algorithm.

`CompetingClocks.DebugWatcher`

— Type`DebugWatcher()`

For debugging, it helps to have visibility into the simulation. This Watcher records everything that is enabled or disabled as a list of all enables and all disabled. It's the complete event history, and you can think of it as the filtration for the process going forward.

```
watcher = DebugWatcher{String}()
# enable and disable some things.
(watcher.enabled[1].clock,
watcher.enabled[1].distribution,
watcher.enabled[1].te,
watcher.enabled[1].when)
```

`CompetingClocks.TrackWatcher`

— Type`TrackWatcher{K,T}()`

This Watcher doesn't sample. It records everything enabled. You can iterate over enabled clocks with a for-loop. If we think of the model as providing changes in which transitions are enabled or disabled, this Watcher accumulates those changes to provide a consistent list of all enabled transitions. Together, a model and this Watcher provide the Semi-Markov core matrix, or the row of it that is currently known.

```
for entry in tracker
entry.clock
entry.distribution
entry.te
entry.when
end
```

`CompetingClocks.consume_survival`

— FunctionThis updates the survival for a transition in the linear space, according to Gibson and Bruck. Transition was enabled between time record t*0 and t*n. Divide the survival by the conditional survival between t*0 and t*n. t*e can be before t*0, at t*0, between t*0 and t*n, or at t*n, or after t_n.

$u=\exp\left(-\int_{t_e}^{t_n}\lambda_0(s-t_e)ds\right)\exp\left(-\int_{t_n}^{\tau'}\lambda_{n}(s-t_e)ds\right)$

This updates the survival for a transition in log space, according to Anderson's method.

$\ln u=-\int_{t_e}^{t_n}\lambda_0(s-t_e)ds - \int_{t_n}^{\tau'}\lambda_{n}(s-t_e)ds$

`CompetingClocks.sampling_space`

— FunctionThis function decides whether a particular distribution can be sampled faster and more accurately using its cumulative distribution function or using the log of its cumulative distribution function, also called the integrated hazard. The former is used for the Next Reaction method by Gibson and Bruck. The latter is used by the Modified Next Reaction method of Anderson. We are calling the first a linear space and the second a logarithmic space.