Samplers

The choice of sampler determines specific algorithms that are used to sample, update, and disable clocks. Helpers also exist that are useful for logging, utilizing common random numbers, and hierarchical sampling.

Sampler Supertype

CompetingClocks.SSA — Type

SSA{KeyType,TimeType}

This abstract type represents a stochastic simulation algorithm (SSA). It is parametrized by the clock ID, or key, and the type used for the time, which is typically a Float64. The type of the key can be anything you would use as a dictionary key. This excludes mutable values but includes a wide range of identifiers useful for simulation. For instance, it could be a String, but it could be a Tuple{Int64,Int64,Int64}, so that it indexes into a complicated simulation state.

Sampler Types

CompetingClocks.FirstReaction — Type

FirstReaction{KeyType,TimeType}()

This is the classic first reaction method for general distributions. Every time you sample, this goes to each distribution and asks when it would fire. Then it takes the soonest and throws out the rest of the sampled times until the next sample. It can also be very fast when there are only a few clocks to sample.

One interesting property of this sampler is that you can call next() multiple times in order to get a distribution of next firing clocks and their times to fire.

CompetingClocks.FirstToFire — Type

FirstToFire{KeyType,TimeType}()

This sampler is often the fastest for non-exponential distributions. When a clock is first enabled, this sampler asks the clock when it would fire and saves that time in a sorted heap of future times. Then it works through the heap, one by one. When a clock is disabled, its future firing time is removed from the list. There is no memory of previous firing times.

CompetingClocks.DirectCall — Type

DirectCall{KeyType,TimeType,TreeType}()

DirectCall is responsible for sampling among Exponential distributions. It samples using the Direct method. In this case, there is no optimization to that Direct method, so we call it DirectCall because it recalculates everything every time you call it.

The algorithm for the Direct Method relies heavily on what data structure it uses to maintain a list of hazard rates, such that it can know the sum of those hazards and index into them using a random value. This struct has a default constructor that chooses a data structure for you, but there are several options.

Example

If we know that our simulation will only use a small number of different clock keys, then it would make sense to use a data structure that disables clocks by zeroing them out, instead of removing them from the list. This will greatly reduce memory churn. We can do that by changing the underlying data structure.

prefix_tree = BinaryTreePrefixSearch{T}()
keyed_prefix_tree = KeyedKeepPrefixSearch{K,typeof(prefix_tree)}(prefix_tree)
sampler_noremove = DirectCall{K,T,typeof(keyed_prefix_tree)}(keyed_prefix_tree)

CompetingClocks.CombinedNextReaction — Type

CombinedNextReaction{KeyType,TimeType}()

This combines Next Reaction Method and Modified Next Reaction Method. The Next Reaction Method is from Gibson and Bruck in their 2000 paper called "Efficient Exact Stochastic Simulation of Chemical Systems with Many Species and Many Channels". The Modified Next Reaction Method is from David F. Anderson's 2007 paper, "A modified Next Reaction Method for simulating chemical systems with time dependent propensities and delays". Both methods reuse draws of random numbers. The former works by accumulating survival of a distribution in a linear space and the latter works by accumulating survival of a distribution in a log space.

Each enabled clock specifies a univariate distribution from the Distributions package. Every distribution is more precise being sampled in the manner of the Next Reaction method (linear space) or the manner of the Modified Next Reaction method (log space). This sampler chooses which space to use depending on the type of the UnivariateDistribution and based on performance timings that are done during package testing. Defaults are set for those distributions included in the Distributions.jl package. If you want to add a distribution, then define:

sampling_space(::MyDistribution) = LogSampling

If you want to override a choice in the library, then create a sub-type of the given distribution, and specify its sampling space.

struct LinearGamma <: Distributions.Gamma end
sampling_space(::LinearGamma) = LinearSampling

If you want to test a distribution, look at tests/nrmetric.jl to see how distributions are timed.

Sampling Helpers

CompetingClocks.CommonRandomRecorder — Type

Common random variates, also called common random numbers (CRN), are a technique for reducing variance when repeating runs of simulations with different parameters. The idea is to record the randomness of one simulation and replay the same choices in subsequent runs. This particular implementation does this by saving the state of the random number generator every time it's used by a sampler.

The Xoshiro sampler has a relatively small state (32 bytes), which is saved every time the sampler uses random numbers. This CRN recorder saves data in memory, but we could save that to a memory-mapped file so that the operating system will optimize transfer of that memory to disk.

What happens when replays of simulation runs use more draws than the first, recorded simulation? Those simulations draw from a fresh random number generator. This is not an exact approach.

Example

The goal is to run the simulation with ten different parameter sets and measure how much different parameters change the mean of some quantity determined by the trajectories.

using Random: Xoshiro
using CompetingClocks
example_clock = (3, 7)  # We will use clock IDs that are a tuple of 2 integers.
sampler = FirstToFire{typeof(example_clock)}()
crn_sampler = CommonRandomRecorder(sampler, typeof(example_clock), Xoshiro)
for trial_idx in 1:100
    run_simulation(model, crn_sampler)
    reset!(crn_sampler)
end
for param_idx in 1:10
    each_model = modify_model!(model, param_idx)
    run_simulation(each_model, crn_sampler)
    reset!(crn_sampler)
end

CompetingClocks.freeze — Function

freeze(cr::CommonRandomRecorder)::FrozenCommonRandomRecorder

The CommonRandomRecorder records every time it sees a clock request random number generation. It continues to do that every time it runs, which is a problem if you run simulations for comparison on multiple threads. If you want to use CRN and to use multiple threads for subsequent simulation runs, then first run the simulation a bunch of times on one thread. Then freeze the simulation, and then the frozen version will stop remembering new threads.

There is one part of the frozen recorder that will be mutable because it's useful for debugging, the record of missed clocks. Freeze a recorder for each thread, and each thread will track its own misses. They will all work from the same copy of the recorded random number generator states.

CompetingClocks.misscount — Function

misscount(recorder::CommonRandomRecorder)

The common random recorder watches a simulation and replays the states of the random number generator on subsequent runs. This counts the number of times during the most recent run that a clock event happened that could not be replayed.

CompetingClocks.misses — Function

misses(recorder::CommonRandomRecorder)

This iterates over pairs of misses in the common random recorder during the most recent simulation run, where the start of a simulation run was marked by calling reset!.

CompetingClocks.MultiSampler — Type

MultiSampler{SamplerKey,Key,Time}(which_sampler::Function)

This makes a sampler that uses multiple stochastic sampling algorithms (SSA) to determine the next transition to fire. It returns the soonest transition of all of the algorithms. The which_sampler function looks at the clock ID, or key, and chooses which sampler should sample this clock. Add algorithms to this sampler like you would add them to a dictionary.

Once a clock is first enabled, it will always go to the same sampler. This sampler remembers the associations, which could increase memory for simulations with semi-infinite clocks.

Examples

Let's make one sampler for exponential distributions, one for a few clocks we know will be fast and one for slower clocks. We can name them with symbols. The trick is that we need to direct each kind of distribution to the correct sampler. Use a Float64 for time and each clock can be identified with an Int64.

using CompetingClocks
using Distributions: Exponential, UnivariateDistribution

struct ByDistribution <: SamplerChoice{Int64,Symbol} end

function CompetingClocks.choose_sampler(
    chooser::ByDistribution, clock::Int64, distribution::Exponential
    )::Symbol
    return :direct
end
function CompetingClocks.choose_sampler(
    chooser::ByDistribution, clock::Int64, distribution::UnivariateDistribution
    )::Symbol
    if clock < 100
        return :fast
    else
        return :slow
    end
end
sampler = MultiSampler{Symbol,Int64,Float64}(ByDistribution())
sampler[:direct] = OptimizedDirect{Int64,Float64}()
sampler[:fast] = FirstToFire{Int64,Float64}()
sampler[:slow] = FirstToFire{Int64,Float64}()

CompetingClocks.SingleSampler — Type

SingleSampler{SSA,Time}(propagator::SSA)

This makes a sampler from a single stochastic simulation algorithm. It combines the core algorithm with the rest of the state of the system, which is just the time.

CompetingClocks.ChatReaction — Type

This sampler can help if it's the first time you're trying a model. It checks all of the things and uses Julia's logger to communicate them. It samples using the first reaction algorithm.

CompetingClocks.DebugWatcher — Type

DebugWatcher()

For debugging, it helps to have visibility into the simulation. This Watcher records everything that is enabled or disabled as a list of all enables and all disabled. It's the complete event history, and you can think of it as the filtration for the process going forward.

watcher = DebugWatcher{String}()
# enable and disable some things.
(watcher.enabled[1].clock,
watcher.enabled[1].distribution,
watcher.enabled[1].te,
watcher.enabled[1].when)

CompetingClocks.TrackWatcher — Type

TrackWatcher{K,T}()

This Watcher doesn't sample. It records everything enabled. You can iterate over enabled clocks with a for-loop. If we think of the model as providing changes in which transitions are enabled or disabled, this Watcher accumulates those changes to provide a consistent list of all enabled transitions. Together, a model and this Watcher provide the Semi-Markov core matrix, or the row of it that is currently known.

for entry in tracker
    entry.clock
    entry.distribution
    entry.te
    entry.when
end

CompetingClocks.consume_survival — Function

This updates the survival for a transition in the linear space, according to Gibson and Bruck. Transition was enabled between time record t0 and tn. Divide the survival by the conditional survival between t0 and tn. te can be before t0, at t0, between t0 and tn, or at tn, or after t_n.

$u=\exp\left(-\int_{t_e}^{t_n}\lambda_0(s-t_e)ds\right)\exp\left(-\int_{t_n}^{\tau'}\lambda_{n}(s-t_e)ds\right)$

This updates the survival for a transition in log space, according to Anderson's method.

$\ln u=-\int_{t_e}^{t_n}\lambda_0(s-t_e)ds - \int_{t_n}^{\tau'}\lambda_{n}(s-t_e)ds$

CompetingClocks.sampling_space — Function

This function decides whether a particular distribution can be sampled faster and more accurately using its cumulative distribution function or using the log of its cumulative distribution function, also called the integrated hazard. The former is used for the Next Reaction method by Gibson and Bruck. The latter is used by the Modified Next Reaction method of Anderson. We are calling the first a linear space and the second a logarithmic space.