Parallel Simulation

You can parallelize simulations in two ways:

  1. You can run independent parallel simulations with the @threads macro [1] or with the Distributed standard library.
  2. You can execute a single simulation on multiple threads to speed it up.
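The first approach can be sketched with plain Base threading; the model below is a toy placeholder (a sum of random draws), not an actual DiscreteEvents simulation:

```julia
using Random

# Toy replication — a hypothetical stand-in for a full simulation model.
function simulate(seed::Int)
    rng = Xoshiro(seed)               # give each replication its own RNG
    sum(rand(rng) for _ in 1:10_000)  # accumulate 10_000 random draws
end

results = Vector{Float64}(undef, 8)
Threads.@threads for i in 1:8         # run the 8 replications in parallel
    results[i] = simulate(i)
end
```

Because every replication owns its seed and RNG, the results are reproducible regardless of how the iterations are scheduled across threads.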

Multi-Threading (Experimental)

Multithreading of simulations was introduced in DiscreteEvents v0.3 and will take some time to become stable. Please try it out and report any problems!

If we compute events of a DES on parallel cores of a computer, we may reverse a sequence $\;e_i, e_j\;$ to $\;e_j, e_i\;$. If those events are causally related, we have a problem. Therefore we cannot spawn arbitrary events to parallel cores without altering causality and the simulated outcome.

Fortunately, not all events in larger DES are strongly coupled. For most practical purposes we can divide a system into subsystems whose events depend on each other but depend only weakly or statistically on events in other subsystems. Each subsystem has its own local time, and the subsystem clocks are synchronized periodically.

Thread-local Clocks

With PClock we introduce parallel local clocks, one on each thread. The master clock on thread 1 synchronizes with its parallel clocks at a chosen time interval $\;Δt\;$. Synchronization takes some time, and the slowest thread with the biggest workload (usually thread 1) sets the pace for the whole computation.

When using the keywords cid or spawn with event!, periodic! and process!, we can work with parallel clocks. Then

  • events and processes get registered to parallel clocks,
  • processes get started on parallel threads and
  • their functions receive the thread-local clock to delay! or wait! on.
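The steps above can be sketched as follows. This is a minimal sketch, not a definitive example from the library docs: the process body (customer) and the channel wiring are illustrative, and it assumes the PClock, Prc, process!, delay! and run! API of DiscreteEvents ≥ v0.3:

```julia
using DiscreteEvents

clk = PClock(0.1)            # master clock plus thread-local clocks, Δt = 0.1

# Hypothetical process function; c is the thread-local clock it receives.
function customer(c::Clock, input::Channel, output::Channel)
    job = take!(input)       # wait for input from another subsystem
    delay!(c, rand())        # delay on the thread-local clock
    put!(output, job)
end

input  = Channel{Int}(10)
output = Channel{Int}(10)

# cid=2 registers the process to the parallel clock on thread 2,
# so it gets started on that thread.
process!(clk, Prc(1, customer, input, output), cid=2)
put!(input, 1)
run!(clk, 5)                 # run all clocks for 5 time units
```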

Users should avoid sharing global variables between threads in order to prevent race conditions. If thread-local subsystems receive inputs from each other, they should communicate over Julia channels, which are thread-safe.
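Channel-based communication between two threads can be sketched with Base alone (the producer/consumer pair here is illustrative):

```julia
# Thread-safe communication between two subsystems over a Channel.
ch = Channel{Int}(32)

producer = Threads.@spawn begin
    for i in 1:5
        put!(ch, i^2)        # send outputs to the other subsystem
    end
    close(ch)                # signal that no more values will come
end

consumer = Threads.@spawn sum(collect(ch))  # drain until the channel closes

wait(producer)
total = fetch(consumer)      # 1 + 4 + 9 + 16 + 25 = 55
```

Because put! and take! lock the channel internally, no further synchronization is needed between the two tasks.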

When working on parallel threads, we have thread-local random number generators. Random number sequences therefore are not identical between single-threaded and multithreaded applications (see below). This usually causes simulation results to differ as well.

Speedup and Balance

First results show a considerable speedup of multithreaded simulations vs single-threaded ones:

  1. The Multithreaded Assembly Line on DiscreteEventsCompanion took 1.58 s to run 63452 events on 8 parallel cores vs. 70287 events in 10.77 s on thread 1 of the same machine [2].
  2. If we put the simulated assembly operations only on threads 2-8, it took only 0.67 s.
  3. With all assembly operations together on thread 2, it took 4.96 s.

The second and third results show that considerable speedups can be realized by relieving thread 1 and distributing the workload among the other threads.

Random Numbers

To get reproducible but different random number sequences on each thread, you can seed the thread-specific global default random number generators with pseed!. It will seed! each thread-specific default RNG with the given number n multiplied by the thread id: $n\times t_i$. Calls to rand() are thread-specific and will then use the seeded default RNGs.

At the time of writing, all implicit calls to rand() in timed event!s or in delay! use the default RNGs.
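The seeding scheme described above can be illustrated with plain Random; pseed_sketch is a hypothetical helper, not part of DiscreteEvents, and it returns explicit RNG objects instead of seeding the thread-local defaults:

```julia
using Random

# Illustrative stand-in for pseed!: one RNG per thread,
# seeded with n × t_i as described above.
function pseed_sketch(n::Int)
    [Xoshiro(n * t) for t in 1:Threads.nthreads()]
end

rngs = pseed_sketch(123)
x = rand(rngs[1])            # reproducible: reseeding yields the same draw
```

Seeding with n × t_i makes each thread's sequence reproducible across runs while keeping the sequences distinct between threads.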

Alternatively you can seed a thread-specific default RNG with:

using DiscreteEvents, Random

onthread(2) do  # seed the default RNG on thread 2
    Random.seed!(123)
end

Documentation and Examples

You can find more documentation and examples on DiscreteEventsCompanion.

  • [1] Goldratt's Dice Game on DiscreteEventsCompanion illustrates how to do this.
  • [2] Event count and line throughput differ between the multi- and single-threaded runs because the random number sequence changes between these examples.