Example of the use of pmapreduce

The function pmapreduce performs a parallel mapreduce. This is primarily useful when the mapped function performs an expensive calculation, that is, when the evaluation time per core exceeds the setup and communication time. It is also useful when memory is allocated per core and the arrays, taken together, would not fit into the memory available to a single process, as is often the case on a cluster.
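To make the map and reduce steps concrete, here is a minimal serial sketch that uses only Base's mapreduce. The helper name column is hypothetical and exists only for this illustration.

```julia
# Map each index to a short column vector, then reduce by horizontal concatenation.
column(i) = [i, 2i]   # hypothetical map function, for illustration only

M = mapreduce(column, hcat, 1:3)
# M is the 2×3 matrix [1 2 3; 2 4 6]
```

The parallel version follows the same pattern, with the map step spread across workers.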

We walk through an example where we initialize and concatenate arrays in serial and in parallel.

We load the necessary modules first:

using ParallelUtilities
using Distributed
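This walkthrough does not show how the worker processes are started. On a single machine, one common approach (assumed here, not taken from the original script) is addprocs from Distributed. On a cluster, workers are more typically launched through the scheduler or a cluster manager. A minimal sketch:

```julia
using Distributed

# Assumption: start two local workers if none exist yet; the count is arbitrary.
nprocs() == 1 && addprocs(2)

println("Number of workers: ", nworkers())
```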

We define the function that performs the initialization on each core. This step is embarrassingly parallel, as no communication happens between workers. We simulate an expensive calculation by adding a sleep interval for each index.

function initialize(sleeptime)
    A = Array{Int}(undef, 20, 20)
    for ind in eachindex(A)
        sleep(sleeptime)
        A[ind] = ind
    end
    return A
end
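For the workers to call initialize, the definition has to be available on each of them. If the code is entered interactively rather than loaded as a script on all processes, one way to achieve this (an assumption about the setup, not necessarily how the full script does it) is the @everywhere macro:

```julia
using Distributed
nprocs() == 1 && addprocs(2)   # assumption: two local workers, for illustration only

# Define the function on the master process and on every worker.
@everywhere function initialize(sleeptime)
    A = Array{Int}(undef, 20, 20)
    for ind in eachindex(A)
        sleep(sleeptime)
        A[ind] = ind
    end
    return A
end

# Run it once on the first worker to confirm that the definition is available there.
A = remotecall_fetch(initialize, first(workers()), 0)
```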

Next we define the function that calls pmapreduce:

function main_pmapreduce(sleeptime)
    pmapreduce(x -> initialize(sleeptime), hcat, 1:20)
end

We also define a function that carries out a serial mapreduce:

function main_mapreduce(sleeptime)
    mapreduce(x -> initialize(sleeptime), hcat, 1:20)
end
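Both functions map each of the 20 indices to a 20×20 array and concatenate the results horizontally, so the final array has 20 rows and 400 columns. The following sketch checks this with a hypothetical sleep-free stand-in for initialize, so that it runs instantly:

```julia
# Hypothetical sleep-free variant of initialize; it produces the same 20×20 array.
initialize_nosleep() = reshape(collect(1:400), 20, 20)

R = mapreduce(x -> initialize_nosleep(), hcat, 1:20)

size(R)                      # (20, 400)
R[:, 1:20] == R[:, 21:40]    # true, every block is the same array
```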

We compare the performance of the serial and parallel evaluations using 20 cores on one node. We define a caller function first:

function compare_with_serial()
    # precompile
    main_mapreduce(0)
    main_pmapreduce(0)

    # time
    println("Testing serial")
    A = @time main_mapreduce(5e-6)
    println("Testing parallel")
    B = @time main_pmapreduce(5e-6)

    # check results
    println("Results match : ", A == B)
end

We run this caller on the cluster:

julia> compare_with_serial()
Testing serial
  9.457601 seconds (40.14 k allocations: 1.934 MiB)
Testing parallel
  0.894611 seconds (23.16 k allocations: 1.355 MiB, 2.56% compilation time)
Results match : true
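As a rough reading of these timings, the sketch below derives the speedup and the average cost per sleep call from the numbers printed above. The interpretation is an estimate, not a measurement from the original script. Julia's documentation states that the minimum sleep time is 1 millisecond, which would explain why the serial run takes far longer than 8000 × 5e-6 seconds.

```julia
serial_time   = 9.457601        # seconds, from the output above
parallel_time = 0.894611        # seconds, from the output above
ncores        = 20
nsleeps       = 20 * 20 * 20    # 20 map calls, each filling a 20×20 array

speedup    = serial_time / parallel_time   # ≈ 10.6
efficiency = speedup / ncores              # ≈ 0.53
per_sleep  = serial_time / nsleeps         # ≈ 1.2e-3 seconds per sleep call
```

The speedup is well below 20, which is consistent with part of the parallel run being spent on setup and communication instead of the mapped function.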

The full script may be found in the examples directory.