Guide to integrating FluxMPI into your code
There are essentially 6 main steps to remember:
1. Initialize FluxMPI with FluxMPI.Init().
2. Sync the model parameters and states with FluxMPI.synchronize!(ps; root_rank). (Remember to use FluxMPIFluxModel for Flux models.)
3. Use DistributedDataContainer to distribute your data evenly across the processes. (Alternatively, partition your data manually.)
4. Wrap the optimizer in DistributedOptimizer, or call allreduce_gradients(gs::NamedTuple) before every Optimisers.update.
5. Sync the optimizer state across the processes with FluxMPI.synchronize!(opt_state; root_rank).
6. Change logging code to check for local_rank() == 0.
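The steps above can be sketched in a single script. This is a minimal illustration, not a reference implementation: the toy linear model, the synthetic dataset, the Adam learning rate, and the epoch count are all made up for the example. It assumes that DistributedDataContainer supports length and integer indexing, and that DistributedOptimizer sums (rather than averages) gradients across processes, which is why the loss is scaled by 1 / total_workers().

```julia
using FluxMPI, Optimisers, Random, Zygote

function main()
    # Step 1: initialize FluxMPI.
    FluxMPI.Init()

    # Hypothetical toy dataset, identical on every process: pairs (x, 2x + 1).
    data_rng = Random.MersenneTwister(0)
    data = [(x, 2x + 1) for x in rand(data_rng, Float32, 64)]

    # Step 3: each process keeps only its own shard of the data.
    # Assumption: the container supports `length` and integer indexing.
    ddc = FluxMPI.DistributedDataContainer(data)
    local_obs = [ddc[i] for i in 1:length(ddc)]
    xs = [first(o) for o in local_obs]
    ys = [last(o) for o in local_obs]

    # Parameters are deliberately seeded differently on each rank ...
    rng = Random.MersenneTwister(FluxMPI.local_rank() + 1)
    ps = (w=randn(rng, Float32, 1), b=zeros(Float32, 1))

    # Step 2: ... so that the sync from rank 0 is visible.
    ps = FluxMPI.synchronize!(ps; root_rank=0)

    # Step 4: wrap the optimizer so gradients are reduced across processes.
    opt = FluxMPI.DistributedOptimizer(Optimisers.Adam(0.05f0))
    opt_state = Optimisers.setup(opt, ps)

    # Step 5: sync the optimizer state as well.
    opt_state = FluxMPI.synchronize!(opt_state; root_rank=0)

    # Mean squared error on the local shard, scaled by the number of workers
    # (assumption: the distributed optimizer sums gradients across ranks).
    nworkers = FluxMPI.total_workers()
    loss(p) = sum(abs2, p.w .* xs .+ p.b .- ys) / (length(xs) * nworkers)

    for epoch in 1:200
        l, back = Zygote.pullback(loss, ps)
        gs = back(one(l))[1]
        opt_state, ps = Optimisers.update(opt_state, ps, gs)

        # Step 6: only rank 0 writes log output.
        if FluxMPI.local_rank() == 0 && epoch % 50 == 0
            println("epoch $epoch: scaled local loss $l")
        end
    end

    return ps
end

ps_final = main()
```

Each process computes one gradient over its whole shard per epoch, so every process calls Optimisers.update the same number of times. That matters because the gradient reduction is a collective operation: if one process performed fewer updates than the others, the remaining processes would wait indefinitely.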
Finally, start the code using mpiexecjl -n <np> julia --project=. <filename>.jl