HiQGA Documentation

This is a general purpose package for spatial statistical inference, geophysical forward modeling, Bayesian inference and inversion (both deterministic and probabilistic).

Readily usable geophysical forward operators cover AEM, CSEM and MT physics (references below); of these, the time-domain AEM codes are fairly production-ready. The current EM modeling is 1D, but the inversion framework is dimensionally agnostic (e.g., you can regress images). Adding your own geophysical operators is straightforward; keep reading below.

This package implements both the nested (2-layer) and vanilla trans-dimensional Gaussian process algorithms as published in the references below.

Installation

NCI users look here first!

To install, use Julia's Pkg REPL by hitting ] to enter pkg> mode. Then enter the following at the pkg> prompt:

pkg> add https://github.com/GeoscienceAustralia/HiQGA.jl.git

Usage

Examples of how to use the package can be found in the examples directory. Simply cd to the relevant example directory and include the .jl files in the order they are named. If using VSCode, make sure to do Julia: Change to this Directory from the three dots menu on the top right. The Markov Chain Monte Carlo sampler is configured to support parallel tempering on multiple CPUs - some of the examples accomplish this with Julia's built-in multiprocessing, and others use MPI to support inversions on HPC clusters that don't work with Julia's default SSH-based multiprocessing. The MPI examples require MPI.jl and MPIClusterManagers.jl, which are not installed as dependencies for this package, so you will need to ensure they are installed and configured correctly to run these examples. See here for MPI on the NCI.
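As a sketch, a run with Julia's built-in multiprocessing looks something like the following; the script names are hypothetical, so use the numbered .jl files in whichever example directory you are actually in:

julia> using Distributed
julia> addprocs(4)                     # e.g., one worker per parallel tempering chain
julia> @everywhere using HiQGA         # load the package on all workers
julia> include("01_make_options.jl")   # hypothetical file name
julia> include("02_run_inversion.jl")  # hypothetical file name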

Some example scripts depend on Revise.jl, as we're still actively developing this package, so you may need to install Revise if it isn't already installed. All Julia users should be developing with Revise! After installation, to run the examples, simply clone the package separately (or download it as a ZIP), navigate to the examples folder and run the scripts in their numerical order.
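If you do need it, Revise installs like any other package:

pkg> add Revise

Load it with using Revise before including any code you intend to edit.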

Updating the package

pkg> update HiQGA

Developing HiQGA or modifying it for your own special forward physics

After you have ]added HiQGA you can simply do:

pkg> dev HiQGA

If you haven't added it already, you can do:

pkg> dev https://github.com/GeoscienceAustralia/HiQGA.jl.git

It will download to your JULIA_PKG_DEVDIR.

Another way is to simply clone or download this repository to your JULIA_PKG_DEVDIR, rename the cloned directory to HiQGA (removing the .jl extension) and do

pkg> dev HiQGA
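Concretely, assuming the default JULIA_PKG_DEVDIR of ~/.julia/dev, the clone-and-rename route is:

cd ~/.julia/dev
git clone https://github.com/GeoscienceAustralia/HiQGA.jl.git HiQGA

followed by pkg> dev HiQGA as above.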

Here's a gist on adding your own module if you want to modify the source code. Alternatively, if you only want to use the sampling methods in HiQGA.transD_GP without contributing to the source (boo! j/k), here's another gist which is more appropriate. These gists were originally written for a package called transD_GP, so you will have to modify using transD_GP to using HiQGA.transD_GP. Documentation is important and we're working on improving it before a full release.
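As a flavor of what the gists cover, a custom forward operator boils down to a type that holds your data plus a misfit method the sampler can call. This is only a hedged sketch: the names Operator1D, Model, Options, get_misfit and the my_forward helper are assumptions for illustration; follow the gists for the authoritative pattern.

using HiQGA.transD_GP

# hypothetical operator holding observed data and noise estimates
mutable struct MyOperator <: transD_GP.Operator1D
    d::Vector{Float64}   # observed data
    σ::Vector{Float64}   # data noise standard deviations
end

# negative log-likelihood (up to a constant) for a proposed model m
function transD_GP.get_misfit(m::transD_GP.Model, opt::transD_GP.Options, op::MyOperator)
    response = my_forward(m.fstar)    # your forward physics here (my_forward is hypothetical)
    r = (op.d - response) ./ op.σ
    return 0.5*sum(abs2, r)
end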

Development setup on NCI

You will need a Julia depot, which is where all packages are downloaded and the package registry resides. While it may not be large in size, it can consume a lot of your inode (file count) quota. The easiest thing to do is set up a directory like this:

mkdir /g/data/myprojectwithlotsofinodes/myusername/juliadepot

and then point a symlink to it from ***BOTH*** OOD and gadi, making sure you first remove any existing .julia in your $HOME with rm -rf .julia:

cd
ln -s /g/data/myprojectwithlotsofinodes/myusername/juliadepot .julia
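You can sanity-check the symlink on both systems with

ls -ld ~/.julia

which should show it pointing at the gdata directory above.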

If you don't already have access to a julia binary, download the .tar.gz for the appropriate version from here and then untar it in a location you have write access to. Then, in your $HOME/bin directory on BOTH OOD and gadi, make a symlink to the julia binary like so:

cd ~/bin
ln -s /g/data/somewhere/julia-x.x.x/bin/julia .
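Assuming $HOME/bin is on your PATH, a fresh shell should now find Julia:

julia --version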

The preferred development and usage environment for HiQGA is Visual Studio Code, which provides interactive execution of Julia code through the VSCode Julia extension. To install VSCode on the National Computational Infrastructure (NCI), you need to extract the VSCode rpm package, using the steps in this gist, to a location where your account has write access. You will NOT be using VSCode on a gadi login node, but on OOD.

Get Julia language support in VSCode after launching the VSCode binary by going to File->Extensions and searching for Julia. If after installation it doesn't find the Julia binary, go to File->Extensions->Julia->Manage (the little gear icon) and manually type /home/yourusername/bin/julia into the "Executable Path" field.

It is also useful to use Revise.jl to ensure changes to the package are immediately reflected in a running Julia REPL (this is the reason that Revise is a dependency on some example scripts as noted above). More information on a workflow to use Revise during development can be found here.
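With HiQGA dev'd as described above, the Revise workflow is simply to load Revise first in the REPL:

julia> using Revise
julia> using HiQGA

after which edits to the source under your JULIA_PKG_DEVDIR take effect without restarting Julia.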

In your MPI job, make sure that you include in your qsub script the gdata directory in which you have your julia executable and depot, e.g.,

#PBS -l storage=gdata/z67+gdata/kb5
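A minimal qsub script might then look like the following sketch; the queue, resource requests and storage codes are placeholders to adapt to your own project:

#!/bin/bash
#PBS -q normal
#PBS -l ncpus=48
#PBS -l mem=192GB
#PBS -l walltime=2:00:00
#PBS -l storage=gdata/z67+gdata/kb5
#PBS -l wd

module load intel-mpi/2019.8.254
mpirun -np 48 julia your_inversion_script.jl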

Installing MPI.jl and MPIClusterManagers.jl on NCI

We have found that the safest bet for MPI.jl to work without UCX issues on NCI is to use intel-mpi. A few build preferences need to be set before building MPI.jl and MPIClusterManagers.jl. Go to your Julia depot (it should be softlinked as ~/.julia) and edit ~/.julia/prefs/MPI.toml to contain the following lines:

path = "/apps/intel-mpi/2019.8.254/intel64/"
library = "/apps/intel-mpi/2019.8.254/intel64/lib/release/libmpi.so"
binary = "system"

Now ensure you do a

module load intel-mpi/2019.8.254

before running Julia and doing

pkg> add MPI, MPIClusterManagers, Distributed

Just to be safe, ensure that MPI has indeed been built with the version you have specified above:

julia> using Pkg; Pkg.build("MPI", verbose=true)

and you should see linking information to intel-mpi 2019.8.254. To test, use an interactive NCI job with the following submission:

qsub -I -lwalltime=1:00:00,mem=16GB,ncpus=4,storage=gdata/z67+gdata/cr78
.
.
.
job is ready

Now create a file called mpitest.jl, on some mount you have access to, with the following lines:

## MPI Init
using MPIClusterManagers, Distributed
import MPI
MPI.Init()
rank = MPI.Comm_rank(MPI.COMM_WORLD)   # this process's MPI rank
sz = MPI.Comm_size(MPI.COMM_WORLD)     # total number of MPI ranks
if rank == 0
    @info "size is $sz"
end
# hand the MPI ranks over to a Julia ClusterManager: rank 0 becomes the
# manager (Julia process 1), the remaining ranks become Distributed workers
manager = MPIClusterManagers.start_main_loop(MPI_TRANSPORT_ALL)
@info "there are $(nworkers()) workers"
@everywhere @info gethostname()        # runs on the manager and every worker
@show nworkers()
# shut down the workers and finalize
MPIClusterManagers.stop_main_loop(manager)
rmprocs(workers())
exit()

Run the code after loading the intel-mpi module you have linked MPI.jl against with

module load intel-mpi/2019.8.254
mpirun -np 3 julia mpitest.jl

and you should see output like:

[ Info: size is 3
[ Info: there are 2 workers
[ Info: hostname1.blah
[ Info: hostname2.blah
[ Info: hostname3.blah
nworkers() = 2

This is the basic recipe for all the cluster HiQGA jobs on NCI. After the call to manager = MPIClusterManagers.start_main_loop(MPI_TRANSPORT_ALL), standard MPI execution stops, and we switch to an explicit manager-worker mode, with code execution continuing only on the manager, which is Julia process 1.
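In other words, once start_main_loop returns on the manager, the workers are driven with ordinary Distributed primitives. A toy illustration (not HiQGA-specific), with MPIClusterManagers and Distributed loaded as in mpitest.jl:

manager = MPIClusterManagers.start_main_loop(MPI_TRANSPORT_ALL)
results = pmap(i -> i^2, 1:10)   # issued from the manager, executed on the workers
MPIClusterManagers.stop_main_loop(manager)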

Installing PyPlot on NCI

Due to inode restrictions on NCI, we've resorted to using a communal matplotlib install as follows:

pkg> rm Conda
pkg> rm PyCall
pkg> rm PyPlot
pkg> rm HiQGA

Then exit Julia, and in the shell (from within your Julia depot, i.e. ~/.julia):

rm -rf conda/
module load python3/3.8.5
source /g/data/z67/matplotlib-venv/bin/activate
PYTHON=/g/data/z67/matplotlib-venv/bin/python julia

Install and build PyCall:

pkg> add PyCall
pkg> build PyCall
julia> exit()

After exiting, restart Julia and in Pkg mode:

pkg> add PyPlot
pkg> dev HiQGA
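A quick smoke test that PyPlot has picked up the communal matplotlib install:

julia> using PyPlot
julia> plot(rand(10))   # should render a line plot without build errors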

References for AEM and CSEM physics