
Julia clusters

addprocs(template, ninstances[; kwargs...])

Add Azure scale set instances where template is either a dictionary produced via the AzManagers.build_sstemplate method or a string corresponding to a template stored in ~/.azmanagers/templates_scaleset.json.

keyword arguments:

  • subscriptionid=template["subscriptionid"] if exists, or AzManagers._manifest["subscriptionid"] otherwise.
  • resourcegroup=template["resourcegroup"] if exists, or AzManagers._manifest["resourcegroup"] otherwise.
  • sigimagename="" The name of the SIG image[1].
  • sigimageversion="" The version of the sigimagename[1].
  • imagename="" The name of the image (alternative to sigimagename and sigimageversion used for development work).
  • osdisksize=60 The size of the OS disk in GB.
  • customenv=false If true, then send the current project environment to the workers where it will be instantiated.
  • session=AzSession(;lazy=true) The Azure session used for authentication.
  • group="cbox" The name of the Azure scale set. If the scale set does not yet exist, it will be created.
  • overprovision=true Use Azure scale-set overprovisioning?
  • ppi=1 The number of Julia processes to start per Azure scale set instance.
  • julia_num_threads="$(Threads.nthreads()),$(Threads.nthreads(:interactive))" set the number of julia threads for the detached process.[2]
  • omp_num_threads=get(ENV, "OMP_NUM_THREADS", 1) set the number of OpenMP threads to run on each worker
  • exeflags="" set additional command line start-up flags for Julia workers. For example, --heap-size-hint=1G.
  • env=Dict() each dictionary entry is an environment variable set on the worker before Julia starts. e.g. env=Dict("OMP_PROC_BIND"=>"close")
  • nretry=20 Number of retries for HTTP REST calls to Azure services.
  • verbose=0 verbose flag used in HTTP requests.
  • save_cloud_init_failures=false set to true to copy cloud init logs (/var/log/cloud-init-output.log) from workers that fail to join the cluster.
  • show_quota=false after various operations, show the "x-ms-rate-remaining-resource" response header. Useful for debugging/understanding Azure quotas.
  • user=AzManagers._manifest["ssh_user"] ssh user.
  • spot=false use Azure SPOT VMs for the scale-set
  • maxprice=-1 set maximum price per hour for a VM in the scale-set. -1 uses the market price.
  • spot_base_regular_priority_count=0 If spot is true, only start adding spot machines once there are this many non-spot machines added.
  • spot_regular_percentage_above_base If spot is true, then when adding new machines (above spot_base_regular_priority_count) use regular (non-spot) priority for this percent of new machines.
  • waitfor=false wait for the cluster to be provisioned before returning, or return control to the caller immediately[3]
  • mpi_ranks_per_worker=0 set the number of MPI ranks per Julia worker[4]
  • mpi_flags="-bind-to core:$(ENV["OMP_NUM_THREADS"]) -map-by numa" extra flags to pass to mpirun (has effect when mpi_ranks_per_worker>0)
  • nvidia_enable_ecc=true on NVIDIA machines, ensure that ECC is set to true or false for all GPUs[5]
  • nvidia_enable_mig=false on NVIDIA machines, ensure that MIG is set to true or false for all GPUs[5]
  • hyperthreading=nothing Turn on/off hyperthreading on supported machine sizes. The default uses the setting in the template. To override the template setting, use true (on) or false (off).


[1] If addprocs is called from an Azure VM, then the default imagename, imageversion are the image/version the VM was built with; otherwise, it is the latest version of the image specified in the scale-set template.
[2] Interactive threads are supported beginning in version 1.9 of Julia. For earlier versions, the default for julia_num_threads is Threads.nthreads().
[3] waitfor=false reflects the fact that the cluster manager is dynamic. After the call to addprocs returns, use workers() to monitor the size of the cluster.
[4] This is intended for use with Devito. In particular, it allows Devito to gain performance by using MPI to do domain decomposition within a single VM. If mpi_ranks_per_worker=0, then MPI is not used on the Julia workers.
[5] This may result in a reboot of the VMs.
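
As an illustration of note [2], the following is a minimal sketch of how a value for julia_num_threads could be built so that it works on Julia versions both before and after 1.9. The helper thread_spec is hypothetical and is not part of AzManagers; it only mirrors the default described above and is not necessarily how the package computes it.

```julia
# Hypothetical helper (not part of AzManagers): build a string suitable for the
# julia_num_threads keyword from the thread counts of the current session.
function thread_spec()
    if VERSION >= v"1.9"
        # "default,interactive" thread counts, matching the documented default
        return "$(Threads.nthreads()),$(Threads.nthreads(:interactive))"
    else
        # the interactive thread pool does not exist before Julia 1.9
        return "$(Threads.nthreads())"
    end
end

spec = thread_spec()
println("julia_num_threads = ", spec)
```

The resulting string could then be passed as julia_num_threads=spec to addprocs or addproc.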

ispreempted,notbefore = preempted([id=myid()|id="instanceid"])

Check to see if the machine id::Int has received an Azure spot preempt message. Returns (true, notbefore) if a preempt message is received and (false,"") otherwise. notbefore is the date/time before which the machine is guaranteed to still exist.
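
One way to use the returned tuple is to poll for a preempt message between units of work and stop cleanly once one arrives. The sketch below is an assumption about usage, not part of AzManagers: run_until_preempted is a hypothetical helper, and fake_preempted is a stand-in for preempted() (with a made-up notbefore string) so that the example runs without Azure.

```julia
# Hypothetical helper: `check` is any zero-argument function returning
# (ispreempted, notbefore), the return shape documented for preempted().
function run_until_preempted(work!, check; maxsteps=100)
    for step in 1:maxsteps
        ispreempted, notbefore = check()
        if ispreempted
            # stop before starting new work; the caller can checkpoint here
            return (completed=step - 1, notbefore=notbefore)
        end
        work!(step)
    end
    return (completed=maxsteps, notbefore="")
end

# Stand-in for preempted(): reports a preempt message on the 4th call.
# The date string is a placeholder, not a real Azure response.
calls = Ref(0)
fake_preempted() = (calls[] += 1) > 3 ? (true, "Mon, 01 Jan 2024 00:00:00 GMT") : (false, "")

done = Int[]
result = run_until_preempted(step -> push!(done, step), fake_preempted)
println("completed ", result.completed, " steps; machine exists until ", result.notbefore)
```

On a real worker, the second argument would be preempted itself rather than the stand-in.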

Detached service

addproc(template[; name="", basename="cbox", subscriptionid="myid", resourcegroup="mygroup", nretry=10, verbose=0, session=AzSession(;lazy=true), sigimagename="", sigimageversion="", imagename="", detachedservice=true])

Creates a VM and returns a named tuple (name,ip,resourcegroup,subscriptionid) where name is the name of the VM, and ip is the ip address of the VM. resourcegroup and subscriptionid denote where the VM resides on Azure.


  • name="" name for the VM. If it is not an empty string, then the next parameter (basename) is ignored
  • basename="cbox" base name for the VM, we append a random suffix to ensure uniqueness
  • subscriptionid=template["subscriptionid"] if exists, or AzManagers._manifest["subscriptionid"] otherwise.
  • resourcegroup=template["resourcegroup"] if exists, or AzManagers._manifest["resourcegroup"] otherwise.
  • session=AzSession(;lazy=true) Session used for OAuth2 authentication
  • sigimagename="" Azure shared image gallery image to use for the VM (defaults to the template's image)
  • sigimageversion="" Azure shared image gallery image version to use for the VM (defaults to latest)
  • imagename="" Azure image name used as an alternative to sigimagename and sigimageversion (used for development work)
  • osdisksize=60 Disk size of the OS disk in GB
  • customenv=false If true, then send the current project environment to the workers where it will be instantiated.
  • nretry=10 Max retries for re-tryable REST call failures
  • verbose=0 Verbosity flag passed to HTTP.jl methods
  • show_quota=false after various operations, show the "x-ms-rate-remaining-resource" response header. Useful for debugging/understanding Azure quotas.
  • julia_num_threads="$(Threads.nthreads()),$(Threads.nthreads(:interactive))" set the number of julia threads for the workers.[1]
  • omp_num_threads = get(ENV, "OMP_NUM_THREADS", 1) set OMP_NUM_THREADS environment variable before starting the detached process
  • env=Dict() Dictionary of environment variables that will be exported before starting the detached process
  • detachedservice=true start the detached service allowing for RESTful remote code execution


[1] Interactive threads are supported beginning in version 1.9 of Julia. For earlier versions, the default for julia_num_threads is Threads.nthreads().

@detachat myvm begin ... end

Run code on an Azure VM.


using AzManagers
myvm = addproc("myvm")
job = @detachat myvm begin
    @info "I'm running detached"
end

Retrieve a variable from a variable bundle. See variablebundle! for more information.


Define variables that will be passed to a detached job.


using AzManagers
myvm = addproc("myvm")
myjob = @detachat myvm begin
    write(stdout, "my variable is $(variablebundle(:x))\n")
end

returns the stdout from a detached job.

rmproc(vm[; session=AzSession(;lazy=true), verbose=0, nretry=10])

Delete the VM that was created using the addproc method.


  • session=AzSession(;lazy=true) Azure session for OAuth2 authentication
  • verbose=0 verbosity flag passed to HTTP.jl methods
  • nretry=10 max number of retries for retryable REST calls
  • show_quota=false after various operations, show the "x-ms-rate-remaining-resource" response header. Useful for debugging/understanding Azure quotas.

blocks until the detached job, job, is complete.


AzManagers.build_nictemplate(nic_name; kwargs...)

Returns a dictionary for a NIC template that can be passed to the addproc method or written to AzManagers.jl configuration files.

Required keyword arguments

  • subscriptionid Azure subscription
  • resourcegroup_vnet Azure resource group that holds the virtual network that the NIC is attaching to.
  • vnet Azure virtual network for the NIC to attach to.
  • subnet Azure sub-network name.
  • location location of the Azure data center that the NIC corresponds to.

Optional keyword arguments

  • accelerated=true use accelerated networking (not all VM sizes support accelerated networking).

AzManagers.build_sstemplate(name; kwargs...)

Returns a dictionary that is an Azure scaleset template for use in addprocs or for saving to the ~/.azmanagers folder.

required keyword arguments

  • subscriptionid Azure subscription
  • admin_username ssh user for the scaleset virtual machines
  • location Azure data-center location
  • resourcegroup Azure resource-group
  • imagegallery Azure image gallery that contains the VM image
  • imagename Azure image
  • vnet Azure virtual network for the scaleset
  • subnet Azure virtual subnet for the scaleset
  • skuname Azure VM type

optional keyword arguments

  • subscriptionid_image Azure subscription corresponding to the image gallery, defaults to subscriptionid
  • resourcegroup_vnet Azure resource group corresponding to the virtual network, defaults to resourcegroup
  • resourcegroup_image Azure resource group corresponding to the image gallery, defaults to resourcegroup
  • osdisksize=60 Disk size in GB for the operating system disk
  • skutier = "Standard" Azure SKU tier.
  • datadisks=[] list of data disks to create and attach [1]
  • tempdisk = "sudo mkdir -m 777 /mnt/scratch; ln -s /mnt/scratch /scratch" cloud-init commands used to mount or link to temporary disk
  • tags = Dict("azure_tag_name" => "some_tag_value") Optional tags argument for resource
  • encryption_at_host = false Optional argument for enabling encryption at host


[1] Each datadisk is a Dictionary. For example,

Dict("createOption"=>"Empty", "diskSizeGB"=>1023, "managedDisk"=>Dict("storageAccountType"=>"PremiumSSD_LRS"))

or, to accept the defaults,


The above example is populated with the default options. So, if datadisks=[Dict()], then the default options will be included.
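
The default-filling behaviour described in [1] can be pictured as merging each user-supplied dictionary over the default one. The sketch below only illustrates that idea under stated assumptions and is not AzManagers' implementation: DEFAULT_DATADISK and with_defaults are hypothetical names, and the merge is shallow (a user-supplied "managedDisk" entry would replace the default one as a whole).

```julia
# Hypothetical constant holding the default options listed in [1]
const DEFAULT_DATADISK = Dict{String,Any}(
    "createOption" => "Empty",
    "diskSizeGB" => 1023,
    "managedDisk" => Dict("storageAccountType" => "PremiumSSD_LRS"),
)

# Entries in `disk` take precedence; missing entries fall back to the defaults
with_defaults(disk::AbstractDict) = merge(DEFAULT_DATADISK, disk)

# One disk with all defaults, and one that overrides only the size
datadisks = [with_defaults(Dict()), with_defaults(Dict("diskSizeGB" => 2048))]

for disk in datadisks
    println(disk["createOption"], " disk of ", disk["diskSizeGB"], " GB")
end
```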

AzManagers.build_vmtemplate(vm_name; kwargs...)

Returns a dictionary for a virtual machine template that can be passed to the addproc method or written to AzManagers.jl configuration files.

Required keyword arguments

  • subscriptionid Azure subscription
  • admin_username ssh user for the virtual machine
  • location Azure data center location
  • resourcegroup Azure resource group where the VM will reside
  • imagegallery Azure shared image gallery name
  • imagename Azure image name that is in the shared image gallery
  • vmsize Azure vm type, e.g. "Standard_D8s_v3"

Optional keyword arguments

  • resourcegroup_vnet Azure resource group containing the virtual network, defaults to resourcegroup
  • subscriptionid_image Azure subscription containing the image gallery, defaults to subscriptionid
  • resourcegroup_image Azure resource group containing the image gallery, defaults to resourcegroup
  • nicname = "cbox-nic" Name of the NIC for this VM
  • osdisksize = 60 size in GB of the OS disk
  • datadisks=[] additional data disks to attach [1]
  • tempdisk = "sudo mkdir -m 777 /mnt/scratch\nln -s /mnt/scratch /scratch" cloud-init commands used to mount or link to temporary disk
  • tags = Dict("azure_tag_name" => "some_tag_value") Optional tags argument for resource
  • encryption_at_host = false Optional argument for enabling encryption at host


[1] Each datadisk is a Dictionary. For example,

Dict("createOption"=>"Empty", "diskSizeGB"=>1023, "managedDisk"=>Dict("storageAccountType"=>"PremiumSSD_LRS"))

The above example is populated with the default options. So, if datadisks=[Dict()], then the default options will be included.

AzManagers.write_manifest(;resourcegroup="", subscriptionid="", ssh_user="", ssh_public_key_file="~/.ssh/", ssh_private_key_file="~/.ssh/azmanagers_rsa")

Write an AzManagers manifest file (~/.azmanagers/manifest.json). The manifest file contains information specific to your Azure account.

AzManagers.save_template_nic(nic_name, template)

Save template::Dict generated by AzManagers.build_nictemplate to ~/.azmanagers/templates_nic.json.

AzManagers.save_template_scaleset(scalesetname, template)

Save template::Dict generated by AzManagers.build_sstemplate to ~/.azmanagers/templates_scaleset.json.

AzManagers.save_template_vm(vm_name, template)

Save template::Dict generated by AzManagers.build_vmtemplate to ~/.azmanagers/templates_vm.json.