Amazon Braket Hybrid Jobs

Amazon Braket Hybrid Jobs let you run hybrid classical-quantum workflows on AWS-managed infrastructure by submitting your own scripts, which run in a Docker container: either one provided by Amazon Braket, or a custom one you make available through Amazon ECR. To learn more about Amazon Braket Hybrid Jobs, see the Developer Guide, and to learn how to provide your own Docker images, see the Bring Your Own Container (BYOC) guide.

You can also run a local job, which runs the container and your script on your own compute hardware (your laptop or an EC2 instance, for example), using LocalQuantumJob. This can be useful for debugging and performance tuning.

Braket.JobType
Job

Abstract type representing a Braket Job.

Braket.AwsQuantumJobMethod
AwsQuantumJob(device::Union{String, BraketDevice}, source_module::String; kwargs...)

Create and launch an AwsQuantumJob which will use device device (a managed simulator, a QPU, or an embedded simulator) and will run the code (either a single file, or a Julia package, or a Python module) located at source_module. The keyword arguments kwargs control the launch configuration of the job. device can be either the device's ARN as a String, or a BraketDevice.

Keyword Arguments

  • entry_point::String - the function to run in source_module if source_module is a Python module/Julia package. Defaults to an empty string, in which case the behavior depends on the code language. In Python, the job will attempt to find a function called main in source_module and run it. In Julia, source_module will be loaded and run with Julia's include.
  • image_uri::String - the URI of the Docker image in ECR to run the Job on. Defaults to an empty string, in which case the base container is used.
  • job_name::String - the name for the job, which will be displayed in the jobs console. The default is a combination of the container image name and the current time.
  • code_location::String - the S3 prefix URI to which code will be uploaded. The default is default_bucket()/jobs/<job_name>/script
  • role_arn::String - the IAM role ARN to use to run the job. The default is to use the default jobs role.
  • wait_until_complete::Bool - whether to block until the job is complete, displaying log information as it arrives (true) or to run the job asynchronously (false, default).
  • hyperparameters::Dict{String, Any} - hyperparameters to provide to the job which will be available from an environment variable when the job is run. See the Amazon Braket documentation for more.
  • input_data::Union{String, Dict} - information about the training/input data to provide to the job. A Dict should map channel names to local paths or S3 URIs. Contents found at any local paths encoded as Strings will be uploaded to S3 at s3://{default_bucket_name}/jobs/{job_name}/data/{channel_name}. If a single local path or S3 URI is provided instead of a Dict, it is assigned the default channel name "input". The default is Dict().
  • instance_config::InstanceConfig - the instance configuration to use to run the job. See the Amazon Braket documentation for more information about available instance types. The default is InstanceConfig("ml.m5.large", 1, 30).
  • distribution::String - specifies how the job should be distributed. If set to "data_parallel", the hyperparameters for the job will be set to use data parallelism features for PyTorch or TensorFlow.
  • stopping_condition::StoppingCondition - the maximum length of time, in seconds, that a job can run before being forcefully stopped. The default is StoppingCondition(5 * 24 * 60 * 60).
  • output_data_config::OutputDataConfig - specifies the location for the output of the job. Any data stored here will be available to download_result and results. The default is OutputDataConfig("s3://{default_bucket_name}/jobs/{job_name}/data").
  • copy_checkpoints_from_job::String - specifies the job ARN whose checkpoint is to be used in the current job. Specifying this value will copy the checkpoint data from that job's checkpoint_config S3 URI to the current job's checkpoint_config S3 URI, making it available at checkpoint_config.localPath during job execution. The default is not to copy any checkpoints (an empty string).
  • checkpoint_config::CheckpointConfig - specifies the location where checkpoint data for this job is to be stored. The default is CheckpointConfig("/opt/jobs/checkpoints", "s3://{default_bucket_name}/jobs/{job_name}/checkpoints").
  • tags::Dict{String, String} - specifies the key-value pairs for tagging this job.
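Putting several of these keyword arguments together, a launch might look like the following sketch. This is not a definitive recipe: it assumes AWS credentials are configured, that a script named algo_script.jl exists locally, and the job name and hyperparameter values are made up for illustration.

```julia
using Braket

# Sketch: launch a hybrid job on the managed SV1 state-vector simulator.
# Assumes AWS credentials are configured and "algo_script.jl" exists locally.
job = AwsQuantumJob(
    Braket.SV1(),                # a BraketDevice; a device ARN String also works
    "algo_script.jl";            # single-file entry script
    job_name            = "my-hybrid-job",
    hyperparameters     = Dict("n_qubits" => 4, "shots" => 1000),
    instance_config     = InstanceConfig("ml.m5.large", 1, 30),
    wait_until_complete = true,  # block and stream logs until the job finishes
)
```

With wait_until_complete=false (the default), the call returns immediately and the job can be monitored afterwards with logs and metrics.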
Braket.LocalQuantumJobMethod
LocalQuantumJob(device::Union{String, BraketDevice}, source_module::String; kwargs...)

Create and launch a LocalQuantumJob which will use device device (a managed simulator, a QPU, or an embedded simulator) and will run the code (either a single file, or a Julia package, or a Python module) located at source_module. device can be either the device's ARN as a String, or a BraketDevice. A local job runs on your own computational resource by launching the job container locally using Docker. The job will block until it completes, replicating the wait_until_complete behavior of AwsQuantumJob.

The keyword arguments kwargs control the launch configuration of the job.

Keyword Arguments

  • entry_point::String - the function to run in source_module if source_module is a Python module/Julia package. Defaults to an empty string, in which case the behavior depends on the code language. In Python, the job will attempt to find a function called main in source_module and run it. In Julia, source_module will be loaded and run with Julia's include.
  • image_uri::String - the URI of the Docker image in ECR to run the Job on. Defaults to an empty string, in which case the base container is used.
  • job_name::String - the name for the job, which will be displayed in the jobs console. The default is a combination of the container image name and the current time.
  • code_location::String - the S3 prefix URI to which code will be uploaded. The default is default_bucket()/jobs/<job_name>/script
  • role_arn::String - not used for LocalQuantumJobs.
  • hyperparameters::Dict{String, Any} - hyperparameters to provide to the job which will be available from an environment variable when the job is run. See the Amazon Braket documentation for more.
  • input_data::Union{String, Dict} - information about the training/input data to provide to the job. A Dict should map channel names to local paths or S3 URIs. Contents found at any local paths encoded as Strings will be uploaded to S3 at s3://{default_bucket_name}/jobs/{job_name}/data/{channel_name}. If a single local path or S3 URI is provided instead of a Dict, it is assigned the default channel name "input". The default is Dict().
  • instance_config::InstanceConfig - not used for LocalQuantumJobs.
  • distribution::String - not used for LocalQuantumJobs.
  • stopping_condition::StoppingCondition - the maximum length of time, in seconds, that a job can run before being forcefully stopped. The default is StoppingCondition(5 * 24 * 60 * 60).
  • output_data_config::OutputDataConfig - specifies the location for the output of the job. Any data stored here will be available to download_result and results. The default is OutputDataConfig("s3://{default_bucket_name}/jobs/{job_name}/data").
  • copy_checkpoints_from_job::String - specifies the job ARN whose checkpoint is to be used in the current job. Specifying this value will copy the checkpoint data from that job's checkpoint_config S3 URI to the current job's checkpoint_config S3 URI, making it available at checkpoint_config.localPath during job execution. The default is not to copy any checkpoints (an empty string).
  • checkpoint_config::CheckpointConfig - specifies the location where checkpoint data for this job is to be stored. The default is CheckpointConfig("/opt/jobs/checkpoints", "s3://{default_bucket_name}/jobs/{job_name}/checkpoints").
  • tags::Dict{String, String} - specifies the key-value pairs for tagging this job.
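The same script can be exercised locally before paying for managed infrastructure. The following sketch assumes a Docker daemon is running on your machine and that algo_script.jl exists; the call blocks until the container exits.

```julia
using Braket

# Sketch: run the same script in a local container instead of on AWS-managed
# infrastructure. Requires a running Docker daemon.
job = LocalQuantumJob(
    "arn:aws:braket:::device/quantum-simulator/amazon/sv1",  # device ARN as a String
    "algo_script.jl";
    hyperparameters = Dict("shots" => 100),
)
# Output written by the script is available once the call returns,
# e.g. via results(job).
```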
Braket.log_metricFunction
log_metric(metric_name::String, value::Union{Float64, Int}; timestamp=time(), iteration_number=nothing)

Within a job script, log a metric with name metric_name and value value which can later be fetched outside the job with metrics. A metric might be, for example, the loss of a training algorithm at each epoch, or similar.
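Inside a job script, metric logging might look like the following sketch, where compute_loss is a hypothetical stand-in for a real training step:

```julia
using Braket

# Hypothetical training step; replace with your real loss computation.
compute_loss(epoch) = 1.0 / epoch

for epoch in 1:10
    # Record the loss at each epoch so it can be retrieved later with metrics.
    Braket.log_metric("loss", compute_loss(epoch); iteration_number=epoch)
end
```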

Braket.metricsFunction
metrics(j::AwsQuantumJob; metric_type="timestamp", statistic="max")

Fetches the metrics for job j. Metrics are generated by log_metric within the job script.

metrics(j::LocalQuantumJob; metric_type="timestamp", statistic="max")

Fetches the metrics for job j. Metrics are generated by log_metric within the job script.
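Once a job has logged some metrics, they can be retrieved from the job object. In the sketch below, `j` is a previously created job; the assumed return layout (a table-like Dict of column vectors keyed by metric name) should be checked against your Braket.jl version.

```julia
# Sketch: fetch logged metrics for an existing job `j` (an AwsQuantumJob or
# LocalQuantumJob created earlier).
m = metrics(j)
# Assumed layout: column vectors keyed by name, e.g. m["loss"] holding the
# values logged with log_metric("loss", ...) in the job script.
```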

Braket.logsFunction
logs(j::AwsQuantumJob; wait::Bool=false, poll_interval_seconds::Int=5)

Fetches the logs of job j. If wait is true, blocks until j has entered a terminal state ("COMPLETED", "FAILED", or "CANCELLED"). Polls every poll_interval_seconds for new log data.

logs(j::LocalQuantumJob; kwargs...)

Fetches the logs of job j.

Braket.download_resultFunction
download_result(j::AwsQuantumJob; kwargs...)

Download and extract the results of job j. Valid kwargs are:

  • extract_to::String - the local folder to extract the results to. Default is the current working directory.
  • poll_timeout_seconds::Int - the maximum number of seconds to wait while polling for results. Default: 864000
  • poll_interval_seconds::Int - how many seconds to wait between download attempts. Default: 5
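A typical retrieval might look like the following sketch; the target directory and polling values are arbitrary choices, not defaults.

```julia
# Sketch: wait up to an hour for results for an existing job `j`, checking
# every 10 seconds, and extract them into a scratch directory.
download_result(j; extract_to="/tmp/job_output",
                   poll_timeout_seconds=3600,
                   poll_interval_seconds=10)
```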
Braket.@hybrid_jobMacro
@hybrid_job [device] [job_creation_kwargs] job_function(args...; kwargs...)

Run job_function inside an Amazon Braket Job, launching the job with creation arguments defined by job_creation_kwargs, and reserving device device (may be empty, in which case local:local/none is used). device should be either a valid AWS device ARN or use the format local:<simulator_provider>/<simulator_name> (see the developer guide on embedded simulators).

Valid job creation keyword arguments are:

  • jl_dependencies::String - a path to a Project.toml containing the Julia packages needed to run job_function. Can be "" (default).
  • py_dependencies::String - a path to a requirements.txt containing the Python packages needed to run job_function. Can be "" (default).
  • as_local::Bool - whether to run the job in local mode. Default is false, running as a hybrid, non-local job.
  • include_modules - unused but reserved argument.
  • using_jl_pkgs::Union{String, Vector{String}} - Julia packages to load with using [pkgs] before job_function is called within the job.
  • include_jl_files::Union{String, Vector{String}} - path(s) to Julia file(s) to load with include(file) before job_function is called within the job.
  • creation arguments for AwsQuantumJob

Currently, args and kwargs to job_function must be serializable by JLD2.jl. job_function must be a Julia function, not Python.

Note

The paths to include files and dependencies are resolved from the call location of this macro. To ensure your paths resolve correctly, use absolute rather than relative paths.

Examples

function my_job_func(a, b::Int; c=0, d::Float64=1.0, kwargs...)
    Braket.save_job_result(job_helper())
    py_reqs = read(joinpath(Braket.get_input_data_dir(), "requirements.txt"), String)
    hyperparameters = Braket.get_hyperparameters()
    write("test/output_file.txt", "hello")
    return 0
end

py_deps = joinpath(@__DIR__, "requirements.txt")
jl_deps = joinpath(@__DIR__, "JobProject.toml")
input_data = joinpath(@__DIR__, "requirements")
include_jl_files = joinpath(@__DIR__, "job_test_script.jl")

j = @hybrid_job Braket.SV1() wait_until_complete=true as_local=false include_modules="job_test_script" using_jl_pkgs="LinearAlgebra" include_jl_files=include_jl_files py_dependencies=py_deps jl_dependencies=jl_deps input_data=input_data my_job_func(MyStruct(), 2, d=5.0, extra_kwarg="extra_value")