Package manual

Reading / downloading data

Process data products

DataRegistryUtils.fetch_data_per_yaml – Function
fetch_data_per_yaml(yaml_filepath, out_dir = "./out/"; use_axis_arrays::Bool = false, verbose = false, ...)

Refresh and load data products from the SCRC data registry. The function checks the file hash of each data product and re-downloads any that are out of date.

Parameters

  • yaml_filepath – the location of a .yaml file.
  • out_dir – the local system directory where data will be stored.
  • use_axis_arrays – convert the output to AxisArrays, where applicable.
  • use_sql – load SQLite database and return connection.
  • sql_file – (optional) SQL file, e.g. for custom SQLite views or indexes.
  • db_path – (optional) specify the filepath of the database to use (or create).
  • force_db_refresh – override the file hash check on database insert.
  • access_log_path – filepath of the .yaml access log.
  • verbose – set to true to show extra output in the console.
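
As a rough illustration of the call described above, the sketch below wraps fetch_data_per_yaml in a small helper. The config path, the output directory and the helper name load_registry_data are hypothetical; only keyword arguments listed in this section are used. It assumes the package is installed and the registry is reachable.

```julia
import DataRegistryUtils

# Hypothetical locations, chosen for illustration only.
const DATA_CONFIG = "examples/data_config.yaml"   # .yaml file describing the data products
const DATA_OUT = "./out/"                         # local directory for downloaded files

# Refresh the local copies and load the data products.
# With `as_db = true` the data is loaded into SQLite and a connection is returned instead.
function load_registry_data(; as_db::Bool = false)
    if as_db
        return DataRegistryUtils.fetch_data_per_yaml(DATA_CONFIG, DATA_OUT;
                                                     use_sql = true, verbose = false)
    end
    return DataRegistryUtils.fetch_data_per_yaml(DATA_CONFIG, DATA_OUT;
                                                 use_axis_arrays = true, verbose = false)
end
```

Calling load_registry_data() would then return the loaded data products, and load_registry_data(as_db = true) a database connection suitable for the read functions documented below.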

Reading data

DataRegistryUtils.read_table – Function
read_table(cn::SQLite.DB, data_product::String, [component::String]; data_type=nothing)

SQLite Data Registry helper function. Search and return [HDF5] table data as a DataFrame.

Parameters

  • cn – SQLite.DB object.
  • data_product – data product search string, e.g. "human/infection/SARS-CoV-2/%".
  • component – as above, [required] search string for component names.

DataRegistryUtils.read_estimate – Function
read_estimate(cn::SQLite.DB, data_product::String, [component::String]; data_type=nothing)

SQLite Data Registry helper function. Search TOML-based data resources stored in cn, a SQLite database created previously by a call to fetch_data_per_yaml.

Parameters

  • cn – SQLite.DB object.
  • data_product – data product search string, e.g. "human/infection/SARS-CoV-2/%".
  • component – as above, optional search string for component names.
  • data_type – (optional) specify to return an array of this type, instead of a DataFrame.
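
The following is a minimal sketch of how read_estimate might be called, assuming cn is the connection returned by fetch_data_per_yaml with use_sql = true. The helper name query_estimates and the component name "latent-period" are hypothetical; the data product search string is the one given in the parameter list above.

```julia
import DataRegistryUtils

# `cn` is assumed to be the SQLite.DB connection returned by
# fetch_data_per_yaml(...; use_sql = true).
function query_estimates(cn)
    product = "human/infection/SARS-CoV-2/%"   # search string from the docs above
    # Every estimate matching the data product search string, as a DataFrame.
    all_estimates = DataRegistryUtils.read_estimate(cn, product)
    # Narrow the search to one component (hypothetical name) and request
    # an array of Float64 instead of a DataFrame.
    latent = DataRegistryUtils.read_estimate(cn, product, "latent-period";
                                             data_type = Float64)
    return all_estimates, latent
end
```

read_table takes the same positional arguments, so a call for HDF5 table data would presumably follow the same pattern.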

DataRegistryUtils.read_data_product_from_file – Function
read_data_product_from_file(filepath; use_axis_arrays = false, verbose = false)

Read an HDF5 or TOML file from the local system.

Parameters

  • filepath – the location of an HDF5 or TOML file.
  • use_axis_arrays – convert the output to AxisArrays, where applicable.
  • verbose – set to true to show extra output in the console.
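
For files that are already on disk, a small guard around the call can be sketched as follows. The helper name read_local_product is hypothetical, and the isfile check is an addition for illustration rather than something the package requires.

```julia
import DataRegistryUtils

# Read one local HDF5 or TOML data product.
# Returns `nothing` when the path does not point to an existing file.
function read_local_product(filepath::AbstractString)
    isfile(filepath) || return nothing
    return DataRegistryUtils.read_data_product_from_file(filepath;
                                                         use_axis_arrays = true,
                                                         verbose = false)
end
```

The filepath passed in would typically be a file under the out_dir used in an earlier fetch_data_per_yaml call.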

Writing to the Data Registry

Register model

DataRegistryUtils.register_github_model – Function
register_github_model(model_config, scrc_access_tkn; ... )
register_github_model(model_name, model_version, model_repo, scrc_access_tkn; ... )

Register model as a code_repo_release in the SCRC data registry, from GitHub (default) or another source.

If used, the model_config file should include (at a minimum) the model_name, model_version and model_repo fields. Otherwise, these can be passed directly to the function.

Parameters

  • model_config – path to the model config .yaml file.
  • model_name – label for the model release.
  • model_version – version number in the format 'n.n.n', e.g. '0.0.1'.
  • model_repo – URL of the model repo (e.g. on GitHub).
  • scrc_access_tkn – access token (see https://data.scrc.uk/docs/).
  • model_description – (optional) description of the model.
  • model_website – (optional) website, e.g. for an accompanying paper, blog, or model documentation.
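
A sketch of the second call form, with the model details passed directly, might look like this. The model name, repo URL and helper name register_release are hypothetical, and passing model_description as a keyword argument is an assumption based on the elided "; ..." in the signature above.

```julia
import DataRegistryUtils

# Register a model release using the (name, version, repo, token) form.
# The token is taken as an argument so that it is never hard-coded.
function register_release(scrc_access_tkn::AbstractString)
    model_name = "my-model"                                  # hypothetical label
    model_version = "0.0.1"                                  # 'n.n.n' format
    model_repo = "https://github.com/example-org/my-model"   # hypothetical URL
    # Assumption: optional fields such as model_description are keyword arguments.
    return DataRegistryUtils.register_github_model(model_name, model_version,
                                                   model_repo, scrc_access_tkn;
                                                   model_description = "Example model release.")
end
```

One option is to read the token from an environment variable of your choosing and pass it in, which keeps it out of scripts and version control.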

Register model run

DataRegistryUtils.register_model_run – Function
register_model_run(model_config, code_repo_release_uri, model_run_description, scrc_access_tkn)

Upload a model run to the code_run endpoint of the SCRC Data Registry.

Parameters

  • model_config – path to the model config .yaml file.
  • submission_script_text – e.g. 'julia my/julia/code.jl'.
  • code_repo_release_uri – Data Registry URI of the model code_repo_release, i.e. the model code such as an (already registered) GitHub repo.
  • model_run_description – description of the model run.
  • scrc_access_tkn – access token (see https://data.scrc.uk/docs/).

Other

DataRegistryUtils.register_text_file – Function
register_text_file(text, code_repo_release_uri, model_run_description, scrc_access_tkn, search=true)

Post an entry to the text_file endpoint of the SCRC Data Registry.

Note that according to the docs, "".

Parameters

  • text – text file contents.
  • description – object description.
  • scrc_access_tkn – access token (see https://data.scrc.uk/docs/).
  • search – (optional, default=true) check for existing entry by path and file hash.
  • hash_val – (optional) specify the file hash, else it is computed based on text.
