Package manual
Reading / downloading data
Process data products
DataRegistryUtils.fetch_data_per_yaml – Function
fetch_data_per_yaml(yaml_filepath, out_dir = "./out/"; use_axis_arrays::Bool = false, verbose = false, ...)
Refresh and load data products from the SCRC data registry. Checks the file hash for each data product and downloads anew any that are determined to be out-of-date.
Parameters
yaml_filepath – the location of a .yaml file.
out_dir – the local system directory where data will be stored.
use_axis_arrays – convert the output to AxisArrays, where applicable.
use_sql – load SQLite database and return connection.
sql_file – (optional) SQL file for e.g. custom SQLite views or indexes.
db_path – (optional) specify the filepath of the database to use (or create).
force_db_refresh – override the file hash check on database insert.
accesslogpath – filepath of the .yaml access log.
verbose – set to true to show extra output in the console.
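The hash check described above can be pictured with a minimal sketch. This is not the package's implementation: the helper names file_hash and is_out_of_date are hypothetical, and the use of SHA-1 is an assumption (the registry may use a different algorithm).

```julia
using SHA  # Julia standard library

# Hypothetical helper: hex digest of a local file.
# SHA-1 is an assumption, not something the manual states.
function file_hash(filepath::AbstractString)
    return open(filepath, "r") do io
        bytes2hex(sha1(io))
    end
end

# Hypothetical helper: a file counts as out-of-date if it is missing
# or if its digest differs from the hash recorded for it.
function is_out_of_date(filepath::AbstractString, expected_hash::AbstractString)
    return !isfile(filepath) || file_hash(filepath) != lowercase(expected_hash)
end
```

Under this reading, only files for which is_out_of_date returns true would need to be downloaded again.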
Reading data
DataRegistryUtils.read_table – Function
read_table(cn::SQLite.DB, data_product::String, [component::String]; data_type=nothing)
SQLite Data Registry helper function. Search and return [HDF5] table data as a DataFrame.
Parameters
cn – SQLite.DB object.
data_product – data product search string, e.g. "human/infection/SARS-CoV-2/%".
component – as above, [required] search string for component names.
DataRegistryUtils.read_estimate – Function
read_estimate(cn::SQLite.DB, data_product::String, [component::String]; data_type=nothing)
SQLite Data Registry helper function. Search TOML-based data resources stored in cn, a SQLite database created previously by a call to fetch_data_per_yaml.
Parameters
cn – SQLite.DB object.
data_product – data product search string, e.g. "human/infection/SARS-CoV-2/%".
component – as above, optional search string for component names.
data_type – (optional) specify to return an array of this type, instead of a DataFrame.
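The % in the example search string suggests SQL LIKE-style wildcard matching. Assuming that is how the search strings are interpreted (the manual does not say so explicitly), the following sketch shows which names a given pattern would match. The helper like_to_regex is hypothetical and is not part of DataRegistryUtils.

```julia
# Hypothetical helper: translate an SQL LIKE pattern into a Julia Regex.
# '%' matches any run of characters, '_' matches exactly one character,
# and every other character is matched literally.
function like_to_regex(pattern::AbstractString)
    buf = IOBuffer()
    for c in pattern
        if c == '%'
            print(buf, ".*")
        elseif c == '_'
            print(buf, ".")
        else
            print(buf, "\\Q", c, "\\E")  # quote the character literally
        end
    end
    # "i": case-insensitive, approximating SQLite's default LIKE behaviour;
    # "s": let '.' match newlines as well.
    return Regex(string("^", String(take!(buf)), raw"$"), "is")
end

names = ["human/infection/SARS-CoV-2/latent-period",
         "human/infection/influenza/latent-period"]
matches = filter(n -> occursin(like_to_regex("human/infection/SARS-CoV-2/%"), n), names)
```

Under the same assumption, a call such as read_estimate(cn, "human/infection/SARS-CoV-2/%") would search every data product whose name begins with that prefix.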
DataRegistryUtils.read_data_product_from_file – Function
read_data_product_from_file(filepath; use_axis_arrays = false, verbose = false)
Read HDF5 or TOML file from local system.
Parameters
filepath – the location of an HDF5 or TOML file.
use_axis_arrays – convert the output to AxisArrays, where applicable.
verbose – set to true to show extra output in the console.
Writing to the Data Registry
Register model
DataRegistryUtils.register_github_model – Function
register_github_model(model_config, scrc_access_tkn; ... )
register_github_model(model_name, model_version, model_repo, scrc_access_tkn; ... )
Register model as a code_repo_release in the SCRC data registry, from GitHub (default) or another source.
If used, the model_config file should include (at a minimum) the $model_name$, $model_version$ and $model_repo$ fields. Otherwise, these can be passed directly to the function.
Parameters
model_config – path to the model config .yaml file.
model_name – label for the model release.
model_version – version number in the format 'n.n.n', e.g. $0.0.1$.
model_repo – url of the model [e.g. GitHub] repo.
scrc_access_tkn – access token (see https://data.scrc.uk/docs/).
model_description – (optional) description of the model.
model_website – (optional) website, e.g. for an accompanying paper, blog, or model documentation.
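As a rough illustration of the minimum config requirements and the 'n.n.n' version format, the sketch below checks a config that has already been parsed into a dictionary. The helper check_model_config is hypothetical (it is not part of DataRegistryUtils), and representing the parsed .yaml file as a Dict with string keys is an assumption.

```julia
# Fields the manual lists as the minimum for a model config.
const REQUIRED_FIELDS = ("model_name", "model_version", "model_repo")

# Hypothetical helper: return a list of problems found in a parsed config.
# An empty list means the minimum requirements are met.
function check_model_config(config::AbstractDict)
    problems = String[]
    for field in REQUIRED_FIELDS
        haskey(config, field) || push!(problems, "missing field: $field")
    end
    version = get(config, "model_version", nothing)
    if version !== nothing && !occursin(r"^\d+\.\d+\.\d+$", string(version))
        push!(problems, "model_version is not in the form n.n.n: $version")
    end
    return problems
end

# Example values are placeholders, not a real model or repository.
config = Dict("model_name" => "my_model",
              "model_version" => "0.0.1",
              "model_repo" => "https://github.com/example/my_model")
problems = check_model_config(config)
```

A check of this kind could be run before calling register_github_model, so that a malformed config is caught locally rather than at the registry.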
Register model run
DataRegistryUtils.register_model_run – Function
register_model_run(model_config, code_repo_release_uri, model_run_description, scrc_access_tkn)
Upload model run to the $code_run$ endpoint of the SCRC Data Registry.
Parameters
model_config – path to the model config .yaml file.
submission_script_text – e.g. 'julia my/julia/code.jl'.
code_repo_release_uri – Data Registry uri of the model $code_repo_release$, i.e. the model code such as an (already registered) GitHub repo.
model_run_description – description of the model run.
scrc_access_tkn – access token (see https://data.scrc.uk/docs/).
Other
DataRegistryUtils.register_text_file – Function
register_text_file(text, code_repo_release_uri, model_run_description, scrc_access_tkn, search=true)
Post an entry to the $text_file$ endpoint of the SCRC Data Registry.
Note that according to the docs, "".
Parameters
text – text file contents.
description – object description.
scrc_access_tkn – access token (see https://data.scrc.uk/docs/).
search – (optional, default=true) check for existing entry by path and file hash.
hash_val – (optional) specify the file hash, else it is computed based on text.