ExperimentsManager

This Julia package provides utility functions to run simple workflows (sequences of functions) over a whole parameter space. It generates anonymous functions on the fly and therefore probably has some overhead compared to a hand-written script, but it is useful for quickly comparing, in a single line, several methods with different sets of parameters.

By default, each atomic operation (executing the workflow on one point of the parameter space) is executed locally, but an abstraction layer makes it easy to dispatch these operations elsewhere (e.g. on a cluster).

Limitations

For now, the module is mostly useful for workflows whose output can be stored in a Dict object and dumped to a CSV file (hence, preferably real-valued outputs). This could easily be extended, but the module has only been used for this purpose so far.

Some features call the command line directly and may not be compatible with all systems:

  • detection of the number of results when launching an experiment for which partial results exist on the disk (both with the local and Igrida backends)

Extensions

To run the actual code of the experiments remotely, one can define a new backend by subtyping AbstractBackend. Have a look for instance at the file backends.jl.
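The idea of a backend can be illustrated with a self-contained toy that does not use the package at all. Everything in it is an assumption made for illustration: the local AbstractBackend is only a stand-in for the package's type, and the execute function, SerialBackend and ThreadedBackend are hypothetical names. The methods a real backend must implement are the ones defined in backends.jl.

```julia
# Stand-in for the package's abstract type, so that this sketch runs on its own.
abstract type AbstractBackend end

# Two hypothetical backends: one runs tasks in sequence, the other on Julia threads.
struct SerialBackend <: AbstractBackend end
struct ThreadedBackend <: AbstractBackend end

# `execute` is an assumed name, not the package's API. Each task is a
# zero-argument function standing for one point of the parameter space.
execute(::SerialBackend, tasks) = [t() for t in tasks]

function execute(::ThreadedBackend, tasks)
    handles = [Threads.@spawn t() for t in tasks]  # schedule every task
    return fetch.(handles)                         # wait for and collect the results
end

tasks = [() -> i^2 for i in 1:4]
println(execute(SerialBackend(), tasks))
println(execute(ThreadedBackend(), tasks))
```

Because the calling code only depends on the abstract type, switching from one backend to the other changes where the tasks run without touching the workflow definition.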

Usage

The main function is run(exp, kwargs), which executes all the workflows corresponding to the Experiment object exp, passing kwargs to the backend. Here is a typical example of usage (described below).

using ExperimentsManager
# include other packages required to run your code
gridres = 4 # number of grid points for prm1 (illustrative value)

e = ExperimentsManager.Exp(; name = "my_exp", iterations = 100)
dict = logdict(e) # A dictionary to "log" all useful results.

e.workflow = [
	(:X, [generate_dataset => PrmSpace([
			:prm1 => 10 .^range(0, 3, length=gridres), # Run the workflow for each of these values
			:prm2 => Int[2],
		], KwArgs(
			:kwarg1 => [true],
			:log_dict => [dict],
		)),
	]),
	(:S, [my_method_one => PrmSpace([
		 	:X => [WorkflowResult(:X)], # Run on the result of the previous function
		 	:prm1 => [WorkflowPrm(:X, :prm1)], # Reuse the value of prm1 used to compute X
		], KwArgs(
			:log_dict => [dict],
			:kwarg1 => [:SomeOption, :SomeOtherOption],
		)),
		my_method_two => PrmSpace([
		 	:X => [WorkflowResult(:X)], # Run on the result of the previous function
		]), # This one has no other params
	]),
	(:R, [evaluate_results => PrmSpace([
		 	:X => [WorkflowResult(:X)],
			:S => [WorkflowResult(:S)],
			:log_dict => [dict],
		])
	])
]
# Here are extra parameters for the backend, can be useful e.g. when
# running the experiments on a cluster. Leave empty if you don't know.
backend_kwargs = Dict( :memGB => 12.0, :hours => 8, )
df = ExperimentsManager.run(e, backend_kwargs)

This workflow consists of one function that generates data, two functions to run on this data, and one function that evaluates the results (the names are made up for illustration). With the above syntax, the last line of the script runs every method with every parameter combination: here, my_method_one with kwarg1 set to :SomeOption, my_method_one with kwarg1 set to :SomeOtherOption, and my_method_two, each on every dataset (i.e. for each value of prm1). All results are stored in CSV files (one file per combination of parameters, one line per iteration).

A parameter space is represented by a PrmSpace object, which is roughly the combination of an array for the positional arguments and a dictionary for the keyword arguments. Each argument is represented by a pair :argument_name => [values]. (For positional arguments, the name is neither required nor used, but it can improve readability.)
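To make "running over a whole parameter space" concrete, here is a minimal sketch of how a list of name => [values] pairs can be expanded into every combination of values. The helper expand_grid is hypothetical and is not part of ExperimentsManager; it only illustrates the Cartesian product that such a space describes.

```julia
# Hypothetical helper, not part of ExperimentsManager: enumerate every
# combination of values described by a list of `name => values` pairs.
function expand_grid(space::AbstractVector)
    argnames = first.(space)      # argument names, in order
    value_lists = last.(space)    # one collection of candidate values per argument
    # Iterators.product yields one tuple per combination of values.
    combos = vec(collect(Iterators.product(value_lists...)))
    return [Dict(zip(argnames, combo)) for combo in combos]
end

space = [
    :prm1 => [1, 10, 100],
    :prm2 => [2],
    :kwarg1 => [:SomeOption, :SomeOtherOption],
]
points = expand_grid(space)
println(length(points))  # 3 * 1 * 2 combinations
```

Each element of points is one point of the parameter space, i.e. one set of arguments for which the workflow would be executed.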

Dependent arguments

Sometimes it is useful to pass a keyword argument only if some condition is satisfied. Although arbitrary conditions are not yet supported, the following syntax is available:

KwArgs(:Arg1 => [:Arg1Choice1,
				 :Arg1Choice2 => KwArgs(:Arg2 => [:Arg2Choice1, :Arg2Choice2])])

Then the parameter :Arg2 is only used if :Arg1 is set to :Arg1Choice2.
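The following self-contained sketch shows which concrete sets of keyword arguments such a nested specification describes. It is not the package's implementation: expand_dependent is a hypothetical function, and plain vectors of pairs are used in place of KwArgs so that the code runs without the package.

```julia
# Hypothetical sketch, not the package's implementation: expand a nested
# specification into the list of concrete keyword-argument sets it describes.
# A choice is either a plain value, or `value => sub_spec`, where `sub_spec`
# lists extra arguments that only apply when that value is selected.
function expand_dependent(spec::AbstractVector)
    results = [Dict{Symbol,Any}()]  # start from a single empty combination
    for (name, choices) in spec
        extended = Dict{Symbol,Any}[]
        for partial in results, choice in choices
            if choice isa Pair
                value, sub_spec = choice
                # Selecting `value` unlocks the arguments listed in `sub_spec`.
                for sub in expand_dependent(sub_spec)
                    push!(extended, merge(partial, Dict{Symbol,Any}(name => value), sub))
                end
            else
                push!(extended, merge(partial, Dict{Symbol,Any}(name => choice)))
            end
        end
        results = extended
    end
    return results
end

spec = [:Arg1 => [:Arg1Choice1,
                  :Arg1Choice2 => [:Arg2 => [:Arg2Choice1, :Arg2Choice2]]]]
for kw in expand_dependent(spec)
    println(kw)
end
```

The specification yields three combinations: one with :Arg1 set to :Arg1Choice1 and no :Arg2, and two with :Arg1 set to :Arg1Choice2, one for each value of :Arg2.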

If dictionaries/parameter spaces must themselves be provided in the nested argument, one can use the function expand_kwargs to explicitly expand them.