Parametrization#
One of the most powerful features of the workflow is the ability to parametrize and tune its execution. Each sub-workflow (PH and ML) constrains a small set of locations where run parameters can be specified.
Input.yml files#
These files, placed within the workflow
folders of each sub-workflow, contain the pure MadMiner parameters,
as those values are directly injected into MadMiner Python invocations. These files are loaded upon the launch of
a Makefile
command, so there is no need to rebuild the workflow Docker image upon change.
They are grouped by the workflow step in which they get injected, although the specification of a certain parameter value on one step, may affect the definition of a different parameter in another step further-down the process.
For the full documentation about what these parameters represent, and what values can they take, please, visit the MadMiner documentation.
Makefile rules#
When running locally, via Yadage, there is a small set of parameters being defined at the Makefile
level.
These parameters can be found within the yadage-run
Makefile rule, and they affect either one of the sub-workflows:
In the case of the Physics sub-workflow:
input_file
: path to the input.yml file to use.num_procs_per_job
: number of parallel processes for the collision event generation.
In the case of the ML sub-workflow:
data_file
: path to the Physics sub-workflow events generated file.input_file
: path to the input.yml file to use.mlflow_args_s
: arguments to override the default sampling values.mlflow_args_t
: arguments to override the default training values.mlflow_args_e
: arguments to override the default evaluation values.
In order to override the default MLFlow-specific arguments (specified on the MLproject file),
change the mlflow_args_
parameters. For example, to override the MLFlow parameters of the sampling step:
yadage-run:
@yadage-run ... \
-p ... \
-p mlflow_args_s="test_split=0.5 nuisance_flag=0" \
-p ...
Reana.yml file#
When running the workflow on REANA (which is only available for the complete workflow), the same parameters that are
made available over the yadage-run
Makefile rule, are made available on the reana.yml
file within the
reana folder.
However, it is important to understand that some of these parameters (i.e. mlflow_server
and mlflow_user
),
are directly provided by the reana-run
Makefile rule, and must not be filled within the reana.yml
file.
In short, they must be left empty.