Start guide#
The concept#
Woom helps you perform tasks in isolated environments, in a given order, optionally cycling through dates and ensemble members, on your laptop or on an HPC with a scheduler.
Here are some definitions.
A task consists of:
a job script generated by the workflow
submission arguments, if any, for submission to a scheduler
a list of jobs it depends on, whose successful completion is required before the current job can start.
A job script is a bash file containing:
a line that traps termination signals
a block that declares the environment
a line to change the directory
a block of commands that do the main job
a block that checks that the expected artifacts were created
an exit command that returns any trapped signal, or 0.
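The six parts listed above can be pictured with the following minimal sketch. It is not the template shipped with Woom: the `run_job` function, the `MY_DATA_DIR` variable, the `out.txt` artifact and the temporary directories are hypothetical stand-ins, chosen only to make each part visible.

```shell
# Hypothetical stand-ins for the submission and run directories
submission_dir=$(mktemp -d)
run_dir=$(mktemp -d)

# The job body runs in a subshell so that "exit" does not end the caller
run_job() (
    # 1. Trap termination signals and remember a non-zero status
    status=0
    trap 'status=1' HUP INT TERM

    # 2. Declare the environment (illustrative variable)
    export MY_DATA_DIR="$run_dir"

    # 3. Change to the run directory
    cd "$run_dir" || exit 1

    # 4. Commands that do the main job
    echo "hello" > out.txt || status=$?

    # 5. Check that the expected artifact was created
    [ -f out.txt ] || status=1

    # 6. Store the status and exit with it (0 when nothing went wrong)
    echo "$status" > "$submission_dir/job.status"
    exit "$status"
)

run_job
echo "job exited with status $?"
```

A real job script is generated from Jinja templates, so its exact content depends on the task, host and workflow configurations.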
To set up your workflow:
Create a directory dedicated to your workflow.
Configure your tasks in the tasks.cfg file, in particular their execution content and submission specifications.
Define the necessary environments, directories and scheduler specifications in the hosts.cfg file.
Configure your workflow in the workflow.cfg file, in particular the parameters for generating the job script, the cycle and ensemble specifications and the order in which tasks are submitted through the stages.
Add additional material such as the bin and lib directories, an ext extension directory, or other useful files that you can access at runtime using the workflow_dir substitution parameter or the WOOM_WORKFLOW_DIR environment variable.
A typical structure of the workflow directory is the following:
workflow/
├── workflow.cfg # mandatory
├── tasks.cfg # mandatory
├── hosts.cfg # mandatory
├── ext/ # optional, woom extensions
│ ├── jinja_filters.py
│ └── validator_functions.py
├── bin/ # optional, prepended to $PATH in the job script
│ └── myscript.py
└── lib/
    └── python/ # optional, prepended to $PYTHONPATH in the job script
        └── mylib.py
You can add other files to this directory and access them using the {{ workflow_dir }} template variable
in configuration files or the WOOM_WORKFLOW_DIR environment variable.
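As a rough illustration, the skeleton above can be created with a few shell commands. The location of the directory is arbitrary and the configuration files are left empty here. The last three lines only mimic what the job script is described as doing with bin/ and lib/python; they are an assumption about the effect, not a copy of the code Woom generates.

```shell
# Hypothetical location for the workflow directory
workflow_dir="$(mktemp -d)/workflow"

# Optional directories
mkdir -p "$workflow_dir/ext" "$workflow_dir/bin" "$workflow_dir/lib/python"

# Configuration files, to be filled in
touch "$workflow_dir/workflow.cfg" "$workflow_dir/tasks.cfg" "$workflow_dir/hosts.cfg"

# Assumed effect of the job script on the environment
export WOOM_WORKFLOW_DIR="$workflow_dir"
export PATH="$WOOM_WORKFLOW_DIR/bin:$PATH"
export PYTHONPATH="$WOOM_WORKFLOW_DIR/lib/python${PYTHONPATH:+:$PYTHONPATH}"

ls "$workflow_dir"
```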
Configurations#
Tasks with tasks.cfg#
This file helps you configure tasks:
Their content with the environment name (declared in the hosts.cfg file), the run directory, the shell command line(s) to be executed and the exit signals to trap.
The files that are expected to be created by the task and that are named artifacts.
Their submission arguments when using a scheduler, like the queue and the resources.
See the configobj specifications for this configuration.
In the following example, five tasks with arbitrary names are specified in the configuration file.
The command lines use jinja patterns such as {{ data_dir }}, which are filled with entries from both the [params] section of the workflow.cfg file and the default entries provided by the workflow (Input context).
Some of the tasks here use an environment named “prepost”, that must be declared in the hosts.cfg configuration file.
tasks.cfg
[clean_data_dir]
[[content]]
commandline="rm -rf {{ data_dir }}/*.nc"
[fetch_data]
[[content]]
commandline="fetch-data.py --out {{ data_dir }}/data.nc --box {{ ','.join(box) }} {{ cycle.begin_date }} {{ cycle.end_date }}"
env="prepost"
[cp_config]
[[content]]
commandline="cp $MODEL_CONFIG/{{ app_name }}/{{ app_conf }}/config.yml {{ data_dir }}"
[run_exe]
[[content]]
commandline="""run.exe {{ data_dir }}/data.nc out.nc"""
env="run"
[[submit]]
queue=omp
[plot_results]
[[content]]
commandline="plot-results.py {{ workflow.get_run_dir('run_exe', cycle) }}/out.nc"
env="prepost"
Hosts with hosts.cfg#
This file helps you configure hosts:
The patterns used to identify the host from its name.
The scheduler, where “background” means “submitted in background”.
A few commands.
A list of environments with their name and specifications that describe environment modules and variables, or a conda environment to load.
See the configobj specifications for this configuration.
This example file declares the resources available on the datarmor host, in particular its scheduler, the scratch dir taken from the SCRATCH environment variable and the name of the seq queue.
An environment called prepost is declared using environment modules and environment variables.
hosts.cfg
[datarmor]
patterns=datavisu*,*.ice.ifremer.fr,datarmor*
scheduler=pbspro
module_setup=. /etc/profile.d/modules.sh
[[queues]]
seq=sequentiel
omp=omp
[[dirs]]
scratch=$SCRATCH
[[envs]]
[[[prepost]]]
[[[[modules]]]]
use=$HOME/Modulefiles
load=conda/latest
[[[[vars]]]]
forward=MODEL_CONFIGS
[[[[[prepend]]]]]
PYTHONPATH=$DATAWORK/lib/python
The default hosts.cfg declares the local host, which matches any computer by default. When you provide your own hosts file, it is merged with the default file. To extend the configuration of the default host, use the [local] section in your own file.
hosts.cfg
[local]
patterns=*
scheduler=background
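The way patterns select a host can be sketched with shell globbing. This is only an analogy: the `guess_host` function is hypothetical, and it assumes that the patterns behave like shell wildcards and that specific hosts are tried before the catch-all local host. How Woom actually performs the match is not described in this guide.

```shell
guess_host() {
    # $1: the host name to classify
    case "$1" in
        # patterns of the [datarmor] section
        datavisu*|*.ice.ifremer.fr|datarmor*) echo "datarmor" ;;
        # patterns=* of the [local] section matches any name
        *) echo "local" ;;
    esac
}

guess_host "datarmor3"
guess_host "node12.ice.ifremer.fr"
guess_host "my-laptop"
```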
Workflow with workflow.cfg#
This file helps you configure the workflow:
Your application specifications: name, configuration and experiment. It is optional but highly recommended.
The way you want to cycle over dates. It is also optional.
The specifications of your ensemble when you want to iterate over members.
The additional configuration parameters that will be used to declare environment variables and format task command lines with jinja substitutions when generating the job scripts.
The workflow graph through stages that defines in which order to execute the tasks as defined in tasks.cfg.
Groups of tasks that must be run sequentially in the workflow.
See the configobj specifications for this configuration.
In this example, we give our application a name, specify which dates to loop over and declare the box and data_dir parameters, which can be used in the tasks.cfg file.
The clean_data_dir task is executed only once, before looping over dates, because it is called in the [prolog] stage.
The other tasks are run sequentially for each date interval, except fetch_data and cp_config, which are run in parallel because they are called in the same sequence, named fetch.
workflow.cfg
[app]
name=myapp
config=myconf3.0
[cycles]
begin_date=2020-01-01
end_date=2020-01-02
freq=6h
[params]
box=-15,-5,35,45
data_dir={{ scratch_dir }}/{{ app_name }}
[stages]
[[prolog]]
clean=clean_data_dir
[[cycles]]
fetch=fetch_data,cp_config
run=run_exe
plot=plot_results
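The execution order implied by these stages can be mimicked with a plain shell loop. This is a sketch of the ordering only: Woom submits jobs with dependencies rather than running such a loop, the `task` function is a stub that just records a name, and the two cycle labels are hypothetical placeholders for the date intervals.

```shell
log=$(mktemp)

# Stub standing in for the submission of a task: it only records its name
task() { echo "$1 $2" >> "$log"; }

# [[prolog]] stage: executed once, before the cycles
task clean_data_dir -

# [[cycles]] stage: each sequence waits for the previous one
for cycle in cycle1 cycle2; do    # hypothetical cycle labels
    # "fetch" sequence: its two tasks may run in parallel
    task fetch_data "$cycle" &
    task cp_config "$cycle" &
    wait
    # "run" and "plot" sequences: one task each
    task run_exe "$cycle"
    task plot_results "$cycle"
done

cat "$log"
```

Within a cycle, the two lines produced by the fetch sequence may appear in either order, but they always precede the line produced by the run sequence.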
Job script generation#
The path to the job script is {submission_dir}/job.sh.
The script content, which contains Jinja patterns, is first exported as a string by the woom.tasks.Task.render_content() method and rendered with Jinja. See Jinja rendering.
The rendering is performed by woom.render.render() using a dictionary created by the woom.workflow.Workflow.get_task_inputs() method.
See “Input context” to see its default content.
This dictionary is specific to a given task, at a given cycle, and for a given ensemble member.
Trapped exit signals#
Trapping the signal allows the job to return an exit status other than zero in the event of an error.
The exit status is stored in {submission_dir}/job.status and is interpreted by the workflow to know the status of the job.
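The mechanism can be illustrated in isolation with a child shell that traps SIGTERM. This is a sketch under assumptions: the temporary file stands in for the job.status file, and the value 143 follows the common shell convention of 128 plus the signal number, which is not necessarily the value Woom writes.

```shell
status_file=$(mktemp)    # stands in for {submission_dir}/job.status

# Child shell that traps SIGTERM, records a status and exits with it
sh -c '
    trap "echo 143 > \"$1\"; exit 143" TERM
    kill -TERM $$        # simulate a termination request
    sleep 5              # not reached: the trap exits first
' sh "$status_file"
rc=$?

echo "exit status: $rc, recorded: $(cat "$status_file")"
```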
Environment#
The environment we need is specified by its name in the task configuration and is detailed in the host configuration. It typically takes the form of environment module directives and environment variable declarations.
Run directory#
The run directory is specified in the task configuration and defaults to {scratch_dir}/{task_path}.
You can use the scratch_dir and work_dir host configuration options, or any other input parameter.
Command lines#
The bash lines are the core of what the task does. They are configured in the task configuration and rendered as bash lines thanks to the powerful Jinja templating system (see: Jinja rendering).
Exit#
Any exit signal that occurs is stored in {submission_dir}/job.status.
This signal is then issued by the exit command.
Finally#
The standard output is saved in {submission_dir}/job.out and
the standard error in {submission_dir}/job.err.
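For a job submitted in the background, this amounts to ordinary shell redirections. The sketch below assumes such a background-style submission and uses a hypothetical two-line job script; with a real scheduler, the output files would more likely be requested through submission options.

```shell
submission_dir=$(mktemp -d)    # stands in for {submission_dir}

# Hypothetical job script writing to both streams
cat > "$submission_dir/job.sh" <<'EOF'
echo "to stdout"
echo "to stderr" >&2
EOF

# Run in the background, one file per stream, then wait for completion
sh "$submission_dir/job.sh" > "$submission_dir/job.out" 2> "$submission_dir/job.err" &
wait $!

cat "$submission_dir/job.out" "$submission_dir/job.err"
```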
Jinja rendering#
Jinja is a package that allows advanced template rendering. See its website for detailed explanations. It is used to generate the job scripts using template files and parameters.
The default templates are detailed in the Jinja templates section.
Users can extend these templates by providing their own job.sh and env.sh template files in the templates/ directory of the workflow directory.
Jinja performs substitutions using a Context instance, a dictionary containing the useful objects for a given task, cycle and member, as explained in the Input context section.
Template Filling#
Woom can automatically fill Jinja2 templates to generate configuration files, namelists, or scripts before task execution.
Configure this in the [[fill]] section of your task:
[run_model]
[[fill]]
[[[namelist]]]
template = ocean.nml.j2
destination = {{ task_run_dir }}/ocean.nml
Templates use the same context variables as job scripts ({{ cycle.begin_date }}, {{ params.timestep }}, etc.) and are stored in the templates/ directory.
See also
Templating In-Depth for a complete guide, and woom fill for manual filling.
Artifacts#
Artifacts are files that are expected to be generated by a given task. They have two usages:
By default, the job fails if the artifact paths are not present at the end of the task job.
One task can access the artifacts of any other task given the task name, the artifact name and the context (cycle, member).
Artifacts are declared in tasks.cfg as subsections of the [[artifacts]] section of a given task:
The section name is the short name of the artifact.
The paths option is a single item or a list of relative paths, absolute paths or function names, and is always returned as a list. When it is a function name, this function must be registered in the artifacts_generators extension to generate file paths.
The check option tells whether all paths must be checked for existence at the end of the job.
The callable option tells whether the paths must be interpreted as function names that generate paths.
[download_clim]
[[content]]
...
[[artifacts]]
[[[clim_file]]]
paths={{ task_run_dir }}/clim.nc
Warning
All artifacts must ultimately resolve to an absolute path. You must therefore either declare an artifact with an absolute path, prefix it with a directory variable like {{ task_run_dir }}, or provide a relative path and set the run_dir option of the task.
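The existence check performed at the end of a job can be sketched as a small shell function. The `check_artifacts` function and the file name are hypothetical; the block that Woom actually generates may be written differently.

```shell
check_artifacts() {
    # Return 1 if any of the given paths does not exist, 0 otherwise
    missing=0
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "missing artifact: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}

task_run_dir=$(mktemp -d)    # stands in for {{ task_run_dir }}
touch "$task_run_dir/expected_file"
check_artifacts "$task_run_dir/expected_file" && echo "all artifacts present"
```

Passing absolute paths, as recommended above, keeps the check independent of the directory from which it is run.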
To refer to an artifact in a task, there are two cases:
If the artifact belongs to the current task, use:
ncdump -h {{ task.artifacts["clim_file"][0] }}
If the artifact belongs to another task, use:
ncdump -h {{ workflow.get_task_artifact_paths("clim_file", "download_clim")[0] }}
To list all artifacts, expected or generated, use the woom show artifacts command line function.
$ woom show artifacts
Please have a look at this example.
Controlling and running the workflow#
Run all woom commands from the workflow directory. See the Examples of configuration section for more illustrative examples.
Tip
All woom commands support the --help option.
First, make sure that your workflow is correctly interpreted:
$ woom show overview
Then, run your workflow in dry (fake) and debug mode:
$ woom --logger-level debug run --dry-run
Then, run it in normal mode if everything is ok:
$ woom run
To check the status of all jobs, especially on an HPC with a scheduler:
$ woom show status
To kill jobs:
$ woom kill # all jobs
$ woom kill 1264 # one job
$ woom kill --task fetch_data # identified by task name