Hilary Oliver - NIWA - 15 Nov 2018, Dallas Texas
Not currently well suited to "data-intensive" workflows
no magic sauce to obscure "what the scientist does"
A workflow is primarily a configuration of the workflow engine, and a config file is easier for most users and most use cases than programming to a Python API. However, ...!
ETC.: event handling, checkpointing, extreme restart, ...
A powerful "unified" CLI
$ cylc --help
e.g. to re-trigger all failed tasks with name get_* and cycle point 2020*, in suite expt1 (leaving others alone):
$ cylc trigger expt1 2020*/get_*:failed
note dynamic filtering vs static SMS GUI
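The filtering idea behind `2020*/get_*:failed` can be sketched in a few lines of Python. This is only an illustration of glob-style selection over cycle point, task name and state; the `select_tasks` helper and the task pool below are hypothetical and are not Cylc's implementation.

```python
from fnmatch import fnmatchcase


def select_tasks(pool, pattern):
    """Return (point, name) ids in pool matching 'POINT_GLOB/NAME_GLOB:STATE'.

    Hypothetical helper: it mimics the shape of the command-line pattern,
    not the way Cylc itself parses or matches task ids.
    """
    target, _, state = pattern.partition(":")
    point_glob, _, name_glob = target.partition("/")
    return [
        (point, name)
        for (point, name), task_state in sorted(pool.items())
        if fnmatchcase(point, point_glob)
        and fnmatchcase(name, name_glob)
        and (not state or task_state == state)
    ]


# Made-up task pool: (cycle point, task name) -> current state
pool = {
    ("20200101", "get_obs"): "failed",
    ("20200101", "get_bcs"): "succeeded",
    ("20200101", "model"): "failed",
    ("20200102", "get_obs"): "failed",
    ("20190101", "get_obs"): "failed",
}

selected = select_tasks(pool, "2020*/get_*:failed")
print(selected)
```

Only the two failed `get_obs` tasks in 2020 are selected; the failed `model` task and the 2019 task are left alone.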
# Hello World! Plus
[scheduling]
[[dependencies]]
graph = "hello => farewell & goodbye"
# Hello World! Plus
[scheduling]
[[dependencies]]
graph = "hello => farewell & goodbye"
[runtime]
[[hello]]
script = echo "Hello World!"
# Hello World! Plus
[scheduling]
[[dependencies]]
graph = "hello => farewell & goodbye"
[runtime]
[[hello]]
script = echo "Hello World!"
[[[environment]]]
# ...
[[[remote]]]
host = hpc1.niwa.co.nz
[[[job]]]
batch system = PBS
# ...
# ...
# ...
#!Jinja2
{% set SAY_BYE = false %}
[scheduling]
[[dependencies]]
graph = """hello
{% if SAY_BYE %}
=> goodbye & farewell
{% endif %}
"""
[runtime]
# ...
#!Jinja2
{% set SAY_BYE = true %}
[scheduling]
[[dependencies]]
graph = """hello
{% if SAY_BYE %}
=> goodbye & farewell
{% endif %}
"""
[runtime]
# ...
[[dependencies]]
graph = "pre => sim => post => done"
[[dependencies]]
graph = "pre => sim<M> => post<M> => done"
# with M = 1..5
[[dependencies]]
graph = "prep => init => sim => post => close => done"
[[dependencies]]
graph = "prep => init<R> => sim<R,M> => post<R,M> => close<R> => done"
# with M = a,b,c; and R = 1..3
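To make the effect of parameters concrete, the sketch below expands the parameterized chain above over R = 1..3 and M = a,b,c. The `expand` helper and the `_R1_Ma` naming are assumptions made for illustration; Cylc's own expanded task names follow its configured templates and may differ.

```python
from itertools import product


def expand(name, params, values):
    """Expand one task name over the product of its parameters.

    The '<name>_<param><value>' naming is illustrative only.
    """
    combos = product(*(values[p] for p in params))
    return [
        name + "".join(f"_{p}{v}" for p, v in zip(params, combo))
        for combo in combos
    ]


values = {"R": [1, 2, 3], "M": ["a", "b", "c"]}

# The chain prep => init<R> => sim<R,M> => post<R,M> => close<R> => done
chain = [
    ("prep", ()),
    ("init", ("R",)),
    ("sim", ("R", "M")),
    ("post", ("R", "M")),
    ("close", ("R",)),
    ("done", ()),
]

tasks = [t for name, params in chain for t in expand(name, params, values)]
print(len(tasks))   # six graph nodes expand to 1 + 3 + 9 + 9 + 3 + 1 tasks
print(tasks[4:7])   # first few expanded sim tasks
```

The point of the notation is that one line of graph stands in for all of these tasks, so adding a parameter value does not mean editing the graph.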
[cylc]
cycle point format = %Y-%m
[scheduling]
initial cycle point = 2010-01
[[dependencies]]
[[[R1]]] # R1/^/P1M
graph = "prep => foo"
[cylc]
cycle point format = %Y-%m
[scheduling]
initial cycle point = 2010-01
[[dependencies]]
[[[R1]]]
graph = "prep => foo"
[[[P1M]]] # R/^/P1M
graph = """
foo[-P1M] => foo
foo => bar & baz => qux
"""
[cylc]
cycle point format = %Y-%m
[scheduling]
initial cycle point = 2010-01
[[dependencies]]
[[[R1]]]
graph = "prep => foo"
[[[P1M]]]
graph = """
foo[-P1M] => foo
foo => bar & baz => qux
"""
[[[R2/^+P2M/P1M]]]
graph = "baz & qux[-P2M] => boo"
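The three recurrence sections can be read as "once at the start", "every month", and "twice, monthly, starting two months after the initial point". The sketch below works out which sections apply at each cycle point under that reading. The `add_months` and `recurrence` helpers and the final point 2010-06 are assumptions for the example; this is not how Cylc evaluates ISO 8601 recurrences internally.

```python
def add_months(point, n):
    """Add n months to a (year, month) cycle point."""
    year, month = point
    index = year * 12 + (month - 1) + n
    return (index // 12, index % 12 + 1)


def recurrence(start, period_months, limit, stop):
    """Yield points from start every period_months, at most `limit` times.

    limit=None means unbounded; points beyond `stop` are never yielded.
    """
    point, count = start, 0
    while point <= stop and (limit is None or count < limit):
        yield point
        point = add_months(point, period_months)
        count += 1


initial = (2010, 1)
stop = (2010, 6)  # assumed final cycle point, not given on the slide

sections = {
    "R1": list(recurrence(initial, 1, 1, stop)),
    "P1M": list(recurrence(initial, 1, None, stop)),
    "R2/^+P2M/P1M": list(recurrence(add_months(initial, 2), 1, 2, stop)),
}

for point in sections["P1M"]:
    active = [name for name, points in sections.items() if point in points]
    print("%04d-%02d" % point, active)
```

Under these assumptions `prep` runs only at 2010-01, the monthly graph runs at every point, and `boo` appears only at 2010-03 and 2010-04.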
cylc review - job log viewer (NEW)
event handling: includes built-in aggregated emails
robust inter-workflow triggering: via suite DB not server (important for transient distributed suites)
production tested: recovery from hall failures
research to production: also involves other aspects, such as configurable workflow definitions (switching bits on and off), but primarily research runs don't need to be clock-limited.
(duplicated config is a maintenance risk)
(*) caveat: software dependencies and PyGTK; proper pip and conda packaging soon
ease of use: note academic community and ESiWACE support
lights-out operation since 2011; 25 inter-dependent model suites (X 2)
This is a technical necessity, to survive into the exascale era!
An outline of some potential pathways for future development
It's hard to incorporate a module into a workflow
Ideally we would write dependencies to/from the module itself rather than the tasks within it
Workflows could be represented as tasks
foo => baz => module<p> => pub
Python > Jinja2
Illustrative examples of what a Python API could provide for Cylc:
bar = cylc.Task('myscript')
cylc.run(
    foo >> bar >> baz
)
Use Python data structures as Cylc parameters:
animal = cylc.Parameter({
    'cat': {'lives': 9, 'memory': 2},
    'dog': {'lives': 1, 'memory': 10}
})
baz = cylc.TaskArray('run-baz',
    args=('--animal', animal),
    env={'N_LIVES': animal['lives']},
    directives={'--mem': animal['memory']}
)
Use Python to write Cylc modules:
import my_component
graph = cylc.graph(
    foo >> bar >> my_component >> baz,
    my_component.pub >> qux
)
foo => bar => baz
foo:
out: a
bar:
in: a
out: b
baz:
in: a, b
out: c
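One way to read the in/out declarations above is that the dependency graph could be derived from the data each task consumes and produces. The following is a minimal sketch of that idea, assuming each item has a single producer; `derive_edges` is a hypothetical helper, not an existing or planned Cylc function.

```python
def derive_edges(io):
    """Derive (upstream, downstream) task pairs from in/out declarations."""
    # Map each data item to the task that produces it.
    producers = {}
    for task, spec in io.items():
        for item in spec.get("out", []):
            producers[item] = task

    # A task depends on the producer of every item it consumes.
    edges = set()
    for task, spec in io.items():
        for item in spec.get("in", []):
            edges.add((producers[item], task))
    return sorted(edges)


io = {
    "foo": {"out": ["a"]},
    "bar": {"in": ["a"], "out": ["b"]},
    "baz": {"in": ["a", "b"], "out": ["c"]},
}

edges = derive_edges(io)
for upstream, downstream in edges:
    print(f"{upstream} => {downstream}")
```

Note that the declarations imply a direct foo => baz dependency (through item a) in addition to the chain foo => bar => baz.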
Cylc can currently scale to tens of thousands of tasks and dependencies
But there are limitations, for example:
Many to many triggers result in NxM dependencies
Cylc should be able to represent this as a single dependency
The scheduling algorithm currently iterates over a "pool" of tasks.
We plan to re-write the scheduler using an event driven approach.
This should make Cylc more efficient and flexible, solving problems like this.
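As a rough picture of the difference, an event-driven scheduler reacts to each task completion by updating only that task's downstream tasks, rather than rescanning a whole pool. The sketch below is a generic dependency-counting loop (in the style of Kahn's algorithm) with task completion simulated inline. It is not Cylc's design; `run_event_driven` and the example graph are assumptions for illustration.

```python
from collections import defaultdict, deque


def run_event_driven(edges, tasks):
    """Return the order tasks run in when each completion event
    releases only the tasks directly downstream of it."""
    waiting = {task: 0 for task in tasks}   # unmet upstream count per task
    downstream = defaultdict(list)
    for up, down in edges:
        waiting[down] += 1
        downstream[up].append(down)

    ready = deque(task for task in tasks if waiting[task] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)                  # stand-in for "submit, then succeed"
        for child in downstream[task]:      # handle the completion event
            waiting[child] -= 1
            if waiting[child] == 0:
                ready.append(child)
    return order


tasks = ["foo", "bar", "baz", "qux"]
edges = [("foo", "bar"), ("foo", "baz"), ("bar", "qux"), ("baz", "qux")]

order = run_event_driven(edges, tasks)
print(order)
```

The work done per completion is proportional to the number of downstream tasks, not to the size of the pool, which is the kind of saving an event-driven rewrite would be aiming for.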
Working towards a leaner Cylc, we plan to separate the codebase into a Kernel-Shell model:
Shell | Kernel |
User Commands | Scheduler |
Suite Configuration | Job Submission |
Combining multiple jobs to run in a single job submission.
A lightweight Cylc kernel could be used to execute a workflow within a job submission.