
Metaflow Features

Special Features For Metaflow

nbdoc is an open source project developed at Outerbounds with the goal of producing high-quality documentation for Metaflow, so it should be no surprise that it includes some features made specifically for Metaflow.

First, consider this basic Flow:

myflow.py
from metaflow import FlowSpec, step


class MyFlow(FlowSpec):
    @step
    def start(self):
        self.some_data = ["some", "data"]
        self.next(self.middle)

    @step
    def middle(self):
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == "__main__":
    MyFlow()

If you were to run this script from a notebook cell with the shell command

!python myflow.py run

you would normally get output that looks like this:

Metaflow 2.5.3 executing MyFlow for user:hamel
Validating your flow...
The graph looks good!
Running pylint...
Pylint is happy!
2022-03-24 11:06:46.769 Workflow starting (run-id 1648145206766542):
2022-03-24 11:06:46.776 [1648145206766542/start/1 (pid 81929)] Task is starting.
2022-03-24 11:06:47.549 [1648145206766542/start/1 (pid 81929)] Task finished successfully.
2022-03-24 11:06:47.557 [1648145206766542/middle/2 (pid 81932)] Task is starting.
2022-03-24 11:06:48.371 [1648145206766542/middle/2 (pid 81932)] Task finished successfully.
2022-03-24 11:06:48.379 [1648145206766542/end/3 (pid 81935)] Task is starting.
2022-03-24 11:06:49.133 [1648145206766542/end/3 (pid 81935)] Task finished successfully.
2022-03-24 11:06:49.134 Done!

However, nbdoc automatically detects this output and cleans it up by removing extraneous information such as the timestamps and the validation preamble. The result looks like this (see the rendered page):

python myflow.py run
 Workflow starting (run-id 1658245963346133):
[1658245963346133/start/1 (pid 2317)] Task is starting.
[1658245963346133/start/1 (pid 2317)] Task finished successfully.
[1658245963346133/middle/2 (pid 2321)] Task is starting.
[1658245963346133/middle/2 (pid 2321)] Task finished successfully.
[1658245963346133/end/3 (pid 2324)] Task is starting.
[1658245963346133/end/3 (pid 2324)] Task finished successfully.
Done!
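To make the cleanup concrete, here is a minimal sketch of this kind of filtering. It is not nbdoc's actual implementation: the helper name clean_flow_output and both regular expressions are assumptions based only on the example output above.

```python
import re

# Assumed shape of the timestamp prefix, e.g. "2022-03-24 11:06:46.769 ".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} ")

# Preamble lines seen in the raw example output that carry no run information.
PREAMBLE = (
    re.compile(r"^Metaflow \S+ executing "),
    re.compile(r"^Validating your flow"),
    re.compile(r"^The graph looks good!"),
    re.compile(r"^Running pylint"),
    re.compile(r"^Pylint is happy!"),
)


def clean_flow_output(raw: str) -> str:
    """Drop preamble lines and strip leading timestamps (hypothetical helper)."""
    cleaned = []
    for line in raw.splitlines():
        if any(pattern.match(line) for pattern in PREAMBLE):
            continue  # skip version, validation, and pylint chatter
        cleaned.append(TIMESTAMP.sub("", line))
    return "\n".join(cleaned)


raw_output = """Metaflow 2.5.3 executing MyFlow for user:hamel
Validating your flow...
The graph looks good!
Running pylint...
Pylint is happy!
2022-03-24 11:06:46.769 Workflow starting (run-id 1648145206766542):
2022-03-24 11:06:46.776 [1648145206766542/start/1 (pid 81929)] Task is starting.
2022-03-24 11:06:49.134 Done!"""

print(clean_flow_output(raw_output))
```

The sketch keeps every line that is not recognized as preamble, so unexpected output such as a traceback would still be shown.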

You can also show only certain steps from your Flow with the #meta:show_steps=<step1_name>,<step2_name> comment. The cell input looks like the following. In the rendered output, the comment is stripped out and only the "middle" step is shown:

#meta:show_steps=middle
!python myflow.py run --run-id-file run_id.txt
tip

If you want to interact with a Flow after it runs, we recommend using the --run-id-file <filename> flag, which writes the run ID to a file that you can read back later.

python myflow.py run --run-id-file run_id.txt
...
[1658245966396075/middle/2 (pid 2344)] Task is starting.
[1658245966396075/middle/2 (pid 2344)] Task finished successfully.
...
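The step filtering shown above can be approximated with a short sketch. This is an illustration and not nbdoc's own code: the function name show_steps and the TASK_LINE pattern are assumptions derived from the task-line format in the example output.

```python
import re

# Assumed task-line shape: "[<run-id>/<step>/<task-id> (pid N)] message".
TASK_LINE = re.compile(r"^\[\d+/(?P<step>[^/]+)/\d+ \(pid \d+\)\]")


def show_steps(output: str, steps: list[str]) -> str:
    """Keep task lines for the requested steps; collapse the rest to '...'."""
    kept = []
    for line in output.splitlines():
        match = TASK_LINE.match(line)
        if match and match.group("step") in steps:
            kept.append(line)
        elif not kept or kept[-1] != "...":
            # Replace each run of omitted lines with a single ellipsis.
            kept.append("...")
    return "\n".join(kept)


cleaned_output = """Workflow starting (run-id 1658245966396075):
[1658245966396075/start/1 (pid 2340)] Task is starting.
[1658245966396075/start/1 (pid 2340)] Task finished successfully.
[1658245966396075/middle/2 (pid 2344)] Task is starting.
[1658245966396075/middle/2 (pid 2344)] Task finished successfully.
[1658245966396075/end/3 (pid 2348)] Task is starting.
[1658245966396075/end/3 (pid 2348)] Task finished successfully.
Done!"""

print(show_steps(cleaned_output, ["middle"]))
```

The pid values for the start and end steps in the sample are made up for illustration; only the "middle" lines come from the output above.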

You can retrieve results from your flow like this:

run_id = !cat run_id.txt
from metaflow import Run

run = Run(f"MyFlow/{run_id[0]}")

run.data.some_data
['some', 'data']
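The run_id = !cat run_id.txt line relies on IPython's shell syntax, so it only works inside a notebook. In a plain Python script, an equivalent might look like the sketch below. The helper names run_pathspec and load_some_data are hypothetical and not part of Metaflow or nbdoc; only Run and run.data come from Metaflow.

```python
from pathlib import Path


def run_pathspec(flow_name: str, run_id_file: str = "run_id.txt") -> str:
    """Build the '<FlowName>/<run-id>' string from the file written by --run-id-file."""
    run_id = Path(run_id_file).read_text().strip()
    return f"{flow_name}/{run_id}"


def load_some_data(run_id_file: str = "run_id.txt"):
    """Fetch the some_data artifact of the recorded MyFlow run."""
    # Imported lazily so run_pathspec can be used without Metaflow installed.
    from metaflow import Run

    run = Run(run_pathspec("MyFlow", run_id_file))
    return run.data.some_data
```

Calling load_some_data() after the run above should return the same list that run.data.some_data shows in the notebook.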

It is often a good idea to test the code in your docs. To do this, add assert statements to your cells. They are checked automatically when we run the test suite.

assert run.data.some_data == ["some", "data"]
assert run.successful