What’s new in fal 0.2.0
We’re constantly working on improving fal for our users by looking into issues and feature requests. If there is something in fal you would like to see improved, you can create a GitHub issue. Today we are releasing a new version of fal with a set of features that add flexibility to how you can run Python scripts on your dbt models. New features include:
- Model specification
- Script specification
- Access to dbt test status
- Running scripts before dbt runs
With fal you can specify which scripts should run after which models in a schema.yml file. Then, when you run fal run, fal automatically finds the models that were recently run and runs the Python scripts assigned to them. For this to work, fal needs access to the dbt run artifacts, which are altered with each dbt run. But what if you want more control over which models are processed? Or maybe you don't have access to dbt artifacts in a fal runtime and want to trigger a specific run manually. This is now possible with the selection syntax rolled out in the latest version of fal. It’s very similar to the dbt selection syntax.
Here's the models section of schema.yml in our example project:
models:
  - name: boston
    description: Ozone levels
    config:
      materialized: table
    meta:
      owner: "@meder"
      fal:
        scripts:
          - fal_scripts/slack.py
          - fal_scripts/send_datadog_event.py
  - name: zendesk_ticket_metrics
    description: Zendesk ticket metrics
    config:
      materialized: table
    meta:
      owner: "@gorkem"
      fal:
        scripts:
          - fal_scripts/forecast_slack.py
  - name: lombardia_covid
    description: Lombardia (IT) Covid19 Cases
    config:
      materialized: table
    meta:
      owner: "@omer"
      fal:
        scripts:
          - fal_scripts/anomaly_detection.py
As you can see, each model in this schema.yml has different fal scripts associated with it.
The --select flag lets users choose which models to run fal scripts on:
fal run --select boston
fal run -s boston
This example will run only the fal scripts associated with the boston model, namely slack.py and send_datadog_event.py. This means that the magic variables used in these scripts will refer to the boston model and its associated data. Specifying a model like this guarantees that its fal scripts run regardless of whether the model was part of the latest dbt run. You can also select multiple models:
fal run --select boston zendesk_ticket_metrics
The --models flag works exactly the same way:
fal run --models boston
fal run --models boston zendesk_ticket_metrics
As the name suggests, --exclude runs all model scripts except those of the models specified by this option:
fal run --exclude stg_zendesk_ticket_data
This works like a regular fal run, except that the scripts associated with stg_zendesk_ticket_data are ignored.
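The selection behaviour described above can be modelled in a few lines of Python. This is only an illustrative sketch under assumptions, not fal's actual implementation: the SCHEMA dictionary mirrors the example schema.yml, and the function name models_to_process and its parameters are hypothetical.

```python
# Hypothetical mirror of the example schema.yml: model name -> fal scripts.
SCHEMA = {
    "boston": ["fal_scripts/slack.py", "fal_scripts/send_datadog_event.py"],
    "zendesk_ticket_metrics": ["fal_scripts/forecast_slack.py"],
    "lombardia_covid": ["fal_scripts/anomaly_detection.py"],
}


def models_to_process(schema, ran_in_last_dbt_run, select=None, exclude=None):
    """Return the model names whose scripts would run, in schema order."""
    if select:
        # An explicit selection does not depend on the latest dbt run.
        wanted = set(select)
    else:
        # Default behaviour: only models found in the latest dbt run artifacts.
        wanted = set(ran_in_last_dbt_run)
    excluded = set(exclude or [])
    return [name for name in schema if name in wanted and name not in excluded]


# Roughly what `fal run --select boston` expresses: boston is processed
# even though the (simulated) latest dbt run contained no models.
print(models_to_process(SCHEMA, ran_in_last_dbt_run=[], select=["boston"]))

# Roughly what `fal run --exclude boston` expresses after a dbt run that
# built boston and lombardia_covid.
print(models_to_process(SCHEMA, ["boston", "lombardia_covid"], exclude=["boston"]))
```

The sketch keeps selection and exclusion as plain set membership; the real selection syntax is richer than exact model names.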
If you want to persist your custom selections, you can use the --selector option and specify the selection in a selectors.yml file:
fal run --selector my_selector
Find out more about writing your own selector here.
Similar to model specification, sometimes you might want to disregard the scripts defined in schema.yml and specify which script to run via a command line argument. This is now possible with the --scripts flag:
fal run --scripts fal_scripts/slack.py
This will run fal_scripts/slack.py on all the models that were calculated in the latest dbt run.
You can also specify multiple scripts:
fal run --scripts fal_scripts/slack.py fal_scripts/send_datadog_event.py
It is now possible to specify both models and scripts:
fal run --select boston --scripts fal_scripts/slack.py
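To make the interaction between model selection and script specification concrete, here is a small Python sketch. It is an assumption-laden illustration rather than fal's own code: plan_runs and its parameters are hypothetical names, and SCHEMA is a trimmed copy of the example schema.yml.

```python
# Hypothetical, trimmed mirror of the example schema.yml.
SCHEMA = {
    "boston": ["fal_scripts/slack.py", "fal_scripts/send_datadog_event.py"],
    "zendesk_ticket_metrics": ["fal_scripts/forecast_slack.py"],
}


def plan_runs(schema, models, scripts=None):
    """Return (model, script) pairs in execution order.

    When `scripts` is given, it replaces the scripts listed in the schema
    for every chosen model; otherwise the schema entries are used.
    """
    plan = []
    for model in models:
        chosen = scripts if scripts else schema.get(model, [])
        for script in chosen:
            plan.append((model, script))
    return plan


# Roughly what `fal run --select boston --scripts fal_scripts/slack.py` expresses.
for model, script in plan_runs(SCHEMA, ["boston"], scripts=["fal_scripts/slack.py"]):
    print(f"{script} runs with {model} as the current model")
```

Without the scripts argument, the same call falls back to both scripts that schema.yml assigns to boston.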
dbt test results
In earlier versions of fal, only the artifacts produced by a dbt run were taken into account. As of version 0.2.0, fal also works with dbt tests:
$ dbt test --select modela
$ fal run # runs the scripts that belong to modela
In a fal script, the context object now has access to test information related to the current model. If the previous dbt command ran tests (such as the dbt test command shown above), the context.current_model.tests property is populated with a list of tests:
context.current_model.tests #= [CurrentTest(name='not_null', modelname='boston', column='ds', status='Pass')]
So far, this only works with generic tests that can be associated with dbt models.
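A script might use this list to react to failing tests. The sketch below is hedged: the CurrentTest dataclass is a stand-in that only copies the fields visible in the example output, the 'Fail' status value and the second test entry are assumptions made for illustration (only 'Pass' appears above), and failing_tests is a hypothetical helper. Inside a real fal script the list would come from context.current_model.tests instead of being built by hand.

```python
from dataclasses import dataclass


@dataclass
class CurrentTest:
    """Stand-in with the fields shown in the example output above."""
    name: str
    modelname: str
    column: str
    status: str


def failing_tests(tests):
    """Return the tests whose status is anything other than 'Pass'."""
    return [test for test in tests if test.status != "Pass"]


# Hand-built sample data; the second entry and its 'Fail' status are assumed.
tests = [
    CurrentTest(name="not_null", modelname="boston", column="ds", status="Pass"),
    CurrentTest(name="unique", modelname="boston", column="ds", status="Fail"),
]

for test in failing_tests(tests):
    print(f"{test.modelname}.{test.column}: {test.name} -> {test.status}")
```

Filtering on "not Pass" rather than on a specific failure value keeps the helper independent of the exact status strings fal reports.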
The --before flag lets users run scripts before their dbt runs. Given the following schema.yml:
models:
  - name: boston
    description: Ozone levels
    config:
      materialized: table
    meta:
      owner: "@meder"
      fal:
        scripts:
          before:
            - fal_scripts/postgres.py
          after:
            - fal_scripts/slack.py
fal run --before will run the fal_scripts/postgres.py script regardless of whether dbt has calculated the boston model or not. A regular fal run will run fal_scripts/slack.py, but only if the boston model has already been calculated by dbt.
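The before/after rule just described can be summarised as a small decision function. This is a sketch of the stated behaviour only, not fal's implementation; scripts_for and its parameters are hypothetical names, and the boston dictionary copies the scripts section of the schema.yml above.

```python
def scripts_for(fal_scripts, before, model_was_run):
    """Pick the scripts for one model.

    fal_scripts:   dict with optional 'before' and 'after' lists.
    before:        True when the (simulated) --before flag is passed.
    model_was_run: True when dbt has already calculated the model.
    """
    if before:
        # 'before' scripts do not depend on dbt having run the model.
        return list(fal_scripts.get("before", []))
    if model_was_run:
        return list(fal_scripts.get("after", []))
    return []


boston = {
    "before": ["fal_scripts/postgres.py"],
    "after": ["fal_scripts/slack.py"],
}

print(scripts_for(boston, before=True, model_was_run=False))
print(scripts_for(boston, before=False, model_was_run=True))
print(scripts_for(boston, before=False, model_was_run=False))
```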
It is now possible to run fal with more control. Here’s what you can do now:
- Use the same flags as dbt and specify the models that should be selected or excluded
- Specify scripts
- Keep track of your dbt tests
- Run scripts before your dbt runs with --before
For examples of fal + dbt usage, check out the fal repo or join our Discord. And let us know if you have any ideas on how we could further improve fal.