v0.4 — moving forward


Today we are releasing fal v0.4.0, which introduces some backwards-incompatible changes. The focus of this release is to make our experimental features generally available, simplifying fal flow run.

Embracing experimentation (#463, #462, #470)

If you have been using fal flow run, you probably want Python scripts running in the middle of your dbt models. We introduced this some time ago with the fal flow --experimental-flow flag, followed by the recent Python Data models with fal flow --experimental-models. We have heard from many of our users that these two features are their default, so we are making them the default behavior from now on!

# Running model_e, a Python model without any flags!
❯ fal flow run --select model_e
File 'fal/staging/model_c.sql' was generated from 'staging/model_c.py'.
Please do not modify it directly. We recommend committing it to your repository.
File 'fal/model_e.sql' was generated from 'model_e.ipynb'.
Please do not modify it directly. We recommend committing it to your repository.

23:09:22  Found 6 models, 0 tests, 0 snapshots, 0 analyses, 166 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics

Running command: dbt run --threads 1 --project-dir .../008_pure_python_models --select model_e
23:09:26  Running with dbt=1.0.8
...

23:09:27 | Starting fal run for the following models and scripts:
(model: models/model_e.ipynb) # It works for Notebooks too!
Example run of fal flow run where Python Data model model_e.ipynb is run
This is what model_e.ipynb looks like. It’s part of our integration tests.

Type annotations in your fal scripts (#466, #469, #472)

Example of adding fal.typing import

By adding the line from fal.typing import * to your fal scripts, you will get correct typing (and no more errors in VS Code) for context, write_to_model and other fal native functions.
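As a rough illustration of the paragraph above, here is a minimal sketch of a fal script that starts with the typing import. The helper model_summary and the "context" in globals() guard are our own assumptions, added only so the file also runs outside fal. The fields context.current_model.name and context.current_model.status are used as we understand them from the fal documentation, so double-check them against your installed version.

```python
# Sketch of a fal script using the typing import (not an official example).
try:
    # Gives editors type information for fal's injected names
    # such as context and write_to_model.
    from fal.typing import *  # noqa: F401,F403
except ImportError:
    # Assumption: fal may not be installed when this file is run standalone.
    pass


def model_summary(name: str, status: str) -> str:
    """Hypothetical helper: build a one-line summary for a model run."""
    return f"{name}: {status}"


# Assumption: fal injects `context` into the script's globals at run time,
# so this branch is skipped when the file is executed outside fal.
if "context" in globals():
    model = context.current_model
    print(model_summary(model.name, model.status))
```

With the import in place, an editor such as VS Code should resolve context instead of flagging it as an undefined name.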

Dropping dbt v0.20 and v0.21 support (#464)

Starting Dec 3, 2021, dbt Labs dropped v0.* support. Based on our usage data, nearly all fal users (99%) are now on dbt v1.*. Since dbt-core introduced significant changes in its codebase for the v1.0 release, dropping v0.* support makes it much easier for our team to keep improving fal.

Hooks always run with their model (#386, #471)

We have had some requests to always run after-scripts or before-scripts of a model when it is run. This was not the default behavior, so we introduced post-hook and pre-hook. This is the new recommended way to add scripts around your dbt (or Python) models. If the model runs, the hooks will run too!

Change from after scripts to post-hooks
# Now the Slack message is sent whenever the customers model is run
❯ fal flow run --select customers
...
23:39:44  Found 11 models, 20 tests, 0 snapshots, 0 analyses, 166 macros, 0 operations, 3 seed files, 0 sources, 0 exposures, 0 metrics

Running command: dbt run --threads 1 --project-dir .../002_jaffle_shop --select customers
23:39:49  Running with dbt=1.0.8
...
23:39:50  1 of 1 START view model dbt_fal.fal_002_customers............................... [RUN]
23:39:50  1 of 1 OK created view model dbt_fal.fal_002_customers.......................... [CREATE VIEW in 0.13s]
23:39:50
...

23:39:50 | Starting fal run for following models and scripts:
(customers, send_slack_message.py)

Sending slack message
Example of running a single model with --select and a post-hook triggering with it
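For readers wondering what a post-hook script such as send_slack_message.py might contain, here is a minimal sketch. It is not the script from our example project. The SLACK_WEBHOOK_URL environment variable and the helper names build_payload and send_slack_message are hypothetical, and we assume the script is registered as a post-hook of the customers model in schema.yml. The payload shape follows Slack's incoming-webhook format (a JSON body with a text field).

```python
# Sketch of a post-hook script (hypothetical, not the original send_slack_message.py).
import json
import os
import urllib.request


def build_payload(model_name: str, status: str) -> dict:
    """Build a Slack incoming-webhook payload for a finished model."""
    return {"text": f"dbt model `{model_name}` finished with status: {status}"}


def send_slack_message(payload: dict, webhook_url: str) -> None:
    """POST the JSON payload to the webhook URL."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        response.read()


# Assumption: fal injects `context` into the script's globals when it runs
# the hook; outside fal this lookup returns None and nothing is sent.
fal_context = globals().get("context")
webhook_url = os.environ.get("SLACK_WEBHOOK_URL")  # hypothetical variable name

if fal_context is not None and webhook_url:
    print("Sending slack message")
    model = fal_context.current_model
    send_slack_message(build_payload(model.name, model.status), webhook_url)
```

Because the script is attached as a hook, it would run every time the customers model runs, with no extra selection flags.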

The after-scripts and before-scripts are now deprecated. We would like your feedback: would you miss them, or do hooks cover your needs? You can always open an issue, message us in the #tools-fal channel of the dbt Slack or on Discord, or even send us an email.

We are thinking of ways to add non-data-generating models, that is, Python scripts that are part of the DAG but do not create a table in the data warehouse. If you need something similar, leave us a comment describing your use case and how you would imagine it working.

New threading (#403, #406, #407, #408)

You may notice that dbt calls are now made one model at a time, with --threads 1. Don’t worry, we are not making things slower! Since Python scripts also run in parallel, we wanted to start working on the next node of your DAG as soon as possible. This should make fal flow run faster in general, although some use cases may not benefit.
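To make the idea concrete, here is a small sketch of node-by-node scheduling: every node is submitted as soon as all of its upstream nodes have finished, instead of waiting for a whole batch. This is an illustration built on Python's concurrent.futures, not fal's actual scheduler. The function run_dag, its parameters, and the node names are all hypothetical.

```python
# Illustrative DAG scheduler (an assumption-laden sketch, not fal's implementation).
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Callable, Dict, List, Set


def run_dag(
    deps: Dict[str, Set[str]],
    run_node: Callable[[str], None],
    max_workers: int = 4,
) -> List[str]:
    """Run each node once its upstream nodes are done; return completion order."""
    done: Set[str] = set()
    order: List[str] = []
    running = {}  # maps each in-flight future to its node name

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while len(done) < len(deps):
            # Submit every node that is not started yet and whose
            # dependencies have all completed.
            for node, upstream in deps.items():
                started = node in done or node in running.values()
                if not started and upstream <= done:
                    running[pool.submit(run_node, node)] = node
            if not running:
                raise ValueError("cycle or missing dependency in DAG")
            # Wake up as soon as any single node finishes.
            finished, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in finished:
                future.result()  # re-raise errors from the node, if any
                node = running.pop(future)
                done.add(node)
                order.append(node)
    return order


# Hypothetical DAG: two independent models, one model depending on both,
# and a script that runs after it.
example_deps = {
    "model_a": set(),
    "model_b": set(),
    "model_e": {"model_a", "model_b"},
    "notify.py": {"model_e"},
}
example_order = run_dag(example_deps, lambda node: None)
```

In this sketch, run_node stands in for whatever executes a node, for example a single-model dbt run call or a Python script. Launching nodes individually is what lets a downstream script start the moment its own parents finish.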

This great contribution comes from @isidentical, who recently joined us!

Star us on GitHub if you enjoyed this post. Reach out through Discord or #tools-fal in the dbt Slack.


Other fixes and improvements

Bug Fixes

  • Change typing file generating tool to generate fal.typing module (#469)
  • Handle partially-applied write_to_model before scripts in typing module (#472)
  • Ensure logbook doesn't crash on startup (#477)
  • Move away from multiprocessing pool on executions (#478)
  • Accept keyword arguments in not_allowed_function for hooks (#480)
  • Wrap BigQuery jobs in exception handler and fix tests (#479)

Documentation

  • Mention about post-hooks (#468)
  • Compatibility document (#467)
  • Change <>s to & (#473)

Features

  • Offer type annotations for exposed functions. (#466)
  • Make before/after scripts happen right before/after models (#460)
  • Implement pre-hooks (#471)

Miscellaneous Tasks

  • Remove warning about multiple select flags being passed (#465)
  • Remove experimental flags (#470)
  • Remove support for dbt 0.X (#464)

Refactor

  • Make fal run use new threading (#462)
  • Use experimental threads, flow and models mode by default (#463)

Testing

  • Add support for pre-hooks on the DAG verifier (#474)
  • Add support for before/after scripts to the DAG verifier. (#475)