Introducing fal Python Models

Screenshot of a dbt lineage graph with a .py node in the middle

fal enables you to run Python within your dbt project. Until now, fal users could associate their dbt models with Python scripts that run before or after a model, but those scripts did not generate new assets of their own. We recently added the ability to run these Python scripts in the middle of a dbt DAG. Today we are taking the next step and enabling Python-only models anywhere in your dbt project: Python Data Models.

To start using Python Data Models, just add a Python (or Jupyter Notebook) file to the models directory of your dbt project. In this file, you can use the familiar ref and source functions to pull data from your existing dbt models; these functions also let fal automatically generate the correct DAG dependencies. Every Python file that creates a data model must end with a write_to_model call, which writes the result back to the data warehouse.

from utils import make_forecast

# `ref` and `source` get picked up as dependencies automatically
df_count = ref('orders_daily')

# rename columns to the names the forecast function expects
df_count = df_count.rename(columns={
  "order_date": "ds",
  "order_count": "y"
})
df_forecast_count = make_forecast(df_count, periods=50)

write_to_model(df_forecast_count)
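The script above imports make_forecast from a local utils module whose implementation is not part of this post. As a rough, hypothetical stand-in (the real project uses a proper forecasting library), here is a minimal sketch that extends the daily series with a naive constant forecast:

```python
import pandas as pd

def make_forecast(df: pd.DataFrame, periods: int) -> pd.DataFrame:
    """Hypothetical sketch: append `periods` future days to the series,
    carrying forward the mean of the last 7 observed `y` values."""
    df = df.sort_values("ds").reset_index(drop=True)
    last_date = pd.to_datetime(df["ds"]).max()
    yhat = df["y"].tail(7).mean()
    future = pd.DataFrame({
        "ds": pd.date_range(last_date + pd.Timedelta(days=1), periods=periods),
        "y": yhat,  # naive: a single constant carried forward
    })
    return pd.concat([df, future], ignore_index=True)

# Tiny demo on synthetic daily counts
history = pd.DataFrame({
    "ds": pd.date_range("2022-01-01", periods=10),
    "y": range(10),
})
forecast = make_forecast(history, periods=5)
```

Any function that takes and returns a DataFrame works here; fal only cares about the DataFrame you eventually hand to write_to_model.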

Then run fal flow run with the --experimental-models flag:

❯ fal flow run --experimental-models --select orders_forecast+

File 'fal/order_detailed_cluster.sql' was generated from 'order_detailed_cluster.py'.
Please do not modify it directly. We recommend committing it to your repository.
File 'fal/orders_forecast.sql' was generated from 'orders_forecast.py'.
Please do not modify it directly. We recommend committing it to your repository.

20:40:14  Found 12 models, 20 tests, 0 snapshots, 0 analyses, 191 macros, 0 operations, 4 seed files, 0 sources, 0 exposures, 0 metrics

Executing command: dbt --log-format json run --project-dir ./jaffle_shop_with_fal --select orders_forecast
Running with dbt=1.1.0
Found 12 models, 20 tests, 0 snapshots, 0 analyses, 191 macros, 0 operations, 4 seed files, 0 sources, 0 exposures, 0 metrics
Concurrency: 10 threads (target='dev')
Finished running  in 2.08s.

16:40:22 | Starting fal run for following models and scripts:
(model: models/orders_forecast.py)
Concurrency: 10 threads
20:40:23  Unable to do partial parsing because config vars, config profile, or config target have changed

Executing command: dbt --log-format json run --project-dir ./jaffle_shop_with_fal --select orders_forecast_filter
Running with dbt=1.1.0
Unable to do partial parsing because config vars, config profile, or config target have changed
Found 12 models, 20 tests, 0 snapshots, 0 analyses, 191 macros, 0 operations, 4 seed files, 0 sources, 0 exposures, 0 metrics
Concurrency: 10 threads (target='dev')
1 of 1 START table model dbt_matteo.orders_forecast_filter ..................... [RUN]
1 of 1 OK created table model dbt_matteo.orders_forecast_filter ................ [CREATE TABLE (119.0 rows, 6.0 KB processed) in 2.90s]
Finished running 1 table model in 5.46s.
Completed successfully
Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
Running orders_forecast.py and the downstream SQL model orders_forecast_filter
Check out the complete example in fal-ai/jaffle_shop_with_fal

Under the hood, fal generates ephemeral dbt models as SQL files so that dbt recognizes the Python models and adds them to the DAG. This mainly lets Python Data Models appear in dbt docs and play well with other dbt commands.
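For illustration, the generated placeholder could look along these lines; the exact contents are a fal implementation detail and may differ:

```sql
-- fal/orders_forecast.sql (auto-generated by fal; do not edit by hand)
{{ config(materialized='ephemeral') }}
-- placeholder body so dbt can place the Python model in its DAG;
-- fal itself runs orders_forecast.py and writes the actual table
SELECT * FROM {{ this }}
```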

dbt docs page showing Python model being recognized and partial lineage graph
dbt docs page of a Python Data Model. Notice the models/fal directory

Python Data Models are now our recommended way to build Python transformations that need to write data to the data warehouse. If you have been using after scripts, here is our guide for moving to Python Data Models.

You can find out more about fal by reading the docs and our blog. fal is open source, so you can star us on GitHub. We also have a Discord server that everyone is welcome to join.