Going from dbt models to pandas DataFrames
One of our favorite features of dbt is the ability to build dependencies between data models using the
ref function. With
ref, you can build pipelines all in SQL, simply by referring to other tables in your
But what if you want to refer to these models in a Python script?
fal lets you use an analogous
ref function in Python to refer to dbt models and download them as pandas DataFrames. Effectively,
fal lets you mix in Python scripts into your dbt DAG and therefore enables you to do things that SQL alone cannot. You can now make forecasts on your models, or detect anomalies, or do sentiment analysis.
fal also handles the data warehouse connection and authentication out of the box, by tapping into the dbt authentication layer. No need to setup a dedicated Python client for BigQuery or Snowflake. You set up your connections once for dbt and that's it.
The latest version of fal is available on pip:
pip install fal
fal lets you connect Python scripts to dbt models in a
schema.yml file by using a
models: - name: my_model_name description: My model description config: materialized: table meta: fal: scripts: - my_python_script.py
scripts tag in
schema.yml expects a list of Python scripts with paths relative to the dbt project root directory.
Inside your Python script
The Python script connected to your dbt model has access to a set of magic functions and variables, such as
ref. It is the
ref function that lets us download an entire dbt model as a pandas DataFrame:
my_model_df = ref(context.current_model.name)
context is a fal magic variable that allows you to access the dbt run context. For a full list of what is available in the context, check out our docs.
If you would like to download a model to a notebook for ad-hoc analysis, you can also do that!
Run fal scripts
The fal Python scripts in a dbt project are run by the
fal run command from your terminal. But, not all of the scripts get run by default, only the ones associated with the models that were calculated in the most recent dbt run. Therefore, before running Python, run dbt:
and the fal Python scripts will run.
You can now download dbt models as pandas DataFrames. What's next? Maybe you want make a forecast or do a reverse ETL to Firestore? Check out the fal repository for more info, see our docs, see an example dbt project that uses fal and join our Discord server.