Google Firestore is a cloud-hosted NoSQL database that can be directly accessed by mobile and web applications through native SDKs. It has a rich set of use cases, including keeping track of inventories, user sessions and state mutations. Firestore is fast and enable low-latency querying of large quantities of data. While dbt is very good at transforming data in a data warehouse, sometimes we need this data to be available in a low-latency manner. One way to achieve this is by loading warehouse data to Firestore in a reverse ETL.
In this post we look at how to send data from a dbt model straight to Firestore. Fal lets you run Python code on top of your dbt models, so we will use fal to send model data to Firestore in just two lines of Python.
If you want to learn more about fal, see here.
Here's a script that gets data from a dbt model and sends it straight to Firestore:
df = ref("zendesk_ticket_data") write_to_firestore(df=df, collection="zendesk_ticket_data", key_column="id")
write_to_firestore are provided by fal, there's no need to import them.
ref downloads a dbt model into memory as a pandas DataFrame. In this case, we are downloading a table for a model named
write_to_firestore takes the downloaded DataFrame, the name of a target Firestore collection, and the name of a key column in the DataFrame. A key column has to hold a unique identifier of every entry. For example, in case of a model containing a list of unique users, the key column would refer to user ids.
Save this script somewhere as
Connect script to a dbt model
Install fal if you haven't already:
pip install fal
We can go to a dbt project directory and find a relevant schema.yml file. We then choose a model that we want to send to Firestore. Finally, we add a fal meta tag and connect our Python script to the chosen model:
models: - name: zendesk_ticket_data description: Zendesk ticket data config: materialized: table meta: fal: scripts: - path/to/to_firestore.py
zendesk_ticket_data model in this example is now connected to
to_firestore.py and can be set to run every time
zendesk_ticket_data is re-calculated.
In the command line, run dbt:
Followed by fal:
If everything is setup correctly, Firestore should be updated with a new collection:
You can now send dbt model data to Firestore. What's next? Fal lets you run any Python script in your dbt project. Maybe you want send a Slack message or make writing to Firestore a part of your CI/CD pipeline. Check out the fal repository for more info, see an example dbt project that uses fal and join our Discord server.