Supporting Athena and DuckDB

We’re constantly working to make it as easy as possible to integrate fal into our users’ projects. So far we have supported PostgreSQL, Google BigQuery, Snowflake and Amazon Redshift. Today we’re adding two more: Amazon Athena and DuckDB.

Amazon Athena

Amazon Athena is a query service that lets users analyze data stored in their S3 buckets. Athena supports standard SQL and has a community-supported dbt adapter, so you can turn data stored in S3 into dbt models. After some contributions to the dbt-athena adapter, we now support Athena in fal as well.

Given a profiles.yml like this:

fal_test:
  target: dev
  outputs:
    dev:
      type: athena
      s3_staging_dir: s3://my_bucket/my_staging_dir
      region_name: my_region
      schema: my_schema
      database: my_athena_catalog
      num_retries: 3

you can use all the magic functions, including ref and write_to_model, to interact with your Athena database from a Python context.

DuckDB

DuckDB is an in-process database that lets you query Parquet and CSV files using SQL. It is designed to support analytical queries without setting up a database instance, and it has been gaining popularity. The newest release of fal supports DuckDB out of the box. To use it, you must already have the dbt-duckdb adapter installed and have a profiles.yml file that looks like this:

fal_test:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: path_to_my_duckdb

fal will then run scripts against DuckDB and make all the magic functions and variables available, with the exception of write_to_source.