Using dbt-core with EMR Serverless
With Amazon EMR release emr-7.13.0 and later, you can use dbt-core with
Spark Connect-enabled interactive sessions.
To use dbt-core with EMR Serverless
- Install the dbt-spark adapter with session support.

      pip install dbt-spark[session]

- In your dbt profile (profiles.yml), set the host config to NA. This field is required but ignored when SPARK_REMOTE is set.

      emrs_spark_sample:
        target: dev
        outputs:
          dev:
            type: spark
            method: session
            schema: sample_schema
            host: NA

- Start an interactive session and set SPARK_REMOTE to the session endpoint URL before running dbt. For more information about how to get the session endpoint URL, see Run interactive sessions with Amazon EMR Serverless through Spark Connect.

      import os
      os.environ['SPARK_REMOTE'] = spark_remote_url

- Run dbt commands against the interactive session on the EMR Serverless application.

      dbt run --select my_dbt_model
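
The steps above can also be scripted. The following is a minimal sketch, not an official AWS or dbt utility: it writes the sample profile, sets SPARK_REMOTE for a child process only, and invokes the dbt CLI. The helper names (write_profile, build_dbt_env, build_dbt_command, run_dbt) are hypothetical, and the use of dbt's --profiles-dir flag to point at the generated file is an assumption about how you want to lay out your project rather than something this procedure requires.

```python
import os
import subprocess
from pathlib import Path

# Profile content taken from the procedure above. "host" is required by the
# adapter but ignored when SPARK_REMOTE is set.
PROFILE_YAML = """\
emrs_spark_sample:
  target: dev
  outputs:
    dev:
      type: spark
      method: session
      schema: sample_schema
      host: NA
"""


def write_profile(profiles_dir: Path) -> Path:
    """Write profiles.yml into profiles_dir and return its path."""
    profiles_dir.mkdir(parents=True, exist_ok=True)
    profile_path = profiles_dir / "profiles.yml"
    profile_path.write_text(PROFILE_YAML, encoding="utf-8")
    return profile_path


def build_dbt_env(spark_remote_url: str) -> dict:
    """Return a copy of the current environment with SPARK_REMOTE set."""
    if not spark_remote_url:
        raise ValueError("spark_remote_url must be a non-empty session endpoint URL")
    env = dict(os.environ)
    env["SPARK_REMOTE"] = spark_remote_url
    return env


def build_dbt_command(model: str, profiles_dir: Path) -> list:
    """Build the argument list for `dbt run --select <model>`."""
    return ["dbt", "run", "--select", model, "--profiles-dir", str(profiles_dir)]


def run_dbt(model: str, spark_remote_url: str, profiles_dir: Path) -> int:
    """Write the profile, run dbt in a child process, and return its exit code."""
    write_profile(profiles_dir)
    result = subprocess.run(
        build_dbt_command(model, profiles_dir),
        env=build_dbt_env(spark_remote_url),
        check=False,  # let the caller decide how to handle a non-zero exit code
    )
    return result.returncode
```

With dbt-spark[session] installed and an interactive session running, a call such as run_dbt("my_dbt_model", spark_remote_url, Path("./profiles")) would correspond to the last step of the procedure. Passing the variable through env keeps SPARK_REMOTE out of the parent process; setting os.environ directly, as in the procedure, works equally well when dbt runs in the same process or shell.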