Launching a Spark application with the Amazon Redshift integration for Apache Spark - Amazon EMR

To use the integration with EMR Serverless 6.9.0, you must pass the required Spark-Redshift dependencies with your Spark job. To see other file locations that the --jars option supports, see the Advanced Dependency Management section of the Apache Spark documentation. Use the --jars option to include the following Redshift connector libraries:

  • spark-redshift.jar

  • spark-avro.jar

  • RedshiftJDBC.jar

  • minimal-json.jar

Amazon EMR releases 6.10.0 and higher don't require the minimal-json.jar dependency, and they install the other dependencies on each cluster by default. The following examples show how to launch a Spark application with the Amazon Redshift integration for Apache Spark.

Amazon EMR 6.10.0 +

On EMR Serverless release 6.10.0 and higher, launch a Spark job with the Amazon Redshift integration for Apache Spark as shown in the following example.

spark-submit my_script.py
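
For orientation, here is a minimal sketch of what my_script.py might contain. It is not taken from this page. The data source name and the option keys (url, dbtable, tempdir, aws_iam_role) are assumptions based on the community spark-redshift connector, so verify them against your release. The function names and all connection values are hypothetical placeholders.

```python
# Assumed data source name for the Redshift connector; verify for your release.
REDSHIFT_FORMAT = "io.github.spark_redshift_community.spark.redshift"


def redshift_read_options(jdbc_url, table, temp_dir, iam_role_arn):
    """Collect the connector options for reading one table (hypothetical helper)."""
    return {
        "url": jdbc_url,               # JDBC URL of the Redshift endpoint
        "dbtable": table,              # table to read
        "tempdir": temp_dir,           # S3 location the connector uses for staging
        "aws_iam_role": iam_role_arn,  # role Redshift assumes to access tempdir
    }


def read_table(spark, options):
    """Apply each option to a DataFrameReader and load the table."""
    reader = spark.read.format(REDSHIFT_FORMAT)
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader.load()


# Inside my_script.py, with an existing SparkSession named `spark`:
#   opts = redshift_read_options(
#       "jdbc:redshift://example-host:5439/dev",        # placeholder
#       "public.example_table",                         # placeholder
#       "s3://example-bucket/tmp/",                     # placeholder
#       "arn:aws:iam::123456789012:role/example-role",  # placeholder
#   )
#   read_table(spark, opts).show(5)
```

Keeping the options in a plain dictionary makes them easy to inspect or log before the job touches the cluster.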

Amazon EMR 6.9.0

To launch a Spark job with the Amazon Redshift integration for Apache Spark on EMR Serverless release 6.9.0, use the --jars option as shown in the following example. The paths listed with the --jars option are the default paths for the JAR files.

spark-submit \
  --jars /usr/share/aws/redshift/jdbc/RedshiftJDBC.jar,/usr/share/aws/redshift/spark-redshift/lib/spark-redshift.jar,/usr/share/aws/redshift/spark-redshift/lib/spark-avro.jar,/usr/share/aws/redshift/spark-redshift/lib/minimal-json.jar \
  my_script.py
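
The value passed to --jars is a single comma-separated list. A space after a comma makes the shell split the list into separate arguments. The following sketch shows one way to assemble the command programmatically and avoid that mistake. The helper name build_spark_submit is hypothetical, and the paths are the default paths from the example above.

```python
import shlex

# Default JAR locations, copied from the example above.
REDSHIFT_JARS = [
    "/usr/share/aws/redshift/jdbc/RedshiftJDBC.jar",
    "/usr/share/aws/redshift/spark-redshift/lib/spark-redshift.jar",
    "/usr/share/aws/redshift/spark-redshift/lib/spark-avro.jar",
    "/usr/share/aws/redshift/spark-redshift/lib/minimal-json.jar",
]


def build_spark_submit(script, jars=()):
    """Return the spark-submit argument list for a script and optional JARs."""
    argv = ["spark-submit"]
    if jars:
        # Join with bare commas; strip stray whitespace around each path.
        argv += ["--jars", ",".join(path.strip() for path in jars)]
    argv.append(script)
    return argv


# 6.9.0 form: dependencies passed explicitly.
command_690 = build_spark_submit("my_script.py", REDSHIFT_JARS)
# 6.10.0 and higher form: no --jars option.
command_6100 = build_spark_submit("my_script.py")

# shlex.join quotes the arguments so the line can be pasted into a shell.
print(shlex.join(command_690))
print(shlex.join(command_6100))
```

Building the argument list first, and only joining it into a string for display, keeps the comma-separated value intact whether the command is printed or passed to subprocess.run.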