Launching a Spark application using the Amazon Redshift integration for Apache Spark
For Amazon EMR releases 6.4 through 6.9, you must use the --jars
or --packages option to specify which of the following JAR
files you want to use. The --jars option specifies dependencies
stored locally, in HDFS, or using HTTP/S. To see other file locations supported by
the --jars option, see Advanced Dependency Management in the Apache Spark
documentation. The --packages option specifies dependencies stored in the
public Maven repo.
- spark-redshift.jar
- spark-avro.jar
- RedshiftJDBC.jar
- minimal-json.jar
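
As a rough sketch of how the --jars value might be assembled for Amazon EMR releases 6.4 through 6.9, the following Python snippet joins the four JAR files into the comma-separated list that spark-submit accepts. The directory /path/to/jars, the application name my_app.py, and the helper build_spark_submit are hypothetical placeholders for illustration, not paths or names defined by Amazon EMR.

```python
import shlex

# The four JAR files named in the list above.
JAR_NAMES = [
    "spark-redshift.jar",
    "spark-avro.jar",
    "RedshiftJDBC.jar",
    "minimal-json.jar",
]


def build_spark_submit(jar_dir, app_path):
    """Return a spark-submit argument list that passes the JARs through --jars.

    jar_dir and app_path are assumptions supplied by the caller; substitute
    the real location of the JAR files on your cluster.
    """
    base = jar_dir.rstrip("/")
    # --jars takes a single comma-separated list of JAR locations.
    jars = ",".join(f"{base}/{name}" for name in JAR_NAMES)
    return ["spark-submit", "--jars", jars, app_path]


if __name__ == "__main__":
    # Hypothetical directory and application file.
    cmd = build_spark_submit("/path/to/jars", "my_app.py")
    print(shlex.join(cmd))
```

Building the command as an argument list keeps each JAR path intact if the list is later handed to a process launcher, and the printed form can be copied into a shell on the cluster.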
Amazon EMR releases 6.10.0 and higher don't require the minimal-json.jar
dependency and automatically install the other dependencies on each cluster by
default. The following examples show how to launch a Spark application with the
Amazon Redshift integration for Apache Spark.