Launching a Spark application using the Amazon Redshift integration for Apache Spark
For Amazon EMR releases 6.4 through 6.9, you must use the --jars or --packages
option to specify which of the following JAR files you want to use. The --jars
option specifies dependencies stored locally, in HDFS, or served over HTTP/S. To
see other file locations supported by the --jars option, see Advanced Dependency
Management. The --packages option specifies dependencies stored in the public
Maven repo.
- spark-redshift.jar
- spark-avro.jar
- RedshiftJDBC.jar
- minimal-json.jar
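
As a rough illustration of how the --jars option is used with the files listed above, the following Python sketch assembles a spark-submit command line. The directory /path/to/jars, the script name my_script.py, and the helper build_spark_submit are hypothetical placeholders, not values taken from this guide; substitute the actual locations of the JAR files on your cluster. The sketch relies only on the fact that --jars takes a single comma-separated list of paths.

```python
import shlex

# Hypothetical directory: replace with the real location of the JAR files.
JAR_DIR = "/path/to/jars"

# The four JAR files named in the list above.
REDSHIFT_JARS = [
    "spark-redshift.jar",
    "spark-avro.jar",
    "RedshiftJDBC.jar",
    "minimal-json.jar",
]


def build_spark_submit(app, jar_dir=JAR_DIR, jars=REDSHIFT_JARS):
    """Return a spark-submit argument list that passes the JARs via --jars."""
    # --jars expects one comma-separated value with no spaces between entries.
    jar_paths = ",".join(f"{jar_dir}/{name}" for name in jars)
    return ["spark-submit", "--jars", jar_paths, app]


# my_script.py is a placeholder application name.
command = build_spark_submit("my_script.py")

# Print a shell-quoted form of the command for copying into a terminal.
print(shlex.join(command))
```

To use --packages instead, the same pattern would apply with a comma-separated list of Maven coordinates (groupId:artifactId:version) in place of file paths; the exact coordinates depend on the connector versions you choose and are not shown here.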
Amazon EMR releases 6.10.0 and higher don't require the minimal-json.jar
dependency, and automatically install the other dependencies to each cluster by
default. The following examples show how to launch a Spark application with the
Amazon Redshift integration for Apache Spark.