
Use Spark on Amazon Redshift with a connector

With Amazon EMR release versions 6.4.0 and later, every Amazon EMR cluster created with Apache Spark includes a connector between Spark and Amazon Redshift. This connector is based on the spark-redshift open-source connector and allows you to use Spark on Amazon EMR to process data stored in Amazon Redshift.

Starting in Amazon EMR release version 6.6.0, you must use the --jars or --packages option to specify which of the following JAR files you want to use. The --jars option specifies dependencies stored locally, in HDFS, or over HTTP/S. To see other file locations supported by the --jars option, see Advanced Dependency Management in the Spark documentation. The --packages option specifies dependencies stored in the public Maven repository.

  • spark-redshift.jar

  • spark-avro.jar

  • RedshiftJDBC.jar

  • minimal-json.jar

These JAR files are already installed on each cluster by default in Amazon EMR release versions 6.4.0 and higher, and in versions 6.4.0 and 6.5.0 you don't need to specify them. The following example shows how to launch a Spark application with the spark-redshift connector on versions 6.4.0 and 6.5.0.

spark-submit my_script.py

To launch a Spark application with the spark-redshift connector on Amazon EMR release version 6.6.0 or higher, you must use the --jars or --packages option, as the following example shows. Note that the paths listed with the --jars option are the default paths for the JAR files.

spark-submit \
  --jars /usr/share/aws/redshift/jdbc/RedshiftJDBC.jar,/usr/share/aws/redshift/spark-redshift/lib/spark-redshift.jar,/usr/share/aws/redshift/spark-redshift/lib/spark-avro.jar,/usr/share/aws/redshift/spark-redshift/lib/minimal-json.jar \
  my_script.py
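
If you prefer to pull the connector from Maven instead of referencing the local JAR files, you can use the --packages option. The following is a sketch only; the Maven coordinates shown (io.github.spark-redshift-community:spark-redshift_2.12) and the version placeholder are assumptions, so check the spark-redshift repository for the coordinates and version that match your Amazon EMR release.

spark-submit \
  --packages io.github.spark-redshift-community:spark-redshift_2.12:<version> \
  my_script.py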

To get started with this connector and learn about the supported parameters, see the README file in the spark-redshift GitHub repository. The repository also includes a tutorial for users who are new to Amazon Redshift.

Amazon EMR always reviews open-source code when importing it into your cluster. Due to security concerns, we don't support the following authentication methods from Spark to Amazon S3:

  • Setting Amazon access keys in the hadoop-env configuration classification

  • Encoding Amazon access keys in the tempdir URI

Considerations and limitations

  • The parameter tempformat currently doesn't support the Parquet format.

  • The tempdir URI points to an Amazon S3 location. This temporary directory isn't cleaned up automatically, so it can add storage costs. We recommend using Amazon S3 lifecycle policies to define the retention rules for the Amazon S3 bucket (see the sketch after this list).

  • We recommend using Amazon S3 server-side encryption to encrypt the Amazon S3 buckets used.

  • We recommend blocking public access to Amazon S3 buckets.

  • We recommend that the Amazon Redshift cluster not be publicly accessible.

  • We recommend enabling Amazon Redshift audit logging.

  • We recommend enabling Amazon Redshift at-rest encryption.

  • We recommend enabling SSL for the JDBC connection from Spark on Amazon EMR to Amazon Redshift.

  • We recommend passing an IAM role with the aws_iam_role parameter for Amazon Redshift authentication (see the example after this list).

  • We recommend managing Amazon Redshift credentials (the username and password for the Amazon Redshift cluster) in Amazon Secrets Manager as a best practice. The following code sample shows how you can use Amazon Secrets Manager to retrieve credentials and connect to an Amazon Redshift cluster with PySpark. The sample assumes that the secret value is a JSON string with username and password keys; replace the placeholder secret ID, host, database, table name, and Amazon S3 path with your own values:

    import json

    import boto3
    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext.getOrCreate()  # get or reuse the existing SparkContext
    sql_context = SQLContext(sc)

    # Retrieve the Amazon Redshift credentials stored in Amazon Secrets Manager.
    # Replace the placeholder 'string' values with your secret ID and, optionally,
    # a specific version ID or version stage.
    secretsmanager_client = boto3.client('secretsmanager')
    secret_manager_response = secretsmanager_client.get_secret_value(
        SecretId='string',
        VersionId='string',
        VersionStage='string'
    )

    # Assumes the secret value is a JSON string with "username" and "password" keys.
    secret = json.loads(secret_manager_response['SecretString'])
    username = secret['username']
    password = secret['password']

    url = "jdbc:redshift://redshifthost:5439/database?user=" + username + "&password=" + password

    # Read data from a table
    df = sql_context.read \
        .format("io.github.spark_redshift_community.spark.redshift") \
        .option("url", url) \
        .option("dbtable", "my_table") \
        .option("tempdir", "s3://path/for/temp/data") \
        .load()
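
Building on the read example above, the following sketch shows a write back to Amazon Redshift that applies several of the recommendations in this list: SSL is enabled on the JDBC connection, Amazon S3 authentication uses the aws_iam_role parameter, and tempformat is set to CSV (Parquet is not supported). It reuses the username, password, and df variables from the previous sample; the host, database, table name, IAM role ARN, and Amazon S3 path are placeholders.

    # Enable SSL on the JDBC connection by adding ssl=true to the URL.
    url = "jdbc:redshift://redshifthost:5439/database?ssl=true&user=" + username + "&password=" + password

    # Write the DataFrame to Amazon Redshift, authenticating to Amazon S3 with an IAM role.
    df.write \
        .format("io.github.spark_redshift_community.spark.redshift") \
        .option("url", url) \
        .option("dbtable", "my_table_copy") \
        .option("tempdir", "s3://path/for/temp/data") \
        .option("tempformat", "CSV") \
        .option("aws_iam_role", "arn:aws:iam::123456789012:role/my-redshift-role") \
        .mode("error") \
        .save()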
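
As noted in the tempdir consideration above, the connector doesn't clean up the intermediate data it writes to Amazon S3. The following is a minimal sketch of a lifecycle rule that expires objects under the temp prefix; the bucket name, prefix, and seven-day retention period are placeholder assumptions to adjust for your own retention requirements.

    import boto3

    s3_client = boto3.client('s3')

    # Expire objects under the connector's temp prefix after 7 days (placeholder values).
    s3_client.put_bucket_lifecycle_configuration(
        Bucket='my-bucket',
        LifecycleConfiguration={
            'Rules': [
                {
                    'ID': 'expire-spark-redshift-tempdir',
                    'Filter': {'Prefix': 'path/for/temp/data/'},
                    'Status': 'Enabled',
                    'Expiration': {'Days': 7}
                }
            ]
        }
    )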