Amazon EMR offers features that help optimize performance when you use Spark to query, read, and
write data stored in Amazon S3.
S3 Select can
improve query performance for CSV and JSON files in some applications by "pushing down"
processing to Amazon S3.
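As a rough illustration of how a Spark job might opt in to S3 Select, the sketch below picks the data source name and read options for a CSV or JSON input. The format names `s3selectCSV` and `s3selectJSON` and the `header` and `delimiter` options are taken from the Amazon EMR S3 Select documentation as best understood here and should be checked against the EMR release in use. The helper `s3_select_source` and the bucket path in the comment are hypothetical names introduced only for this example.

```python
def s3_select_source(file_type, header=True, delimiter=","):
    """Return (format_name, options) for reading a file through S3 Select.

    Hypothetical helper: it only assembles strings and does not contact S3.
    """
    kind = file_type.lower()
    if kind == "csv":
        options = {
            "header": "true" if header else "false",
            "delimiter": delimiter,
        }
        return "s3selectCSV", options
    if kind == "json":
        return "s3selectJSON", {}
    raise ValueError(f"S3 Select sketch supports only csv and json, got {file_type!r}")


# On an EMR cluster with an existing SparkSession `spark` and an explicit
# `schema`, the result could be used like this (path is a placeholder):
#
#   fmt, opts = s3_select_source("csv")
#   df = spark.read.format(fmt).schema(schema).options(**opts).load(
#       "s3://amzn-s3-demo-bucket/path/")
```

Whether pushdown actually helps depends on the application, so it is worth comparing the same query with and without the S3 Select data source.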
The EMRFS S3-optimized committer is an alternative OutputCommitter implementation. It uses the multipart uploads feature of EMRFS to
improve performance when writing Parquet files to Amazon S3 using Spark SQL, DataFrames, and
Datasets.
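A minimal sketch of how the committer might be switched on or off for a job follows. It assumes the Spark property `spark.sql.parquet.fs.optimized.committer.optimization-enabled`, which is the name used in the Amazon EMR documentation as best understood here; whether it is enabled by default depends on the EMR release. The helpers `committer_conf` and `to_spark_submit_args` are hypothetical and exist only for illustration.

```python
# Property name assumed from the Amazon EMR documentation; verify it for
# the EMR release in use.
COMMITTER_FLAG = "spark.sql.parquet.fs.optimized.committer.optimization-enabled"


def committer_conf(enabled=True):
    """Return a Spark configuration dict that toggles the committer."""
    return {COMMITTER_FLAG: "true" if enabled else "false"}


def to_spark_submit_args(conf):
    """Flatten a configuration dict into spark-submit --conf arguments."""
    args = []
    for key in sorted(conf):
        args.extend(["--conf", f"{key}={conf[key]}"])
    return args


if __name__ == "__main__":
    # Prints the arguments that could be appended to a spark-submit command.
    print(" ".join(to_spark_submit_args(committer_conf(enabled=True))))
```

The same key and value could instead be passed to `SparkSession.builder.config(key, value)` when the session is created inside the application.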