Use the EMRFS S3-optimized committer
The EMRFS S3-optimized committer is an alternative OutputCommitter implementation that is optimized for writing files to
Amazon S3 when using EMRFS. The EMRFS S3-optimized committer improves application
performance by avoiding list and rename operations done in Amazon S3 during job and task
commit phases. The committer is available with Amazon EMR release 5.19.0 and later, and
is enabled by default with Amazon EMR 5.20.0 and later. The committer is used for Spark
jobs that use Spark SQL, DataFrames, or Datasets. Starting with Amazon EMR 6.4.0, this
committer can be used for all common formats, including Parquet, ORC, and text-based
formats (such as CSV and JSON). For releases earlier than Amazon EMR 6.4.0, only the
Parquet format is supported. There are circumstances under which the committer is
not used. For more information, see Requirements for the EMRFS
S3-optimized committer.
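
The release thresholds above can be summarized in a short Python sketch. It is illustrative only: the helper names (parse_release, committer_support, CommitterSupport) and the lowercase format labels are hypothetical and not part of any Amazon EMR or Spark API. It encodes only the version rules stated in this section, not the additional requirements that decide whether the committer is actually used for a given job.

```python
from dataclasses import dataclass

# Format labels are this sketch's own convention, not EMR identifiers.
PARQUET_ONLY = frozenset({"parquet"})
ALL_COMMON = frozenset({"parquet", "orc", "text", "csv", "json"})


@dataclass(frozen=True)
class CommitterSupport:
    available: bool           # release is 5.19.0 or later
    enabled_by_default: bool  # release is 5.20.0 or later
    formats: frozenset        # formats the committer can be used for


def parse_release(release):
    """Turn 'emr-6.4.0' or '6.4.0' into a comparable tuple such as (6, 4, 0)."""
    label = release.strip().lower()
    if label.startswith("emr-"):
        label = label[len("emr-"):]
    parts = [int(p) for p in label.split(".")[:3]]
    # Pad short labels such as '6.4' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def committer_support(release):
    """Apply the release thresholds described in this section."""
    version = parse_release(release)
    available = version >= (5, 19, 0)
    enabled_by_default = version >= (5, 20, 0)
    if not available:
        formats = frozenset()
    elif version >= (6, 4, 0):
        formats = ALL_COMMON
    else:
        formats = PARQUET_ONLY
    return CommitterSupport(available, enabled_by_default, formats)


if __name__ == "__main__":
    for label in ("emr-5.18.0", "emr-5.19.0", "emr-5.36.0", "emr-6.4.0"):
        print(label, committer_support(label))
```

A result with available set to True means only that the release ships the committer. Whether a particular write uses it still depends on the conditions listed in Requirements for the EMRFS S3-optimized committer.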