Amazon EMR 7.7.0 - Hadoop release notes
Amazon EMR 7.7.0 - Hadoop changes
Type | Description |
---|---|
New Feature | Optimize S3A GlobStatus Call With S3 Prefix Listing |
Backport | YARN-7327 |
Backport | YARN-10058 |
Backport | YARN-11732 |
Backport | YARN-11560 |
Backport | YARN-11191 |
Backport | YARN-11041 |
Backport | YARN-11660 |
Backport | HADOOP-19116 |
Backport | HADOOP-19115 |
Backport | HADOOP-19024 |
Backport | HADOOP-19123 |
Backport | HADOOP-19114 |
Backport | HADOOP-19237 |
New Feature | Add S3 request auditing to S3A |
Backport | HADOOP-17609 |
Backport | HADOOP-18583 |
New Feature | Add support for S3A Role Mappings |
Amazon EMR 7.7.0 - Hadoop features
- Asynchronous container scheduling is now the default scheduling strategy for the capacity scheduler. It is designed to speed up container allocation.
- The S3A filesystem introduces an optimization for glob status calls that uses S3 prefix listing to accelerate list operations. This feature is disabled by default and can be enabled by setting fs.s3a.prefix.listing.in.glob.status.enabled=true in the core-site.xml file. When enabled, the optimization allows server-side filtering for globstatus calls such as fs.globstatus("s3://bucket/a*"), improving list performance by listing only the objects that start with "a".
- S3 request auditing is added to S3A. When enabled, information from the fileSystemOwner object is used to populate the userAgent string with the user and user group making the S3 requests.
- S3A adds support for role mappings, which help determine which IAM role to use based on users, groups, or S3 prefixes.
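
The idea behind the glob status optimization can be illustrated with a minimal Python sketch. It is not the S3A implementation: the function names `literal_prefix` and `glob_with_prefix_listing` are hypothetical, an in-memory list of keys stands in for the bucket, and only the `*`, `?`, and `[...]` wildcards are handled. As a further simplification, `fnmatch` lets `*` match `/`, which a Hadoop glob does not.

```python
import fnmatch

# Wildcard characters handled by this sketch (an assumption; Hadoop globs
# support more syntax, such as {a,b} alternation).
GLOB_CHARS = "*?["


def literal_prefix(pattern: str) -> str:
    """Return the part of the pattern that precedes the first wildcard."""
    for i, ch in enumerate(pattern):
        if ch in GLOB_CHARS:
            return pattern[:i]
    return pattern


def glob_with_prefix_listing(keys, pattern):
    """Narrow the listing by literal prefix, then apply the full glob.

    The first step stands in for a server-side prefix listing; the second
    step is the client-side pattern match on the reduced set of keys.
    """
    prefix = literal_prefix(pattern)
    listed = [k for k in keys if k.startswith(prefix)]
    matched = [k for k in listed if fnmatch.fnmatchcase(k, pattern)]
    return listed, matched


# Hypothetical object keys in a bucket.
keys = ["apple.csv", "avocado.json", "banana.csv"]
listed, matched = glob_with_prefix_listing(keys, "a*.csv")
print(listed)   # keys returned by the prefix listing for "a"
print(matched)  # keys that also match the full pattern
```

With the pattern `a*.csv`, only keys that start with `a` are listed at all, so `banana.csv` is never returned by the listing step; the full pattern is then matched against the smaller result.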