Trusted Identity Propagation with Amazon Glue ETL
With IAM Identity Center, you can connect to identity providers (IdPs) and centrally manage access for users and groups across Amazon analytics services. You can integrate identity providers such as Okta, Ping, and Microsoft Entra ID (formerly Azure Active Directory) with IAM Identity Center so that users in your organization can access data through a single sign-on experience. IAM Identity Center also supports connecting additional third-party identity providers.
With Amazon Glue 5.0 and higher, you can propagate user identities from IAM Identity Center to Amazon Glue interactive sessions. Amazon Glue interactive sessions then propagate the supplied identity to downstream services such as Amazon S3 Access Grants, Amazon Lake Formation, and Amazon Redshift, so that these services authorize data access based on the user's identity.
Overview
Identity Center
Trusted identity propagation
Features and benefits
The Amazon Glue interactive sessions integration with IAM Identity Center trusted identity propagation provides the following:
The ability to enforce table-level authorization and fine-grained access control with Identity Center identities on Lake Formation-managed Amazon Glue Data Catalog tables.
The ability to enforce authorization with Identity Center identities on Amazon Redshift clusters.
End-to-end tracking of user actions for auditing.
The ability to enforce Amazon S3 prefix-level authorization with Identity Center identities on Amazon S3 Access Grants-managed Amazon S3 prefixes.
Use cases
Interactive Data Exploration and Analysis
Data engineers use their corporate identities to seamlessly access and analyze data across multiple Amazon accounts. Through SageMaker Studio, they launch interactive Spark sessions via Amazon Glue ETL, connecting to various data sources including Amazon S3 and the Amazon Glue Data Catalog. As engineers explore datasets, Spark enforces fine-grained access controls defined in Lake Formation based on their identities, ensuring they can only view authorized data. All queries and data transformations are logged with the user's identity, creating a clear audit trail. This streamlined approach enables rapid prototyping of new analytics products while maintaining strict data governance across client environments.
Data Preparation and Feature Engineering
Data scientists from multiple research teams collaborate on complex projects using a unified data platform. They log in to SageMaker Studio with their corporate credentials and immediately access a shared data lake that spans multiple Amazon accounts. As they begin feature engineering for new machine learning models, Spark sessions launched through Amazon Glue ETL enforce Lake Formation's column-level and row-level security policies based on their propagated identities. Scientists can efficiently prepare data and engineer features using familiar tools, while compliance teams have assurance that every data interaction is automatically tracked and audited. This secure, collaborative environment accelerates research pipelines while maintaining the strict data protection standards required in regulated industries.
How it works

A user logs into client-facing applications (SageMaker AI, or custom applications) using their corporate identity through IAM Identity Center. This identity is then propagated through the entire data access pipeline.
The authenticated user launches Amazon Glue interactive sessions, which serve as the compute engine for data processing. These sessions maintain the user's identity context throughout the workflow.
Amazon Lake Formation and the Amazon Glue Data Catalog work together to enforce fine-grained access controls. Lake Formation applies security policies based on the user's propagated identity, while Amazon S3 Access Grants provides an additional permission layer, ensuring users can only access data they're authorized to view.
Finally, the system connects to Amazon S3, where the actual data resides. All access is governed by the combined security policies, maintaining data governance while enabling interactive data exploration and analysis. This architecture enables secure, identity-based data access across multiple Amazon services while maintaining a seamless user experience for data scientists and engineers working with large datasets.
Integrations
Amazon managed development environment
The following Amazon managed client-facing applications support trusted identity propagation with Amazon Glue interactive sessions:
SageMaker Unified Studio
To use trusted identity propagation with SageMaker Unified Studio:
Set up a SageMaker Unified Studio project with trusted identity propagation enabled as the client-facing development environment.
Set up Lake Formation to enable fine-grained access control for Amazon Glue tables based on the user or group in IAM Identity Center.
Set up Amazon S3 Access Grants to enable temporary access to the underlying data locations in Amazon S3.
Open the SageMaker Unified Studio JupyterLab IDE space and select Amazon Glue as the compute for notebook execution.
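The Lake Formation step above can be sketched as follows. This is a minimal illustration, not a complete setup procedure. It builds the arguments for the Lake Formation GrantPermissions API so that an IAM Identity Center user or group receives SELECT on a Data Catalog table. The helper names, the example IDs, and the `partition` parameter are hypothetical. The `identitystore` ARN format for the principal reflects our understanding of how Lake Formation identifies Identity Center users and groups, so verify it against the Lake Formation documentation for your partition before relying on it.

```python
from typing import Optional, Sequence


def identity_center_principal(kind: str, principal_id: str, partition: str = "aws") -> str:
    """Build the principal identifier for an Identity Center user or group.

    Assumption: Lake Formation accepts identitystore ARNs of this shape.
    """
    if kind not in ("user", "group"):
        raise ValueError("kind must be 'user' or 'group'")
    return f"arn:{partition}:identitystore:::{kind}/{principal_id}"


def build_select_grant(
    principal_arn: str,
    database: str,
    table: str,
    columns: Optional[Sequence[str]] = None,
) -> dict:
    """Build keyword arguments for the Lake Formation GrantPermissions call.

    When `columns` is given, the grant is limited to those columns
    (column-level access control); otherwise it covers the whole table.
    """
    if columns:
        resource = {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        }
    else:
        resource = {"Table": {"DatabaseName": database, "Name": table}}
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": resource,
        "Permissions": ["SELECT"],
    }


# Hypothetical group ID, database, table, and column names for illustration only.
group_arn = identity_center_principal("group", "EXAMPLE-GROUP-ID")
request = build_select_grant(group_arn, "sales_db", "orders", columns=["order_id", "amount"])
```

With credentials for a Lake Formation administrator, the resulting dictionary could be passed to the SDK, for example `boto3.client("lakeformation").grant_permissions(**request)`. Building the request separately from the call keeps the grant easy to review before it is applied.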
Customer managed self-hosted Notebook environment
To enable trusted identity propagation for users of custom-developed applications, see Access Amazon services programmatically using trusted identity propagation.
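As a rough sketch of what a custom application does in this flow, the following example assumes the application has already exchanged the user's IdP token for an IAM Identity Center ID token (for example, through the OIDC CreateTokenWithIAM API). It reads the `sts:identity_context` claim from that token and builds the arguments for an STS AssumeRole call that carries the identity context. The function names are hypothetical, the token signature is not verified here, and the default context provider ARN is the one documented for the standard partition, so other partitions may need a different value.

```python
import base64
import json

# Assumption: context provider ARN for the standard partition; override if yours differs.
DEFAULT_CONTEXT_PROVIDER_ARN = "arn:aws:iam::aws:contextProvider/IdentityCenter"


def extract_identity_context(id_token: str) -> str:
    """Return the sts:identity_context claim from a JWT ID token.

    This only decodes the payload segment; it does not verify the signature.
    """
    parts = id_token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url without padding; restore padding before decoding.
    payload += "=" * (-len(payload) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if "sts:identity_context" not in claims:
        raise ValueError("ID token has no sts:identity_context claim")
    return claims["sts:identity_context"]


def build_assume_role_request(
    role_arn: str,
    session_name: str,
    identity_context: str,
    provider_arn: str = DEFAULT_CONTEXT_PROVIDER_ARN,
) -> dict:
    """Build keyword arguments for STS AssumeRole with the user's identity context."""
    return {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "ProvidedContexts": [
            {"ProviderArn": provider_arn, "ContextAssertion": identity_context}
        ],
    }
```

The resulting dictionary could then be passed to the SDK, for example `boto3.client("sts").assume_role(**request)`, and the returned identity-enhanced credentials used by the client that starts the Amazon Glue interactive session. The exact session setup depends on your environment and is not shown here.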