Defining crawlers in Amazon Glue
You can use a crawler to populate the Amazon Glue Data Catalog with tables. This is the primary method used by most Amazon Glue users. A crawler can crawl multiple data stores in a single run. Upon completion, the crawler creates or updates one or more tables in your Data Catalog. Extract, transform, and load (ETL) jobs that you define in Amazon Glue use these Data Catalog tables as sources and targets. Each ETL job reads from and writes to the data stores that are specified in its source and target Data Catalog tables.
For more information about using the Amazon Glue console to add a crawler, see Working with crawlers on the Amazon Glue console.
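Crawlers can also be defined programmatically. The following is a minimal sketch, not a definitive implementation, of defining a crawler that covers two Amazon S3 data stores in a single run. It assumes the AWS SDK for Python (boto3) Glue client and its `create_crawler` and `start_crawler` operations. The helper functions, crawler name, role name, database name, and bucket paths are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List


def build_crawler_request(name: str, role: str, database: str,
                          s3_paths: List[str]) -> Dict[str, Any]:
    """Build the keyword arguments for a CreateCrawler call.

    Hypothetical helper: it only assembles a dictionary and makes no
    network calls.
    """
    return {
        "Name": name,
        # IAM role (name or ARN) that the crawler assumes to read the data.
        "Role": role,
        # Data Catalog database where the crawler creates or updates tables.
        "DatabaseName": database,
        # A single crawler can crawl multiple data stores in one run.
        "Targets": {"S3Targets": [{"Path": path} for path in s3_paths]},
    }


def create_and_start_crawler(glue_client: Any, request: Dict[str, Any]) -> None:
    """Define the crawler, then start a run.

    `glue_client` is assumed to be a boto3 Glue client, for example the
    object returned by boto3.client("glue").
    """
    glue_client.create_crawler(**request)
    glue_client.start_crawler(Name=request["Name"])


# Example usage (requires boto3, credentials, and an existing IAM role;
# all names below are placeholders):
#
#   import boto3
#   glue = boto3.client("glue")
#   request = build_crawler_request(
#       name="sales-crawler",
#       role="MyGlueCrawlerRole",
#       database="sales_db",
#       s3_paths=["s3://example-bucket/orders/", "s3://example-bucket/returns/"],
#   )
#   create_and_start_crawler(glue, request)
```

Building the request separately from sending it keeps the crawler definition easy to inspect before any API call is made. After the run completes, the tables that the crawler created or updated appear in the specified Data Catalog database.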
Topics
- Which data stores can I crawl?
- How crawlers work
- Crawler prerequisites
- Crawler properties
- Setting crawler configuration options
- Scheduling an Amazon Glue crawler
- Working with crawlers on the Amazon Glue console
- Accelerating crawls using Amazon S3 event notifications
- Using encryption with the Amazon S3 event crawler
- Parameters set on Data Catalog tables by crawler