Parameters set on Data Catalog tables by crawler - Amazon Glue

Parameters set on Data Catalog tables by crawler

These table properties are set by Amazon Glue crawlers. We expect users to consume the classification and compressionType properties. Other properties, including table size estimates, are used for internal calculations, and we do not guarantee their accuracy or applicability to customer use cases. Changing these parameters may alter the behavior of the crawler; we do not support this workflow.

Property key Property value
UPDATED_BY_CRAWLER

Name of the crawler that performed the update.

connectionName

The name of the connection in the Data Catalog that the crawler used to connect to the data store.

recordCount

Estimated count of records in the table, based on file sizes and headers.

skip.header.line.count

Number of rows skipped to skip the header. Set on tables classified as CSV.

CrawlerSchemaSerializerVersion

For internal use.

classification

Format of the data, inferred by the crawler. For more information about data formats supported by Amazon Glue crawlers, see Built-in classifiers in Amazon Glue.

CrawlerSchemaDeserializerVersion

For internal use.

sizeKey

Combined size of the files in the crawled table.

averageRecordSize

Average size of a row in the table, in bytes.

compressionType

Type of compression used on data in the table. For more information about compression types supported by Amazon Glue crawlers, see Built-in classifiers in Amazon Glue.

typeOfData

One of file, table, or view.

objectCount

Number of objects under the Amazon S3 path for the table.
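
As an illustration of reading the properties listed above, the following is a minimal sketch, not an official client. It assumes the table definition is a dictionary shaped like the Table object returned by the Glue GetTable API, with the crawler-set properties stored as string values under Parameters. The function name summarize_crawler_parameters, the output key names, and the sample table are hypothetical. Only classification and compressionType are treated as values to rely on; the numeric properties are surfaced as estimates.

```python
from typing import Any, Dict, Optional


def summarize_crawler_parameters(table: Dict[str, Any]) -> Dict[str, Optional[Any]]:
    """Pull crawler-set properties out of a Data Catalog table definition.

    Assumption: `table` is shaped like the `Table` object returned by the
    Glue GetTable API, where `Parameters` maps string keys to string values.
    """
    params = table.get("Parameters") or {}

    def as_int(key: str) -> Optional[int]:
        # Values are stored as strings; missing or non-numeric becomes None.
        try:
            return int(params[key])
        except (KeyError, TypeError, ValueError):
            return None

    return {
        # Properties intended for consumption.
        "classification": params.get("classification"),
        "compressionType": params.get("compressionType"),
        # Internal values: informational only, accuracy is not guaranteed.
        "recordCountEstimate": as_int("recordCount"),
        "headerRowsSkipped": as_int("skip.header.line.count"),
        "updatedByCrawler": params.get("UPDATED_BY_CRAWLER"),
    }


# Hypothetical table definition. In practice it might come from
# boto3.client("glue").get_table(DatabaseName=..., Name=...)["Table"].
example_table = {
    "Name": "sales_csv",
    "Parameters": {
        "UPDATED_BY_CRAWLER": "sales-crawler",
        "classification": "csv",
        "compressionType": "none",
        "recordCount": "1200",
        "skip.header.line.count": "1",
    },
}

summary = summarize_crawler_parameters(example_table)
print(summary["classification"], summary["compressionType"])
```

The sketch only reads the parameters. It does not write them back, since changing crawler-set parameters is not a supported workflow.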

These additional table properties are set by Amazon Glue crawlers for Snowflake data stores.

Property key Property value
aws:RawTableLastAltered

Records the last altered timestamp of the Snowflake table.

ViewOriginalText

View SQL statement.

ViewExpandedText

View SQL statement encoded in Base64 format.

ExternalTable:S3Location

Amazon S3 location of the Snowflake external table.

ExternalTable:FileFormat

Amazon S3 file format of the Snowflake external table.
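
Because ViewExpandedText holds the view SQL in Base64 format, a consumer has to decode it before use. The following is a minimal sketch under stated assumptions: the property is read from the table's Parameters map, and the decoded bytes are UTF-8 text (the encoding is not stated in this page). The function name decode_view_expanded_text and the sample SQL are hypothetical.

```python
import base64
from typing import Dict, Optional


def decode_view_expanded_text(parameters: Dict[str, str]) -> Optional[str]:
    """Return the decoded view SQL, or None if the property is absent."""
    encoded = parameters.get("ViewExpandedText")
    if encoded is None:
        return None
    # Assumption: the decoded bytes are UTF-8 text.
    return base64.b64decode(encoded).decode("utf-8")


# Hypothetical parameters, built locally so the example is self-contained.
sample_sql = "SELECT id, name FROM customers"
sample_parameters = {
    "ViewOriginalText": sample_sql,
    "ViewExpandedText": base64.b64encode(sample_sql.encode("utf-8")).decode("ascii"),
}

decoded_sql = decode_view_expanded_text(sample_parameters)
print(decoded_sql)
```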

These additional table properties are set by Amazon Glue crawlers for JDBC-type data stores such as Amazon Redshift, Microsoft SQL Server, MySQL, PostgreSQL, and Oracle.

Property key Property value
aws:RawType

When a crawler stores the data in the Data Catalog, it translates the data types to Hive-compatible types, which often causes information about the native data type to be lost. The crawler outputs the aws:RawType parameter to provide the native-level data type.

aws:RawColumnComment

If a comment is associated with a column in the database, the crawler outputs the corresponding comment in the catalog table. The comment string is truncated to 255 bytes.

Comments are not supported for Microsoft SQL Server.

aws:RawTableComment

If a comment is associated with a table in the database, the crawler outputs the corresponding comment in the catalog table. The comment string is truncated to 255 bytes.

Comments are not supported for Microsoft SQL Server.
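
To show how the native type could be recovered alongside the Hive-compatible type, the following is a minimal sketch. It assumes that aws:RawType and aws:RawColumnComment appear in each column's Parameters map under StorageDescriptor.Columns; this page does not state where the parameters are stored, so verify that against your own catalog. The function name native_column_types, the output key names, and the sample table are hypothetical.

```python
from typing import Any, Dict, Optional


def native_column_types(table: Dict[str, Any]) -> Dict[str, Dict[str, Optional[str]]]:
    """Map each column name to its catalog type and native (raw) type.

    Assumption: aws:RawType is stored in the column-level `Parameters` map.
    """
    columns = (table.get("StorageDescriptor") or {}).get("Columns") or []
    result: Dict[str, Dict[str, Optional[str]]] = {}
    for column in columns:
        column_params = column.get("Parameters") or {}
        result[column["Name"]] = {
            "catalog_type": column.get("Type"),
            # None when the crawler did not record a native type.
            "raw_type": column_params.get("aws:RawType"),
            "raw_comment": column_params.get("aws:RawColumnComment"),
        }
    return result


# Hypothetical table definition for a JDBC-type data store.
example_jdbc_table = {
    "Name": "orders",
    "StorageDescriptor": {
        "Columns": [
            {
                "Name": "order_id",
                "Type": "int",
                "Parameters": {"aws:RawType": "int4"},
            },
            {"Name": "note", "Type": "string"},
        ]
    },
}

types = native_column_types(example_jdbc_table)
print(types["order_id"]["catalog_type"], types["order_id"]["raw_type"])
```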