Generating column statistics
Follow these steps to manage statistics generation in the Data Catalog using the Amazon Glue console or the Amazon CLI.
- Console
  To generate column statistics using the console
  - Sign in to the Amazon Glue console at https://console.amazonaws.cn/glue/.
  - Choose Data Catalog tables.
  - Choose a table from the list.
  - Choose Generate statistics under the Actions menu. You can also choose the Generate statistics button under the Column statistics tab in the lower section of the Tables page.
  - On the Generate statistics page, specify the following options:
    - Table (all columns) – Choose this option to generate statistics for all columns in the table.
    - Selected columns – Choose this option to generate statistics for specific columns. You can select the columns from the drop-down list.
    - All rows – Choose this option to use every row in the table, which produces accurate statistics.
    - Sample rows – Choose this option to use only a specific percentage of rows from the table. The default is all rows. Use the up and down arrows to increase or decrease the percentage.
    Note
    We recommend including all rows in the table to compute accurate statistics. Use sample rows to generate column statistics only when approximate values are acceptable.
  - (Optional) Choose a security configuration to enable at-rest encryption for logs.
  - Choose Generate statistics to run the process.
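The choices on the Generate statistics page appear to correspond to fields of the request that the CLI and the StartColumnStatisticsTaskRun operation accept. The following is a minimal Python sketch of that mapping, not an official helper. It assumes that leaving out ColumnNameList selects all columns and that leaving out SampleSize uses all rows. The function names `build_task_run_request` and `to_input_json`, and the range check on the sample percentage, are hypothetical and introduced only for illustration.

```python
import json


def build_task_run_request(database, table, role_arn, columns=None, sample_percent=None):
    """Translate console-style choices into a request dictionary.

    columns=None        -> "Table (all columns)"  (ColumnNameList omitted; assumption)
    columns=[...]       -> "Selected columns"
    sample_percent=None -> "All rows"             (SampleSize omitted; assumption)
    sample_percent=x    -> "Sample rows" with x percent of the rows
    """
    request = {"DatabaseName": database, "TableName": table, "Role": role_arn}
    if columns:
        request["ColumnNameList"] = list(columns)
    if sample_percent is not None:
        # Assumed valid range for a percentage; adjust if the service differs.
        if not 0 < sample_percent <= 100:
            raise ValueError("sample_percent must be greater than 0 and at most 100")
        request["SampleSize"] = float(sample_percent)
    return request


def to_input_json(request):
    """Serialize the request in the shape expected for a CLI input file."""
    return json.dumps(request, indent=4)


# Example with placeholder values: two selected columns, 10 percent sample.
example = build_task_run_request(
    "<test-db>",
    "<test-table>",
    "arn:aws:iam::<123456789012>:role/<Stats-Role>",
    columns=["<column1>", "<column2>"],
    sample_percent=10,
)
print(to_input_json(example))
```

Saving the printed text as input.json would give a file you can pass to the CLI command. Calling the helper without `columns` and `sample_percent` yields the all-columns, all-rows request that the note above recommends for accurate statistics.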
- Amazon CLI
  In the following example, replace the values for DatabaseName, TableName, and ColumnNameList with actual database, table, and column names. Replace the account ID with a valid Amazon Web Services account ID, and replace the role name with the name of the IAM role that you're using to generate statistics.

      aws glue start-column-statistics-task-run --cli-input-json file://input.json

  The input.json file contains the request parameters:

      {
          "DatabaseName": "<test-db>",
          "TableName": "<test-table>",
          "ColumnNameList": [
              "<column1>",
              "<column2>"
          ],
          "Role": "arn:aws:iam::<123456789012>:role/<Stats-Role>",
          "SampleSize": 10.0
      }

  You can also generate column statistics by calling the StartColumnStatisticsTaskRun operation.
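As a rough illustration of calling the StartColumnStatisticsTaskRun operation from code, the sketch below uses the boto3 Glue client method `start_column_statistics_task_run`. It assumes that boto3 is installed, that credentials and a Region are already configured, and that the response carries the run ID under `ColumnStatisticsTaskRunId`. The function names `start_statistics_run` and `run_example` are hypothetical, and the request values are the same placeholders used in the CLI example.

```python
def start_statistics_run(glue_client, request):
    """Start a column statistics task run and return its run ID.

    glue_client is expected to behave like boto3.client("glue");
    request holds the same fields as the CLI input file.
    """
    response = glue_client.start_column_statistics_task_run(**request)
    return response["ColumnStatisticsTaskRunId"]


def run_example():
    """Call the service with placeholder values (replace them before running)."""
    import boto3  # third-party SDK, imported here so the helper above stays dependency-free

    glue = boto3.client("glue")
    request = {
        "DatabaseName": "<test-db>",
        "TableName": "<test-table>",
        "ColumnNameList": ["<column1>", "<column2>"],
        "Role": "arn:aws:iam::<123456789012>:role/<Stats-Role>",
        "SampleSize": 10.0,
    }
    run_id = start_statistics_run(glue, request)
    print(f"Started column statistics task run: {run_id}")
```

After replacing the placeholders, call `run_example()` to start a run. Passing the client into `start_statistics_run` as an argument is a design choice that lets you exercise the helper with a stub object before pointing it at a real account.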