Machine learning measurements - Amazon Glue

Machine learning measurements

To understand the measurements that are used to tune your machine learning transform, you should be familiar with the following terminology:

True positive (TP)

A match in the data that the transform correctly found, sometimes called a hit.

True negative (TN)

A nonmatch in the data that the transform correctly rejected.

False positive (FP)

A nonmatch in the data that the transform incorrectly classified as a match, sometimes called a false alarm.

False negative (FN)

A match in the data that the transform didn't find, sometimes called a miss.

For more information about the terminology that is used in machine learning, see Confusion matrix in Wikipedia.
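
The four terms above can be illustrated with a short sketch that tallies them from paired labels. This is not part of the Amazon Glue API; the function name `confusion_counts` and the sample labels are hypothetical, and `True` is assumed to mean "these records match".

```python
def confusion_counts(actual, predicted):
    """Tally TP, TN, FP, and FN for paired boolean labels (True = match)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for is_match, said_match in zip(actual, predicted):
        if is_match and said_match:
            counts["TP"] += 1      # a match correctly found (hit)
        elif not is_match and not said_match:
            counts["TN"] += 1      # a nonmatch correctly rejected
        elif not is_match and said_match:
            counts["FP"] += 1      # a nonmatch flagged as a match (false alarm)
        else:
            counts["FN"] += 1      # a match that was not found (miss)
    return counts


# Hypothetical ground truth and transform output for six record pairs.
actual = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]
counts = confusion_counts(actual, predicted)
print(counts)
```

Every pair falls into exactly one of the four categories, so the four counts always sum to the number of pairs evaluated.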

To tune your machine learning transform, you can adjust the following measurements in the Advanced properties of the transform:

  • Precision measures how well the transform finds true positives among the total number of records that it identifies as positive (true positives and false positives). For more information, see Precision and recall in Wikipedia.

  • Recall measures how well the transform finds true positives among all the actual matches in the source data (true positives and false negatives). For more information, see Precision and recall in Wikipedia.

  • Accuracy measures how well the transform finds true positives and true negatives, that is, the share of all records that it classifies correctly. Increasing accuracy requires more machine resources and cost, but it also results in increased recall. For more information, see Accuracy and precision in Wikipedia.

  • Cost measures how many compute resources (and thus money) are consumed to run the transform.
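
As a rough illustration of the first three measurements, the sketch below computes them from the standard textbook formulas. The function names and the counts are hypothetical, and returning 0.0 for an empty denominator is a convention chosen here, not documented Amazon Glue behavior. Cost is not derived from these counts, so it is not shown.

```python
def precision(tp, fp):
    """Share of records flagged as matches that really are matches."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0  # assumption: 0.0 when nothing was flagged


def recall(tp, fn):
    """Share of actual matches that the transform found."""
    actual_matches = tp + fn
    return tp / actual_matches if actual_matches else 0.0  # assumption: 0.0 when no matches exist


def accuracy(tp, tn, fp, fn):
    """Share of all records that were classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0  # assumption: 0.0 for an empty data set


# Hypothetical counts for 100 evaluated record pairs.
tp, tn, fp, fn = 8, 86, 2, 4

p = precision(tp, fp)
r = recall(tp, fn)
a = accuracy(tp, tn, fp, fn)
print(f"precision={p:.2f} recall={r:.2f} accuracy={a:.2f}")
```

In this made-up example the transform is more precise than it is complete: most of the pairs it flags are real matches, but it misses a third of the actual matches. The high accuracy figure mostly reflects the large number of true negatives.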