FLAG_DUPLICATE_ROWS - Amazon Glue DataBrew
Services or capabilities described in Amazon Web Services documentation might vary by Region. To see the differences applicable to the China Regions, see Getting Started with Amazon Web Services in China (PDF).

FLAG_DUPLICATE_ROWS

Returns a new column with a specified value in each row that indicates whether that row is an exact match of an earlier row in the dataset. When matches are found, they are flagged as duplicates. The initial occurrence is not flagged, because it doesn't match an earlier row.

Parameters
  • trueString – Value to be inserted if the row matches an earlier row.

  • falseString – Value to be inserted if the row is unique.

  • targetColumn – Name of the new column that is inserted in the dataset.

Example

{ "RecipeAction": { "Operation": "FLAG_DUPLICATE_ROWS", "Parameters": { "trueString": "TRUE", "falseString": "FALSE", "targetColumn": "Flag" } } }