FindIncrementalMatches class - Amazon Glue

FindIncrementalMatches class

Package: com.amazonaws.services.glue.ml

object FindIncrementalMatches

Def apply

apply(existingFrame: DynamicFrame, incrementalFrame: DynamicFrame, transformId: String, transformationContext: String = "", callSite: CallSite = CallSite("Not provided", ""), stageThreshold: Long = 0, totalThreshold: Long = 0, enforcedMatches: DynamicFrame = null, computeMatchConfidenceScores: Boolean = false): DynamicFrame

Find matches across the existing and incremental frames and return a new frame with a column containing a unique ID per match group.

  • existingFrame — An existing frame in which each group of matching records has already been assigned a match ID. Required.

  • incrementalFrame — An incremental frame used to find matches against the existing frame. Required.

  • transformId — A unique ID associated with the FindIncrementalMatches transform to apply on the input frames. Required.

  • transformationContext — Identifier for this DynamicFrame. The transformationContext is used as a key for the job bookmark state that is persisted across runs. Optional.

  • callSite — Used to provide context information for error reporting. These values are automatically set when calling from Python. Optional.

  • stageThreshold — The maximum number of error records allowed from the computation of this DynamicFrame before throwing an exception, excluding records present in the previous DynamicFrame. Optional. The default is zero.

  • totalThreshold — The maximum number of total error records allowed before an exception is thrown, including those from previous frames. Optional. The default is zero.

  • enforcedMatches — The frame for enforced matches. Optional. The default is null.

  • computeMatchConfidenceScores — A Boolean value indicating whether to compute a confidence score for each group of matching records. Optional. The default is false.

Returns a new dynamic frame with a unique identifier assigned to each group of matching records.
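
The call described above can be sketched as follows. This is a minimal illustration rather than a complete job: it assumes the script runs inside a Glue Scala job where the Glue libraries are on the classpath, and the database name, table names, and transform ID are hypothetical placeholders, not values from this page. Only the three required arguments and transformationContext are passed, so the remaining parameters keep the defaults listed above.

```scala
import com.amazonaws.services.glue.GlueContext
import com.amazonaws.services.glue.ml.FindIncrementalMatches
import org.apache.spark.SparkContext

object IncrementalMatchingSketch {
  def main(sysArgs: Array[String]): Unit = {
    val glueContext = new GlueContext(new SparkContext())

    // Hypothetical Data Catalog names; replace with your own.
    // The existing frame is expected to already carry match IDs.
    val existingFrame = glueContext
      .getCatalogSource(database = "example_db", tableName = "matched_records")
      .getDynamicFrame()

    // New records to be matched against the existing frame.
    val incrementalFrame = glueContext
      .getCatalogSource(database = "example_db", tableName = "new_records")
      .getDynamicFrame()

    // "tfm-example-id" is a placeholder for the ID of your own transform.
    val matched = FindIncrementalMatches.apply(
      existingFrame = existingFrame,
      incrementalFrame = incrementalFrame,
      transformId = "tfm-example-id",
      transformationContext = "findIncrementalMatches"
    )

    // Inspect the result; the returned frame includes the match ID column.
    matched.printSchema
  }
}
```

Named arguments are used so that the optional parameters (stageThreshold, totalThreshold, enforcedMatches, computeMatchConfidenceScores) can be added individually without depending on their position in the signature.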