REPLACE_OUTLIERS
Updates the data point values that are classified as outliers, based on the settings in the parameters.
Parameters

sourceColumn
– Specifies the name of an existing numeric column that might contain outliers. 
outlierStrategy
– Specifies the approach to use in detecting outliers. Valid values include the following:
Z_SCORE
– Identifies a value as an outlier when it deviates from the mean by more than the standard deviation threshold. 
MODIFIED_Z_SCORE
– Identifies a value as an outlier when it deviates from the median by more than the median absolute deviation threshold. 
IQR
– Identifies a value as an outlier when it falls beyond the first and last quartile of column data. The interquartile range (IQR) measures where the middle 50% of the data points are.


threshold
– Specifies the threshold value to use when detecting outliers. The sourceColumn value is identified as an outlier if the score that's calculated with the outlierStrategy exceeds this number. The default is 3.
replaceType
– Specifies the method to use when replacing outliers. Valid values include the following:
WINSORIZE_VALUES
– Specifies using the minimum and maximum percentile to cap the values. 
REPLACE_WITH_CUSTOM

REPLACE_WITH_EMPTY

REPLACE_WITH_NULL

REPLACE_WITH_MODE

REPLACE_WITH_AVERAGE

REPLACE_WITH_MEDIAN

REPLACE_WITH_SUM

REPLACE_WITH_MAX


modeType
– Indicates the type of modal function to use when replaceType is REPLACE_WITH_MODE. Valid values include the following: MIN, MAX, and AVERAGE.
minValue
– Indicates the minimum percentile value for the outlier range that is to be applied when trimValue is used. Valid range is 0–100.
maxValue
– Indicates the maximum percentile value for the outlier range that is to be applied when trimValue is used. Valid range is 0–100.
value
– Specifies the value to insert when using REPLACE_WITH_CUSTOM.
trimValue
– Specifies whether to remove all or some of the outliers. This Boolean value is set to TRUE when replaceType is REPLACE_WITH_NULL, REPLACE_WITH_MODE, or WINSORIZE_VALUES. It defaults to FALSE for all others.
FALSE
– Removes all outliers.
TRUE
– Removes outliers that rank outside of the percentile cap threshold specified in minValue and maxValue.
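
To make the detection strategies and the percentile cap more concrete, the following Python sketch shows one common way to compute each score. It is an illustration only, not the service's implementation. The function names are hypothetical, and several details are assumptions: the Z-score uses the population standard deviation, the modified Z-score uses the conventional 0.6745 scaling constant, the IQR check treats the threshold as a multiplier of the interquartile range, and percentiles use linear interpolation.

```python
import statistics


def percentile(values, pct):
    """Return the pct-th percentile (0-100) using linear interpolation (assumed method)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def z_score_flags(values, threshold=3.0):
    """Flag values that lie more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)  # population standard deviation (assumption)
    if stdev == 0:
        return [False] * len(values)
    return [abs(v - mean) / stdev > threshold for v in values]


def modified_z_score_flags(values, threshold=3.0):
    """Flag values whose median-based score exceeds `threshold`."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])  # median absolute deviation
    if mad == 0:
        return [False] * len(values)
    # 0.6745 is the conventional scaling constant (assumption, not from the source).
    return [0.6745 * abs(v - median) / mad > threshold for v in values]


def iqr_flags(values, threshold=1.5):
    """Flag values outside Q1 - threshold * IQR and Q3 + threshold * IQR (assumed rule)."""
    q1, q3 = percentile(values, 25), percentile(values, 75)
    iqr = q3 - q1
    low, high = q1 - threshold * iqr, q3 + threshold * iqr
    return [v < low or v > high for v in values]


def winsorize(values, min_pct, max_pct):
    """Cap every value at the min_pct and max_pct percentiles."""
    low, high = percentile(values, min_pct), percentile(values, max_pct)
    return [min(max(v, low), high) for v in values]


def replace_flagged(values, flags, replacement):
    """Replace each flagged value with `replacement`, leaving the others unchanged."""
    return [replacement if flagged else v for v, flagged in zip(values, flags)]


if __name__ == "__main__":
    data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 100]
    flags = modified_z_score_flags(data)
    print(replace_flagged(data, flags, statistics.median(data)))
    print(winsorize(data, 5, 95))
```

In this sketch, `replace_flagged` with the column median loosely corresponds to REPLACE_WITH_MEDIAN, and `winsorize` loosely corresponds to WINSORIZE_VALUES with minValue and maxValue as the percentile caps.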

The following examples display syntax for a single RecipeAction operation. A recipe contains at least one RecipeStep operation, and a recipe step contains at least one recipe action. A recipe action runs the data transform that you specify. A group of recipe actions runs in sequential order to create the final dataset.