FillWithMode class
The FillWithMode transform fills missing values in a column with the mode of that column, that is, its most frequently occurring value.
You can also specify tie-breaker logic for cases where several values occur equally often. For example, consider the following values:
1 2 2 3 3 4
A mode_type of MINIMUM causes FillWithMode to return 2 as the mode value. If mode_type is MAXIMUM, the mode is 3. For AVERAGE, the mode is 2.5.
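The tie-breaking rules above can be sketched in plain Python. This is a minimal illustration, not the AWS Glue implementation: the helper name `resolve_mode` is hypothetical, and the NONE option is left out because its behavior is not described here.

```python
from collections import Counter


def resolve_mode(values, mode_type):
    """Return the mode of `values`, breaking ties as `mode_type` dictates.

    Hypothetical helper for illustration only; not part of the AWS Glue API.
    """
    counts = Counter(values)
    top = max(counts.values())
    # Every value that reaches the highest frequency is a candidate mode.
    candidates = [v for v, n in counts.items() if n == top]
    if mode_type == "MINIMUM":
        return min(candidates)
    if mode_type == "MAXIMUM":
        return max(candidates)
    if mode_type == "AVERAGE":
        return sum(candidates) / len(candidates)
    raise ValueError(f"unsupported mode_type: {mode_type}")


values = [1, 2, 2, 3, 3, 4]
for mode_type in ("MINIMUM", "MAXIMUM", "AVERAGE"):
    print(mode_type, resolve_mode(values, mode_type))
# MINIMUM 2
# MAXIMUM 3
# AVERAGE 2.5
```

Here 2 and 3 both occur twice, so they are the tied candidates, and the chosen option decides which single value is reported.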
Example
```python
from awsglue.context import *
from pyspark.context import SparkContext
from pyspark.sql import SparkSession
from awsgluedi.transforms import *

sc = SparkContext()
spark = SparkSession(sc)

# Two of the five rows have a null in source_column_1.
input_df = spark.createDataFrame(
    [
        (105.111, 13.12),
        (1055.123, 13.12),
        (None, 13.12),
        (13.12, 13.12),
        (None, 13.12),
    ],
    ["source_column_1", "source_column_2"],
)

try:
    # Fill the nulls in source_column_1 with the column's mode,
    # taking the largest value when several values tie.
    df_output = data_quality.FillWithMode.apply(
        data_frame=input_df,
        spark_context=sc,
        source_column="source_column_1",
        mode_type="MAXIMUM",
    )
    df_output.show()
except:
    print("Unexpected Error happened ")
    raise
```
Output
The output of the given code will be:
```
+---------------+---------------+
|source_column_1|source_column_2|
+---------------+---------------+
|        105.111|          13.12|
|       1055.123|          13.12|
|       1055.123|          13.12|
|          13.12|          13.12|
|       1055.123|          13.12|
+---------------+---------------+
```
The FillWithMode transform, accessed here as `data_quality.FillWithMode` after importing `awsgluedi.transforms`, is applied to the `input_df` DataFrame. It replaces the `null` values in the `source_column_1` column with the mode of the non-null values in that column. Each non-null value (`105.111`, `1055.123`, and `13.12`) occurs exactly once, so all three tie as the mode, and `mode_type="MAXIMUM"` resolves the tie by choosing the largest, `1055.123`. Therefore, the `null` values in `source_column_1` are replaced by `1055.123` in the output DataFrame `df_output`.
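The same result can be reproduced without Spark to check the reasoning. The sketch below is an assumption-laden stand-in: `fill_with_mode_max` is a hypothetical function operating on a plain Python list, with `None` playing the role of `null`, and it mirrors only the MAXIMUM tie-breaker used in the example.

```python
from collections import Counter


def fill_with_mode_max(column):
    """Replace None entries with the mode, taking the largest value on a tie.

    Hypothetical stand-in for mode_type="MAXIMUM"; not the Glue implementation.
    """
    non_null = [v for v in column if v is not None]
    if not non_null:
        # Nothing to compute a mode from, so return the column unchanged.
        return list(column)
    counts = Counter(non_null)
    top = max(counts.values())
    # Among the most frequent values, pick the largest one.
    fill = max(v for v, n in counts.items() if n == top)
    return [fill if v is None else v for v in column]


source_column_1 = [105.111, 1055.123, None, 13.12, None]
print(fill_with_mode_max(source_column_1))
# [105.111, 1055.123, 1055.123, 13.12, 1055.123]
```

Because every non-null value appears once, the fill value equals the column maximum here; with repeated values, only the most frequent ones would compete.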
Methods
__call__(spark_context, data_frame, source_column, mode_type)
The FillWithMode transform fills missing values in a column with the mode of that column.
- source_column – The name of an existing column.
- mode_type – How to resolve tie values in the data. This value must be one of MINIMUM, NONE, AVERAGE, or MAXIMUM.
apply(cls, *args, **kwargs)
Inherited from GlueTransform apply.
name(cls)
Inherited from GlueTransform name.
describeArgs(cls)
Inherited from GlueTransform describeArgs.
describeReturn(cls)
Inherited from GlueTransform describeReturn.
describeTransform(cls)
Inherited from GlueTransform describeTransform.
describeErrors(cls)
Inherited from GlueTransform describeErrors.
describe(cls)
Inherited from GlueTransform describe.