Schedule data quality monitoring jobs
After you create your baseline, you can call the
create_monitoring_schedule()
method of your
DefaultModelMonitor
class instance to schedule an hourly data
quality monitor. The following sections show you how to create a data quality
monitor for a model deployed to a real-time endpoint as well as for a batch
transform job.
Important
You can specify either a batch transform input or an endpoint input, but not both, when you create your monitoring schedule.
Data quality monitoring for models deployed to real-time endpoints
To schedule a data quality monitor for a real-time endpoint, pass your
EndpointInput
instance to the endpoint_input
argument of your DefaultModelMonitor
instance, as shown in the
following code sample:
from sagemaker.model_monitor import CronExpressionGenerator data_quality_model_monitor = DefaultModelMonitor( role=sagemaker.get_execution_role(), ... ) schedule = data_quality_model_monitor.create_monitoring_schedule( monitor_schedule_name=schedule_name, post_analytics_processor_script=s3_code_postprocessor_uri, output_s3_uri=s3_report_path, schedule_cron_expression=CronExpressionGenerator.hourly(), statistics=data_quality_model_monitor.baseline_statistics(), constraints=data_quality_model_monitor.suggested_constraints(), schedule_cron_expression=CronExpressionGenerator.hourly(), enable_cloudwatch_metrics=True, endpoint_input=EndpointInput( endpoint_name=endpoint_name, destination="/opt/ml/processing/input/endpoint", ) )
Data quality monitoring for batch transform jobs
To schedule a data quality monitor for a batch transform job, pass your
BatchTransformInput
instance to the
batch_transform_input
argument of your
DefaultModelMonitor
instance, as shown in the following code
sample:
from sagemaker.model_monitor import CronExpressionGenerator data_quality_model_monitor = DefaultModelMonitor( role=sagemaker.get_execution_role(), ... ) schedule = data_quality_model_monitor.create_monitoring_schedule( monitor_schedule_name=mon_schedule_name, batch_transform_input=BatchTransformInput( data_captured_destination_s3_uri=s3_capture_upload_path, destination="/opt/ml/processing/input", dataset_format=MonitoringDatasetFormat.csv(header=False), ), output_s3_uri=s3_report_path, statistics= statistics_path, constraints = constraints_path, schedule_cron_expression=CronExpressionGenerator.hourly(), enable_cloudwatch_metrics=True, )