


# How to use SageMaker AI CatBoost

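You can use CatBoost as an Amazon SageMaker AI built-in algorithm. The following section describes how to use CatBoost with the SageMaker Python SDK. For information on how to use CatBoost from the Amazon SageMaker Studio Classic UI, see [SageMaker JumpStart pretrained models](studio-jumpstart.md).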
+ **Use CatBoost as a built-in algorithm**

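  Use the CatBoost built-in algorithm to build a CatBoost training container, as shown in the following code example. You can look up the CatBoost built-in algorithm image URI automatically with the SageMaker AI `image_uris.retrieve` API (or the `get_image_uri` API if using [Amazon SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable) version 2).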

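  After specifying the CatBoost image URI, you can use the CatBoost container to construct an estimator with the SageMaker AI Estimator API and start a training job. The CatBoost built-in algorithm runs in script mode, but the training script is provided for you and there is no need to replace it. If you have extensive experience using script mode to create SageMaker training jobs, you can incorporate your own CatBoost training scripts.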

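  ```python
  import sagemaker
  from sagemaker import image_uris, model_uris, script_uris

  # Session, Region, and IAM role used by the rest of this example.
  # get_execution_role() resolves the role when run inside SageMaker;
  # elsewhere, set aws_role to an IAM role ARN directly.
  sess = sagemaker.Session()
  aws_region = sess.boto_region_name
  aws_role = sagemaker.get_execution_role()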
  
  train_model_id, train_model_version, train_scope = "catboost-classification-model", "*", "training"
  training_instance_type = "ml.m5.xlarge"
  
  # Retrieve the docker image
  train_image_uri = image_uris.retrieve(
      region=None,
      framework=None,
      model_id=train_model_id,
      model_version=train_model_version,
      image_scope=train_scope,
      instance_type=training_instance_type
  )
  
  # Retrieve the training script
  train_source_uri = script_uris.retrieve(
      model_id=train_model_id, model_version=train_model_version, script_scope=train_scope
  )
  
  train_model_uri = model_uris.retrieve(
      model_id=train_model_id, model_version=train_model_version, model_scope=train_scope
  )
  
  # Sample training data is available in this bucket
  training_data_bucket = f"jumpstart-cache-prod-{aws_region}"
  training_data_prefix = "training-datasets/tabular_multiclass/"
  
  training_dataset_s3_path = f"s3://{training_data_bucket}/{training_data_prefix}/train"
  validation_dataset_s3_path = f"s3://{training_data_bucket}/{training_data_prefix}/validation"
  
  output_bucket = sess.default_bucket()
  output_prefix = "jumpstart-example-tabular-training"
  
  s3_output_location = f"s3://{output_bucket}/{output_prefix}/output"
  
  # Import the module under an alias so the dictionary below does not shadow it
  from sagemaker import hyperparameters as jumpstart_hyperparameters
  
  # Retrieve the default hyperparameters for training the model
  hyperparameters = jumpstart_hyperparameters.retrieve_default(
      model_id=train_model_id, model_version=train_model_version
  )
  
  # [Optional] Override default hyperparameters with custom values
  hyperparameters["iterations"] = "500"
  print(hyperparameters)
  
  from sagemaker.estimator import Estimator
  from sagemaker.utils import name_from_base
  
  training_job_name = name_from_base(f"built-in-algo-{train_model_id}-training")
  
  # Create SageMaker Estimator instance
  tabular_estimator = Estimator(
      role=aws_role,
      image_uri=train_image_uri,
      source_dir=train_source_uri,
      model_uri=train_model_uri,
      entry_point="transfer_learning.py",
      instance_count=1,
      instance_type=training_instance_type,
      max_run=360000,
      hyperparameters=hyperparameters,
      output_path=s3_output_location
  )
  
  # Launch a SageMaker Training job by passing the S3 path of the training data
  tabular_estimator.fit(
      {
          "training": training_dataset_s3_path,
          "validation": validation_dataset_s3_path,
      }, logs=True, job_name=training_job_name
  )
  ```
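
The optional override step in the example sets one hyperparameter at a time. When several values change, it can help to merge them in one place. The following is a minimal sketch of that idea: `apply_overrides` is a hypothetical helper (not part of the SageMaker Python SDK), and the default values shown are illustrative stand-ins for whatever `retrieve_default` returns. Values are converted to strings to match how the example stores its override (`"500"`).

```python
def apply_overrides(defaults, overrides):
    """Return a copy of `defaults` updated with `overrides`, all values as strings.

    Hypothetical helper for illustration; not a SageMaker SDK function.
    Keys missing from `defaults` raise a KeyError so typos surface
    before a training job is launched.
    """
    unknown = sorted(set(overrides) - set(defaults))
    if unknown:
        raise KeyError(f"Unknown hyperparameters: {unknown}")
    merged = dict(defaults)
    merged.update({name: str(value) for name, value in overrides.items()})
    return merged


# Illustrative defaults; in practice, pass the dict returned by retrieve_default
example_defaults = {"iterations": "500", "learning_rate": "0.009", "depth": "6"}
tuned = apply_overrides(example_defaults, {"iterations": 300, "depth": 8})
print(tuned)
```

Rejecting unknown keys is a design choice in this sketch. If you need to pass a hyperparameter that is absent from the retrieved defaults, relax that check.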

  For more information about how to set up CatBoost as a built-in algorithm, see the following notebook examples.
  + [Tabular classification with Amazon SageMaker AI LightGBM and CatBoost algorithm](https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/lightgbm_catboost_tabular/Amazon_Tabular_Classification_LightGBM_CatBoost.ipynb)
  + [Tabular regression with Amazon SageMaker AI LightGBM and CatBoost algorithm](https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/lightgbm_catboost_tabular/Amazon_Tabular_Regression_LightGBM_CatBoost.ipynb)
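
For readers who do bring their own training script, the following is a rough sketch of what a script-mode entry point for CatBoost might look like. It is not the script that ships with the built-in algorithm. It assumes that hyperparameters arrive as command-line arguments, that SageMaker exposes the `training` and `validation` channels and the model directory through the `SM_CHANNEL_TRAINING`, `SM_CHANNEL_VALIDATION`, and `SM_MODEL_DIR` environment variables, and that each channel contains header-less CSV files with the target in the first column. The default values, the local fallback paths, and the `model.cbm` file name are illustrative choices.

```python
import argparse
import glob
import os


def parse_args(argv=None):
    """Parse hyperparameters and SageMaker-provided paths."""
    parser = argparse.ArgumentParser()
    # Hyperparameters are passed to the script as command-line arguments
    parser.add_argument("--iterations", type=int, default=500)
    parser.add_argument("--learning_rate", type=float, default=0.03)
    parser.add_argument("--depth", type=int, default=6)
    # Channel and model directories come from environment variables;
    # the fallbacks are hypothetical local paths for testing off SageMaker
    parser.add_argument("--train", default=os.environ.get("SM_CHANNEL_TRAINING", "data/train"))
    parser.add_argument("--validation", default=os.environ.get("SM_CHANNEL_VALIDATION"))
    parser.add_argument("--model_dir", default=os.environ.get("SM_MODEL_DIR", "model"))
    return parser.parse_args(argv)


def load_csv_dir(path):
    """Load every CSV under `path`; assumes no header and the target in column 0."""
    import pandas as pd

    files = sorted(glob.glob(os.path.join(path, "*.csv")))
    frame = pd.concat((pd.read_csv(f, header=None) for f in files), ignore_index=True)
    return frame.iloc[:, 1:], frame.iloc[:, 0]


def train(args):
    """Fit a CatBoostClassifier and save it to the model directory."""
    from catboost import CatBoostClassifier

    features, target = load_csv_dir(args.train)
    eval_set = load_csv_dir(args.validation) if args.validation else None
    model = CatBoostClassifier(
        iterations=args.iterations,
        learning_rate=args.learning_rate,
        depth=args.depth,
    )
    model.fit(features, target, eval_set=eval_set)
    os.makedirs(args.model_dir, exist_ok=True)
    model.save_model(os.path.join(args.model_dir, "model.cbm"))


# Parsing can be exercised locally without SageMaker or CatBoost installed.
# As an entry point, the script would instead end with: train(parse_args())
demo_args = parse_args(["--iterations", "200", "--depth", "4"])
print(demo_args.iterations, demo_args.depth, demo_args.learning_rate)
```

A script along these lines would be supplied through the `entry_point` and `source_dir` arguments of the estimator in place of the provided `transfer_learning.py`, provided the container image has the packages the script imports.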