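Configure managed scaling for Amazon EMR
The following sections explain how to launch an EMR cluster that uses managed scaling with the Amazon Web Services Management Console, the Amazon SDK for Java, or the Amazon Command Line Interface.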
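Use the Amazon Web Services Management Console to configure managed scaling
You can use the Amazon EMR console to configure managed scaling when you create a cluster, or to change the managed scaling policy of a running cluster.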
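Use the Amazon CLI to configure managed scaling
You can use Amazon CLI commands for Amazon EMR to configure managed scaling when you create a cluster. You can use shorthand syntax, which specifies the JSON configuration inline within the command, or you can reference a file that contains the configuration JSON. You can also apply a managed scaling policy to an existing cluster and remove a policy that was applied earlier. In addition, you can retrieve the details of a scaling policy configuration from a running cluster.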
Enable managed scaling during cluster launch
You can enable managed scaling during cluster launch, as the following example shows.
aws emr create-cluster \
 --service-role EMR_DefaultRole \
 --release-label emr-7.1.0 \
 --name EMR_Managed_Scaling_Enabled_Cluster \
 --applications Name=Spark Name=Hbase \
 --ec2-attributes KeyName=keyName,InstanceProfile=EMR_EC2_DefaultRole \
 --instance-groups \
   InstanceType=m4.xlarge,InstanceGroupType=MASTER,InstanceCount=1 \
   InstanceType=m4.xlarge,InstanceGroupType=CORE,InstanceCount=2 \
 --region us-east-1 \
 --managed-scaling-policy ComputeLimits='{MinimumCapacityUnits=2,MaximumCapacityUnits=4,UnitType=Instances}'
When you use create-cluster, you specify the managed scaling policy configuration with the --managed-scaling-policy option.
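Apply a managed scaling policy to an existing cluster
You can apply a managed scaling policy to an existing cluster, as the following example shows.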
aws emr put-managed-scaling-policy \
 --cluster-id j-123456 \
 --managed-scaling-policy ComputeLimits='{MinimumCapacityUnits=1,MaximumCapacityUnits=10,MaximumOnDemandCapacityUnits=10,UnitType=Instances}'
The aws emr put-managed-scaling-policy command can also read the policy from a file. The following example references a JSON file, managedscaleconfig.json, that specifies the managed scaling policy configuration.
aws emr put-managed-scaling-policy \
 --cluster-id j-123456 \
 --managed-scaling-policy file://./managedscaleconfig.json
The following example shows the contents of the managedscaleconfig.json file, which defines the managed scaling policy.
{
  "ComputeLimits": {
    "UnitType": "Instances",
    "MinimumCapacityUnits": 1,
    "MaximumCapacityUnits": 10,
    "MaximumOnDemandCapacityUnits": 10
  }
}
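The shorthand string and the JSON file shown above carry the same four values. As a minimal sketch (not part of the Amazon CLI or the SDK), the hypothetical ManagedScalingConfig class below renders both forms from one set of numbers, which can help keep the two in sync. The check that the minimum does not exceed the maximum is a local sanity check assumed here for illustration; the service applies its own validation.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; it is not part of any AWS library.
class ManagedScalingConfig {
    private final int min;
    private final int max;
    private final int maxOnDemand;
    private final String unitType;

    ManagedScalingConfig(int min, int max, int maxOnDemand, String unitType) {
        // Assumed local sanity check: a lower bound above the upper bound makes no sense.
        if (min > max) {
            throw new IllegalArgumentException(
                "MinimumCapacityUnits must not exceed MaximumCapacityUnits");
        }
        this.min = min;
        this.max = max;
        this.maxOnDemand = maxOnDemand;
        this.unitType = unitType;
    }

    // JSON in the shape of managedscaleconfig.json.
    String toJson() {
        return String.format(Locale.ROOT,
            "{ \"ComputeLimits\": { \"UnitType\": \"%s\", \"MinimumCapacityUnits\": %d, "
            + "\"MaximumCapacityUnits\": %d, \"MaximumOnDemandCapacityUnits\": %d } }",
            unitType, min, max, maxOnDemand);
    }

    // Shorthand value for --managed-scaling-policy, quoted as it would be typed in a shell.
    String toShorthand() {
        return String.format(Locale.ROOT,
            "ComputeLimits='{MinimumCapacityUnits=%d,MaximumCapacityUnits=%d,"
            + "MaximumOnDemandCapacityUnits=%d,UnitType=%s}'",
            min, max, maxOnDemand, unitType);
    }

    public static void main(String[] args) {
        ManagedScalingConfig config = new ManagedScalingConfig(1, 10, 10, "Instances");
        System.out.println(config.toJson());
        System.out.println(config.toShorthand());
    }
}
```

The JSON output can be saved as managedscaleconfig.json and passed with file://, while the shorthand output can be pasted after --managed-scaling-policy.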
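Retrieve a managed scaling policy configuration
The get-managed-scaling-policy command, which calls the GetManagedScalingPolicy API operation, retrieves the policy configuration. For example, the following command retrieves the configuration for the cluster with the cluster ID j-123456.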
aws emr get-managed-scaling-policy --cluster-id j-123456
The command produces the following example output.
{
  "ManagedScalingPolicy": {
    "ComputeLimits": {
      "MinimumCapacityUnits": 1,
      "MaximumOnDemandCapacityUnits": 10,
      "MaximumCapacityUnits": 10,
      "UnitType": "Instances"
    }
  }
}
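A script that checks a cluster's limits might pull individual numbers out of this output. The sketch below does that with a regular expression from the Java standard library, using a hypothetical ScalingPolicyOutput class. It is a quick check under the assumption that the output has the flat shape shown above; for anything more involved, a real JSON parser is the better choice.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; it is not part of any AWS library.
class ScalingPolicyOutput {

    // Returns the integer value of "<field>": <number>, or empty if the field is absent.
    // The quoted field name must match exactly, so "MaximumCapacityUnits" does not
    // accidentally match "MaximumOnDemandCapacityUnits".
    static OptionalInt intField(String json, String field) {
        Pattern pattern = Pattern.compile("\"" + Pattern.quote(field) + "\"\\s*:\\s*(\\d+)");
        Matcher matcher = pattern.matcher(json);
        return matcher.find()
            ? OptionalInt.of(Integer.parseInt(matcher.group(1)))
            : OptionalInt.empty();
    }

    public static void main(String[] args) {
        // The example output from get-managed-scaling-policy, collapsed to one line.
        String output = "{ \"ManagedScalingPolicy\": { \"ComputeLimits\": { "
            + "\"MinimumCapacityUnits\": 1, \"MaximumOnDemandCapacityUnits\": 10, "
            + "\"MaximumCapacityUnits\": 10, \"UnitType\": \"Instances\" } } }";

        System.out.println("min = " + intField(output, "MinimumCapacityUnits"));
        System.out.println("max = " + intField(output, "MaximumCapacityUnits"));
    }
}
```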
For more information about using Amazon EMR commands in the Amazon CLI, see https://docs.amazonaws.cn/cli/latest/reference/emr.
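Remove a managed scaling policy
The remove-managed-scaling-policy command, which calls the RemoveManagedScalingPolicy API operation, removes the policy configuration. For example, the following command removes the configuration from the cluster with the cluster ID j-123456.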
aws emr remove-managed-scaling-policy --cluster-id j-123456
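The apply, retrieve, and remove commands in this section take the same --cluster-id argument, so they are easy to drive from a small program. The following sketch assembles the three argument lists with a hypothetical EmrScalingCommands class and only prints them. Actually running them (for example through ProcessBuilder) assumes that the aws executable is installed and configured, which this sketch does not check.

```java
import java.util.List;

// Hypothetical helper for illustration only; it builds argument lists and runs nothing.
class EmrScalingCommands {

    // aws emr put-managed-scaling-policy with a policy read from a file:// URI.
    static List<String> put(String clusterId, String policyFileUri) {
        return List.of("aws", "emr", "put-managed-scaling-policy",
            "--cluster-id", clusterId,
            "--managed-scaling-policy", policyFileUri);
    }

    // aws emr get-managed-scaling-policy
    static List<String> get(String clusterId) {
        return List.of("aws", "emr", "get-managed-scaling-policy",
            "--cluster-id", clusterId);
    }

    // aws emr remove-managed-scaling-policy
    static List<String> remove(String clusterId) {
        return List.of("aws", "emr", "remove-managed-scaling-policy",
            "--cluster-id", clusterId);
    }

    public static void main(String[] args) {
        String clusterId = "j-123456"; // placeholder cluster ID from the examples above
        List<List<String>> commands = List.of(
            put(clusterId, "file://./managedscaleconfig.json"),
            get(clusterId),
            remove(clusterId));

        for (List<String> command : commands) {
            // To execute instead of print, one option is:
            // new ProcessBuilder(command).inheritIO().start().waitFor();
            System.out.println(String.join(" ", command));
        }
    }
}
```

Passing each argument as a separate list element avoids shell quoting issues, which is why the file-based form of the policy is used here rather than the shorthand string.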
Use the Amazon SDK for Java to configure managed scaling
The following program excerpt shows how to configure managed scaling using the Amazon SDK for Java:
package com.amazonaws.emr.sample;

import java.util.ArrayList;
import java.util.List;

import com.amazonaws.AmazonClientException;
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduce;
import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClientBuilder;
import com.amazonaws.services.elasticmapreduce.model.Application;
import com.amazonaws.services.elasticmapreduce.model.ComputeLimits;
import com.amazonaws.services.elasticmapreduce.model.ComputeLimitsUnitType;
import com.amazonaws.services.elasticmapreduce.model.InstanceGroupConfig;
import com.amazonaws.services.elasticmapreduce.model.JobFlowInstancesConfig;
import com.amazonaws.services.elasticmapreduce.model.ManagedScalingPolicy;
import com.amazonaws.services.elasticmapreduce.model.RunJobFlowRequest;
import com.amazonaws.services.elasticmapreduce.model.RunJobFlowResult;

public class CreateClusterWithManagedScalingWithIG {

    public static void main(String[] args) {
        AWSCredentials credentialsFromProfile = getCredentials("AWS-Profile-Name-Here");

        // Create an Amazon EMR client with the credentials and Region specified
        // in order to create the cluster.
        AmazonElasticMapReduce emr = AmazonElasticMapReduceClientBuilder.standard()
            .withCredentials(new AWSStaticCredentialsProvider(credentialsFromProfile))
            .withRegion(Regions.US_EAST_1)
            .build();

        // Create instance groups: primary, core, task.
        InstanceGroupConfig instanceGroupConfigMaster = new InstanceGroupConfig()
            .withInstanceCount(1)
            .withInstanceRole("MASTER")
            .withInstanceType("m4.large")
            .withMarket("ON_DEMAND");

        InstanceGroupConfig instanceGroupConfigCore = new InstanceGroupConfig()
            .withInstanceCount(4)
            .withInstanceRole("CORE")
            .withInstanceType("m4.large")
            .withMarket("ON_DEMAND");

        InstanceGroupConfig instanceGroupConfigTask = new InstanceGroupConfig()
            .withInstanceCount(5)
            .withInstanceRole("TASK")
            .withInstanceType("m4.large")
            .withMarket("ON_DEMAND");

        List<InstanceGroupConfig> igConfigs = new ArrayList<>();
        igConfigs.add(instanceGroupConfigMaster);
        igConfigs.add(instanceGroupConfigCore);
        igConfigs.add(instanceGroupConfigTask);

        // Specify applications to be installed and configured when Amazon EMR creates the cluster.
        Application hive = new Application().withName("Hive");
        Application spark = new Application().withName("Spark");
        Application ganglia = new Application().withName("Ganglia");
        Application zeppelin = new Application().withName("Zeppelin");

        // Managed scaling configuration -
        // Using UnitType=Instances for clusters composed of instance groups.
        //
        // Other options are:
        //   UnitType = VCPU (for clusters composed of instance groups)
        //   UnitType = InstanceFleetUnits (for clusters composed of instance fleets)
        ComputeLimits computeLimits = new ComputeLimits()
            .withMinimumCapacityUnits(1)
            .withMaximumCapacityUnits(20)
            .withUnitType(ComputeLimitsUnitType.Instances);

        ManagedScalingPolicy managedScalingPolicy = new ManagedScalingPolicy();
        managedScalingPolicy.setComputeLimits(computeLimits);

        // Create the cluster with a managed scaling policy.
        RunJobFlowRequest request = new RunJobFlowRequest()
            .withName("EMR_Managed_Scaling_TestCluster")
            // Specifies the version label for the Amazon EMR release; we recommend the latest release.
            .withReleaseLabel("emr-7.1.0")
            .withApplications(hive, spark, ganglia, zeppelin)
            // A URI in S3 for log files is required when debugging is enabled.
            .withLogUri("s3://path/to/my/emr/logs")
            // If you use a custom IAM service role, replace the default role with the custom role.
            .withServiceRole("EMR_DefaultRole")
            // If you use a custom Amazon EMR role for the EC2 instance profile,
            // replace the default role with the custom Amazon EMR role.
            .withJobFlowRole("EMR_EC2_DefaultRole")
            .withInstances(new JobFlowInstancesConfig()
                .withInstanceGroups(igConfigs)
                .withEc2SubnetId("subnet-123456789012345")
                .withEc2KeyName("my-ec2-key-name")
                .withKeepJobFlowAliveWhenNoSteps(true))
            .withManagedScalingPolicy(managedScalingPolicy);

        RunJobFlowResult result = emr.runJobFlow(request);

        // Print only the cluster (job flow) ID rather than the whole result object.
        System.out.println("The cluster ID is " + result.getJobFlowId());
    }

    public static AWSCredentials getCredentials(String profileName) {
        // Uses the named profile in .aws/credentials as the credentials provider.
        try {
            return new ProfileCredentialsProvider(profileName).getCredentials();
        } catch (Exception e) {
            throw new AmazonClientException(
                "Cannot load credentials from .aws/credentials file. "
                + "Make sure that the credentials file exists and that the profile name is defined within it.",
                e);
        }
    }

    public CreateClusterWithManagedScalingWithIG() {
    }
}