Scaling types

Updated on February 15, 2024

When creating each instance group, you will need to choose its scaling type, which determines whether the number of instances in the group will change automatically or manually.

Note

If you pause processes (switch them to the PAUSED status) in an instance group, it will not scale up.

Manually scaled groups

You can create fixed-size instance groups and manage their size manually based on your current computing needs.

Automatically scaled groups

When creating an automatically scaled instance group, you specify the target metric value, while the service continuously re-adjusts the number of instances:

If the average metric value rises above the target, Instance Groups will create new instances in the group.
If the average value would stay at or below the target even with a smaller group, Instance Groups will delete the unnecessary instances.

This is done to ensure that the average metric value within the same availability zone or the entire group (depending on the automatic scaling type) does not differ much from the target value.

For example, let's assume there are 4 instances in an availability zone with an average metric value of 70 and a target value of 80. Instance Groups will not reduce the group size, because with one instance deleted, the same load spread across the remaining three would push the average above the target: 4 × 70 / 3 = 93.3. When the average value drops to 60, Instance Groups will delete one instance, since the resulting average does not exceed the target: 4 × 60 / 3 = 80.
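
The check in this example can be sketched in a few lines of Python. This is only an illustration of the arithmetic above, not the service's actual implementation; the function name `can_remove_one` and the rule of never shrinking below one instance are assumptions made for the sketch.

```python
def can_remove_one(instance_count: int, average: float, target: float) -> bool:
    """Return True if the projected average stays at or below the target
    after one instance is removed (hypothetical helper)."""
    if instance_count <= 1:
        # Assumption for this sketch: never shrink below one instance.
        return False
    total = instance_count * average              # total load across the zone
    projected_average = total / (instance_count - 1)
    return projected_average <= target


print(can_remove_one(4, 70, 80))  # False: 4 * 70 / 3 = 93.3 exceeds 80
print(can_remove_one(4, 60, 80))  # True: 4 * 60 / 3 = 80 does not exceed 80
```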

If multiple metrics are specified in the settings, the largest estimated instance group size is used.
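
As a rough illustration of this rule, the sketch below takes the size estimated for each metric and picks the largest. The function name and the metric names `cpu` and `rps` are hypothetical and used only for the example.

```python
def group_size_for_metrics(estimates):
    """Pick the largest of the per-metric size estimates (hypothetical helper).

    `estimates` maps a metric name to the group size estimated for it.
    """
    if not estimates:
        raise ValueError("at least one metric estimate is required")
    return max(estimates.values())


# Hypothetical estimates: CPU utilization calls for 5 instances, RPS for 3.
print(group_size_for_metrics({"cpu": 5, "rps": 3}))  # 5
```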

For automatically scaled groups, you need to specify common scaling settings and metric settings.

Type of automatic scaling

Instance Groups can adjust the number of instances separately in each availability zone specified in the group settings or in the entire instance group:

With zonal scaling, Instance Groups will calculate an average metric value for scaling and the required number of instances for each availability zone. This type of automatic scaling is used by default.
With regional scaling, the metric value and the number of instances are calculated for the entire group. To change the group auto scaling type to regional, specify the auto_scale scaling policy with the auto_scale_type: REGIONAL key.

General settings

To make scaling less sensitive to short-term fluctuations, you can configure the following in Instance Groups:

Stabilization period: After the number of VMs increases, the group size will not decrease until the end of a stabilization period, even if the average value of the metric has become sufficiently low.
Warm-up period: Period after a VM starts during which Instance Groups ignores the following values from that VM:
- CPU utilization.
- Monitoring metric values that are applied according to the UTILIZATION rule.
Average metric values for the group are used instead.
Utilization measurement period: Metric value will be calculated as an average of all measurements taken during the specified period.

For example, the CPU load may rise to 100% in one second and then drop to 10%. To ignore such surges, Instance Groups will use average values for the specified period, such as one minute.
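
The effect of averaging can be seen in a small Python sketch. It assumes, purely for illustration, one measurement per second over a one-minute period; the function name is hypothetical and the real sampling rate is not specified here.

```python
from statistics import fmean


def averaged_utilization(samples):
    """Average all measurements taken during the measurement period."""
    if not samples:
        raise ValueError("no measurements in the period")
    return fmean(samples)


# Assumed data: 60 one-second samples at 10% load with a single 100% spike.
samples = [10.0] * 30 + [100.0] + [10.0] * 29
print(averaged_utilization(samples))  # 11.5
```

A one-second spike to 100% moves the one-minute average only from 10% to 11.5%, so it would not trigger scaling on its own.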

You can also set limits on the number of instances per group:

Maximum group size: Instance Groups will not create more instances if a group already contains this many.
Minimum size in a single availability zone: Instance Groups will not delete instances from an availability zone if there are only this many instances in the zone.
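
These two limits can be read as simple guards on creating and deleting instances. The following sketch uses hypothetical function names and shows only the conditions described above, not how the service enforces them.

```python
def may_create(group_size: int, max_group_size: int) -> bool:
    """No new instances once the group already contains the maximum number."""
    return group_size < max_group_size


def may_delete(zone_size: int, min_zone_size: int) -> bool:
    """No deletions from a zone that is already at its minimum size."""
    return zone_size > min_zone_size


print(may_create(group_size=6, max_group_size=6))  # False
print(may_delete(zone_size=3, min_zone_size=2))    # True
```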

Metrics for automatic scaling

You can use the following metrics for automatic scaling:

CPU utilization.
Any metrics from Yandex Monitoring.

CPU utilization

Instance Groups can control the group size to keep the average CPU utilization at the target level. The average CPU utilization is calculated separately for each availability zone or for the entire group (for the zonal or regional scaling type, respectively).

Here is what Instance Groups will do outside the stabilization period:

Calculate the average CPU utilization during the specified measurement period for each instance, except those that are still warming up. The load is measured several times per minute on every instance.
Use the obtained values to calculate the average load for each availability zone or across the entire group.

For example, let's assume there is a group of four instances located in one availability zone. One instance is still warming up, while the other three averaged 90%, 75%, and 85% load during the measurement period. The average load across the zone is (90 + 75 + 85) / 3 ≈ 83.3%.
Obtain the total load, i.e., multiply the resulting average load by the total number of instances.

In our example, it is 83.3 × 4 = 333.2%.
Divide the total load by the target load level to obtain the number of instances required (the result is rounded up).

Say, for example, the target level is 75%. This means that you need 333.2 / 75 ≈ 4.44, rounded up to 5 instances. Since the group currently has four instances, one more needs to be created.

Once the number of instances is calculated and changed (if required), Instance Groups will start calculating the average load again.
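
The calculation steps above can be sketched in Python as follows. This is a minimal illustration of the described arithmetic, not the service's code; the function name and parameters are hypothetical.

```python
import math
from statistics import fmean


def required_instances(loads, total_instances, target):
    """Estimate the number of instances needed (hypothetical helper).

    loads: average CPU load (%) of each instance that has finished warming up.
    total_instances: all instances in the zone or group, including warming-up ones.
    target: target CPU load level (%).
    """
    if not loads or target <= 0:
        raise ValueError("need at least one load value and a positive target")
    average_load = fmean(loads)                   # average across measured instances
    total_load = average_load * total_instances   # scale to the whole zone or group
    return math.ceil(total_load / target)         # round up


# Four instances, one still warming up, target level 75%.
print(required_instances([90, 75, 85], total_instances=4, target=75))  # 5
```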

Monitoring metrics

You can use up to three Monitoring metrics for automatic scaling in Instance Groups. To read the metrics, the service account linked to the instance group needs at least the monitoring.viewer role.

When using monitoring metrics, specify the following in Instance Groups:

Metric name you specified in Monitoring.
Labels you specified in Monitoring:
(optional) folder_id: ID of the folder. By default, it is the ID of the folder the group belongs to.
(optional) service: ID of the service. The default value is custom. To use metrics of a cloud service, set this label to the service ID, e.g., compute for Compute Cloud.

You will also need to specify other labels for this metric:

Metric type that affects how Instance Groups computes the average metric value:
- GAUGE: Used for metrics that show the metric value at a specific point in time, such as the number of requests per second to a server running on an instance. Instance Groups computes the average metric value for the specified averaging period.
- COUNTER: Used for metrics that only grow over time, such as the total number of requests to a server running on an instance. Instance Groups calculates the average metric growth for the specified averaging period.
Metric rule type:
- UTILIZATION: Metric will show resource consumption by a single instance.
  
  The number of instances per availability zone or in the entire group (for the zonal or regional scaling type, respectively) by the UTILIZATION metric is calculated in the same way as the number of instances by CPU utilization.
  
  When delivered to Monitoring, the UTILIZATION metric must have the instance_id label.
- WORKLOAD: Metric will show the total workload on all instances in a single availability zone or the entire group (for the zonal or regional scaling type, respectively).
  
  To calculate the number of instances per availability zone or in the entire group by the WORKLOAD metric, the average metric value is divided by the target value, and the result is rounded up.
  
  For example, let's assume there are two instances in an availability zone. The metric shows the total number of requests per second (RPS) to all instances. If the target metric value is 200, then, with an average value of 450, Instance Groups will increase the number of instances in the availability zone to three: 450/200 = 2.25 ~ 3 instances.
  
  The metric value is also calculated and used during the instance warm-up period specified in the general settings.
  
  If zonal scaling is applied to the group, the WORKLOAD metric must have the zone_id label when delivered to Monitoring.
Target metric value by which Instance Groups calculates the required number of VM instances. For UTILIZATION metrics, the target value is the required level of resource consumption by each instance. For WORKLOAD metrics, it is the maximum allowed workload on each instance.
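
The WORKLOAD calculation described above (average metric value divided by the target value, rounded up) can be sketched as follows. The function name is hypothetical, and the sketch covers only the arithmetic, not warm-up handling or label filtering.

```python
import math


def instances_for_workload(average_workload: float, target_per_instance: float) -> int:
    """Divide the total workload by the per-instance target and round up."""
    if target_per_instance <= 0:
        raise ValueError("target must be positive")
    return math.ceil(average_workload / target_per_instance)


# Average of 450 RPS across the zone with a target of 200 RPS per instance.
print(instances_for_workload(450, 200))  # 3
```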
