Manually scaled groups
You can create fixed-size instance groups and manage their size manually based on your current computing needs.
Automatically scaled groups
When creating an automatically scaled instance group, you specify the target metric value, while the service continuously re-adjusts the number of instances:
- If the average metric value rises above the target, Instance Groups creates new instances in the group.
- If the average value drops so low that it would stay at or below the target even in a smaller group, Instance Groups deletes the unneeded instances.
For example, there are 4 instances in an availability zone with an average metric value of 70 and a target value of 80. Instance Groups doesn't reduce the group size, because deleting an instance would push the average value above the target: 4 × 70 / 3 = 93.3. When the average value drops to 60, Instance Groups deletes one instance, since the average value then doesn't exceed the target: 4 × 60 / 3 = 80.
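The scale-down check in this example can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the service's actual implementation: the function name `can_remove_one` is hypothetical, and the refusal to shrink a group of one instance is an assumption made to avoid dividing by zero.

```python
def can_remove_one(instances: int, average: float, target: float) -> bool:
    """Return True if the projected average stays at or below the target
    after one instance is removed (the same total load on fewer instances)."""
    if instances <= 1:
        # Assumption: never shrink a group of one; also avoids division by zero.
        return False
    projected = instances * average / (instances - 1)
    return projected <= target


print(can_remove_one(4, 70, 80))  # False: 4 * 70 / 3 = 93.3 exceeds 80
print(can_remove_one(4, 60, 80))  # True: 4 * 60 / 3 = 80 does not exceed 80
```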
Type of automatic scaling
The service can adjust the number of instances separately in each availability zone specified in the group settings or in the entire instance group:
- If zonal scaling is used, the service calculates for each availability zone its own average metric value for scaling and the required number of instances. The default type of automatic scaling is zonal.
- If regional scaling is used, the metric value and the number of instances are calculated for the entire group. To change the group's auto scaling type to regional, pass the auto_scale scaling policy with the
To reduce adjustment sensitivity, Instance Groups lets you configure:
- Stabilization period: After the number of instances in the group increases, instances aren't deleted until the stabilization period ends, even if the average metric value is low enough to allow it.
- Warm-up period: The period after an instance starts during which its own metric values, such as its CPU utilization, aren't used. Average metric values for the group are used instead.
- Utilization measurement period: The metric value is calculated as the average of all measurements made during the specified period. For example, the CPU load may rise to 100% one second and drop to 10% the next. To ignore such surges, Instance Groups uses average values for a given period, such as 1 minute.
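To make the effect of the measurement period concrete, here is a minimal sketch of averaging over a trailing window. The function name `averaged_value`, the `(timestamp, value)` sample format, and the sample data are all assumptions made for illustration; the service's own sampling details are not described here.

```python
from statistics import mean


def averaged_value(samples, period_seconds):
    """Average all measurements taken within the last `period_seconds`.

    `samples` is a list of (timestamp_seconds, value) pairs.
    """
    if not samples:
        raise ValueError("no measurements available")
    latest = max(timestamp for timestamp, _ in samples)
    window = [value for timestamp, value in samples
              if latest - timestamp < period_seconds]
    return mean(window)


# One short surge to 100% among readings of 10% over one minute.
cpu = [(0, 10), (10, 100), (20, 10), (30, 10), (40, 10), (50, 10)]
print(averaged_value(cpu, 60))  # 25
```

With a one-minute window the single surge raises the average only to 25%, so a target of, say, 75% would not trigger scaling.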
You can also set limits on the number of instances per group:
- Maximum group size: Instance Groups won't create more instances if a group already contains this many.
- Minimum size in a single availability zone: Instance Groups won't delete instances from an availability zone if there are only this many instances in the zone.
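The two limits can be pictured as simple bounds applied to whatever size the scaling calculation produces. The helper names below are hypothetical, and the sketch only mirrors the two rules as stated above.

```python
def cap_group_size(desired_total: int, max_group_size: int) -> int:
    """Never grow the group beyond the maximum group size."""
    return min(desired_total, max_group_size)


def deletable_in_zone(current_in_zone: int, min_zone_size: int) -> int:
    """How many instances may be deleted from a zone without going
    below the minimum size for a single availability zone."""
    return max(0, current_in_zone - min_zone_size)


print(cap_group_size(7, 5))     # 5
print(deletable_in_zone(2, 2))  # 0
```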
Metrics for automatic scaling
You can use the following metrics for automatic scaling:
- CPU utilization.
- Yandex Monitoring metrics.
Instance Groups can control the group size to keep average CPU utilization at the target level. The average CPU utilization is calculated separately for the instances in each availability zone or for all instances in the group (for the zonal or regional scaling type, respectively).
Outside the stabilization period, the service acts as follows:
Calculate the average CPU utilization during the specified measurement period for each instance, except those in the warm-up period. The load is measured several times per minute on every instance.
Use the obtained values to calculate the average load for each availability zone or across the entire group.
For example, a group of 4 instances is located in one availability zone. One of the instances is still warming up, while the others are under 90%, 75%, and 85% workload on average during the measurement period. Average load across the zone: (90 + 75 + 85) / 3 = 83.3%
Obtain the total load: multiply the resulting average load by the total number of instances.
In the example, 83.3 × 4 = 333.2%
Divide the total load by the target load level to obtain the number of instances needed (the result is rounded up).
Say, for example, the target level is 75%. This means that you need 333.2 / 75 = 4.44, rounded up to 5 instances. Since the group currently has 4 instances, Instance Groups needs to create one more.
After the number of instances is calculated and changed (if necessary), Instance Groups starts calculating the average load again.
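The calculation steps above can be sketched as a short Python function. This is an illustration of the arithmetic only, assuming that warming-up instances are excluded from the average but still counted in the group size; the function and parameter names are hypothetical.

```python
import math


def required_instances(measured_loads, warming_up_count, target_load):
    """Estimate the number of instances needed for a zone or a group.

    measured_loads: average CPU utilization (%) of each instance that is
        past its warm-up period.
    warming_up_count: instances still warming up (counted in the total,
        but not measured).
    target_load: target CPU utilization (%).
    """
    average_load = sum(measured_loads) / len(measured_loads)
    total_instances = len(measured_loads) + warming_up_count
    total_load = average_load * total_instances
    return math.ceil(total_load / target_load)


# Three measured instances at 90%, 75%, and 85%, one warming up, target 75%.
print(required_instances([90, 75, 85], 1, 75))  # 5
```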
You can use any Yandex Monitoring metrics for automatic scaling in Instance Groups.
When using monitoring metrics, specify the following in Instance Groups:
The name of the metric that you specified in Yandex Monitoring.
Labels that you specified in Yandex Monitoring:
folder_id: ID of the folder. By default, it's the ID of the folder that the group belongs to.
service: ID of the service. By default, custom. You can use this label to specify service metrics, such as the compute value for Compute Cloud.
Also specify other labels that characterize this metric.
The metric type that affects how Instance Groups computes the average metric value:
GAUGE: Used for metrics that show the metric value at a specific point in time, such as the number of requests per second to a server running on an instance. Instance Groups computes the average metric value for the specified averaging period.
COUNTER: Used for metrics that only grow over time, such as the total number of requests to a server running on an instance. Instance Groups calculates the average metric growth for the specified averaging period.
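The difference between the two metric types can be illustrated with a small sketch. The function names and the sample data are hypothetical, and expressing counter growth per second is an assumption: the text above does not state the unit that Instance Groups uses.

```python
def gauge_average(values):
    """GAUGE: average the point-in-time readings over the averaging period."""
    return sum(values) / len(values)


def counter_average_growth(samples):
    """COUNTER: average growth over the averaging period.

    `samples` is a time-ordered list of (timestamp_seconds, cumulative_value)
    pairs; the result is growth per second (an assumed unit).
    """
    (start_time, start_value) = samples[0]
    (end_time, end_value) = samples[-1]
    return (end_value - start_value) / (end_time - start_time)


# Requests per second reported directly as a gauge.
print(gauge_average([100, 120, 110]))  # 110.0

# Total requests reported as a counter over 60 seconds.
print(counter_average_growth([(0, 1000), (30, 4000), (60, 7000)]))  # 100.0
```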
Metric rule type:
UTILIZATION: The metric characterizes resource consumption by a single instance.
For a UTILIZATION metric, the number of instances per availability zone or in the entire group (for the zonal or regional scaling type, respectively) is calculated in the same way as for CPU utilization.
When delivered in Yandex Monitoring, the UTILIZATION metric must have the
WORKLOAD: The metric characterizes the total workload on all instances in a single availability zone or the entire group (for the zonal or regional scaling type, respectively).
To calculate the number of instances per availability zone or in the entire group for a WORKLOAD metric, the average metric value is divided by the target value and the result is rounded up.
For example, there are two instances in an availability zone. The metric characterizes the total number of requests per second (RPS) to all instances. If the target metric value is 200, then, with an average value of 450, Instance Groups will increase the number of instances in the availability zone to three: 450/200 = 2.25 ~ 3 instances.
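The WORKLOAD calculation in this example amounts to a single division rounded up. A minimal sketch, with a hypothetical function name:

```python
import math


def instances_for_workload(average_workload: float, target_per_instance: float) -> int:
    """Divide the total workload by the per-instance target and round up."""
    return math.ceil(average_workload / target_per_instance)


# 450 RPS in total with a target of 200 RPS per instance.
print(instances_for_workload(450, 200))  # 3
```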
The metric value is calculated and used at all times, including the instance warm-up period specified in the general settings.
If zonal scaling is applied to the group, when delivered in Yandex Monitoring, the WORKLOAD metric must have the
The target metric value that Instance Groups uses to calculate the required number of instances. For UTILIZATION metrics, the target value is the desired level of resource consumption by each instance. For WORKLOAD metrics, it's the maximum allowed workload on each instance.