Backend groups
A backend group defines the settings based on which the L7 load balancer sends traffic to the backend endpoints:
- The protocol for connecting to the application instances.
- Settings for the endpoint health checks.
- Rules for traffic distribution between endpoints.
The backend group includes a list of backends. Each backend, depending on its type, points to resources that act as application endpoints: VMs in target groups or a bucket with files. You can assign a relative weight to each backend. Traffic between backends is distributed proportionally to these weights. Protocols, health checks, and traffic distribution are configured separately for each backend. By using a group of multiple backends, you can split traffic between different application versions when running updates or experiments.
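As a rough illustration of weight-proportional distribution, the sketch below computes the share of traffic each backend would receive and picks a backend for a single request. The backend names (`stable`, `canary`) and the function names are hypothetical, and the load balancer's actual selection logic is not described here; this only shows the arithmetic.

```python
import random


def traffic_shares(weights: dict[str, int]) -> dict[str, float]:
    """Return the fraction of traffic per backend; zero-weight backends get none."""
    active = {name: w for name, w in weights.items() if w > 0}
    total = sum(active.values())
    if total == 0:
        raise ValueError("at least one backend must have a positive weight")
    return {name: w / total for name, w in active.items()}


def pick_backend(weights: dict[str, int], rng: random.Random) -> str:
    """Pick one backend for a request with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical split for a canary rollout: 90% to the old version, 10% to the new one.
weights = {"stable": 90, "canary": 10}
print(traffic_shares(weights))
print(pick_backend(weights, random.Random()))
```

Raising the `canary` weight step by step shifts more traffic to the new version without changing the rest of the configuration.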
If a backend group is used in at least one HTTP router or load balancer, you cannot delete it. First, you need to delete it from all HTTP routers and load balancers.
Backend group types
The type of a backend group determines what traffic the load balancer will send to it:
- HTTP: HTTP or HTTPS traffic.
- gRPC: HTTP or HTTPS traffic with a gRPC call.
- Stream: Unencrypted TCP traffic or TCP traffic with TLS encryption support.
Groups of the HTTP and gRPC types connect to listeners of the HTTP type via HTTP routers. Groups of the Stream type connect to Stream listeners directly.
Alert
You can only select a backend group's type when creating it. You can't change the type of an existing group.
Backend types
Backends in HTTP groups can be of two types:
- One or more target groups: Sets of IP addresses of Compute Cloud VM instances that your network applications are running on. Traffic between all VMs in target groups belonging to the same backend is distributed evenly based on the backend settings and results of health checks.
- Object Storage bucket: Set of files (objects) and settings related to their storage. For more information, see Bucket in Object Storage in the Object Storage documentation.
Warning
To use a bucket as a backend, grant public access to the list of objects in the bucket and permission to read them.
In gRPC and Stream groups, only backends consisting of one or more target groups are supported.
Session affinity
If you want requests from one user session to be processed by the same application endpoint, enable session affinity for a backend group.
Note
Currently, session affinity only works if the backend group has a single active backend (one with a positive weight), this backend includes one or more target groups, and the MAGLEV_HASH load balancing mode is selected for it.
The session affinity mode determines how incoming requests are grouped into one session. HTTP and gRPC backend groups support the following modes:
- By IP address: Requests received from the same IP address are combined into a session.
- By HTTP header: Requests with the same value of the specified HTTP header, such as a header with user authentication data, are combined into a session.
- By cookie: Requests with the same value of the cookie with the specified name are combined into a session.
  - If session affinity settings include a cookie lifetime, the load balancer generates a cookie with a unique value and sends it in its response to a user's first request. To use session cookies that are stored on a client, such as a browser, and reset when it restarts, specify a lifetime of 0.
  - If a lifetime is not specified, the load balancer does not generate cookies. Instead, it only uses cookie values from incoming requests to bind sessions.
Stream backend groups only support session affinity by client IP address.
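The affinity modes differ only in which part of the request identifies a session. A minimal sketch of that choice follows; the mode strings, the `session_key` function, and the request representation are hypothetical and are not part of the service's API.

```python
from typing import Optional


def session_key(mode: str, client_ip: str, headers: dict, cookies: dict,
                name: str = "") -> Optional[str]:
    """Return the value that groups requests into one session, or None if absent.

    mode is one of "ip", "header", "cookie" (labels invented for this sketch);
    name is the header or cookie name configured for the affinity mode.
    """
    if mode == "ip":
        return client_ip
    if mode == "header":
        # HTTP header names are case-insensitive, so compare them in lower case.
        lowered = {k.lower(): v for k, v in headers.items()}
        return lowered.get(name.lower())
    if mode == "cookie":
        return cookies.get(name)
    raise ValueError(f"unknown affinity mode: {mode}")


print(session_key("header", "203.0.113.7", {"X-User": "alice"}, {}, "x-user"))
```

Requests that produce the same key would then be sent to the same endpoint; a Stream group would only ever use the `"ip"` branch.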
Protocol and load balancing settings
For backends consisting of target groups, you can configure:
- A protocol for communicating with the load balancer.
- Traffic balancing between endpoints.
Protocol
The load balancer can establish unencrypted backend connections and backend connections with TLS encryption. When using TLS, the load balancer does not validate certificates returned by backends. However, you can specify certificates from Certificate Authorities that the load balancer will trust when establishing a secure connection with backend endpoints.
If the type of a backend group is HTTP, you can use HTTP/1.1 or HTTP/2 for exchanging data between the load balancer and backend endpoints. Backend groups of the gRPC type only support HTTP/2 connections.
Balancing mode
In the backend settings, you can specify the mode for distributing traffic between backend endpoints (target group VMs):
- ROUND_ROBIN: All endpoints receive requests in turn. After each endpoint has received one request, it's the turn of the first endpoint again, and so on.
- RANDOM: A random endpoint is selected to process a request. If no health checks are configured for the backend, random distribution helps avoid increased workloads on the endpoint that, under round-robin distribution, would come next after a non-working endpoint.
- LEAST_REQUEST: Requests are distributed based on endpoint load using the power of two random choices algorithm. Two backend endpoints are randomly selected and the request is received by the one with fewer connections. The algorithm reduces the load on the most loaded backend endpoint. For more information about the performance and efficiency of the algorithm, see The Power of Two Random Choices: A Survey of Techniques and Results (Mitzenmacher et al.).
- MAGLEV_HASH: Requests are distributed using the Maglev hashing algorithm.
  - For each endpoint, the hash function value is calculated in the range from 0 to 65536.
  - Based on the resulting values, a hash table of 65537 rows is fully populated so that each endpoint corresponds to the same number of rows.
  - For each incoming request, the load balancer calculates the value of the same hash function and finds a row with the corresponding number in the hash table. This row indicates the endpoint that will process the request. If session affinity is enabled for a group of backends, the hash function is evaluated based on the client's IP address, the HTTP header value, or the cookie value, depending on the affinity mode.

  For more information about the performance and efficiency of the Maglev hashing algorithm, see Maglev: A Fast and Reliable Software Network Load Balancer (Eisenbud et al.; chapter 3.4).

  Note
  If a group with session affinity enabled consists of several active backends, the backends with the MAGLEV_HASH mode selected will distribute traffic randomly whereas the remaining backends will operate in their selected modes.
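The MAGLEV_HASH steps can be sketched as follows. This is a simplified illustration of the table population scheme from the Maglev paper (a per-endpoint permutation defined by an offset and a skip value), not the load balancer's actual code. The use of SHA-256 as the hash function, the function names, and the endpoint addresses are assumptions made for the example.

```python
import hashlib
from collections import Counter

TABLE_SIZE = 65537  # prime number of rows, as in the description above


def _hash(value: str, salt: str) -> int:
    """Stable 64-bit hash; SHA-256 is an illustrative choice only."""
    digest = hashlib.sha256(f"{salt}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def build_table(endpoints: list[str], size: int = TABLE_SIZE) -> list[str]:
    """Fill every row of the lookup table; size must be prime."""
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    offsets = [_hash(e, "offset") % size for e in endpoints]
    skips = [_hash(e, "skip") % (size - 1) + 1 for e in endpoints]
    position = [0] * len(endpoints)  # progress through each endpoint's permutation
    table = [None] * size
    filled = 0
    while True:
        # Endpoints take turns, each claiming its next preferred free row.
        for i, endpoint in enumerate(endpoints):
            row = (offsets[i] + position[i] * skips[i]) % size
            while table[row] is not None:
                position[i] += 1
                row = (offsets[i] + position[i] * skips[i]) % size
            table[row] = endpoint
            position[i] += 1
            filled += 1
            if filled == size:
                return table


def pick_endpoint(table: list[str], key: str) -> str:
    """Map a request key (for example, a client IP) to a table row."""
    return table[_hash(key, "request") % len(table)]


table = build_table(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(Counter(table))                      # row counts differ by at most one
print(pick_endpoint(table, "203.0.113.7"))  # same key always gives the same endpoint
```

Because the endpoints claim rows in turn, each one ends up with nearly the same number of rows, and a given key always resolves to the same endpoint while the endpoint set is unchanged.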
Panic mode
Panic mode protects against the failure of all application instances when the load increases sharply.
In this mode, the load balancer distributes requests across all endpoints, ignoring health check results. You can set the threshold percentage of healthy endpoints: panic mode is triggered when the share of healthy endpoints drops below it.
If you don't use panic mode, failure of some backends will further increase the load on backends that are still running. If an application is running at its maximum capacity, all backends will fail, making your service completely unavailable. If you enable panic mode, traffic is again distributed across all your endpoints. Although some requests might fail, the service stays operable. This provides time to increase the application's computing resources automatically or manually.
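The decision can be sketched as a small function. It assumes that panic mode turns on when the percentage of healthy endpoints falls below the configured threshold; the function and parameter names are hypothetical.

```python
def routable_endpoints(health: dict[str, bool], panic_threshold_percent: float) -> list[str]:
    """Return the endpoints that may receive traffic.

    health maps an endpoint to its latest health check result.
    Assumption: panic mode is active when the healthy share is below the threshold.
    """
    if not health:
        return []
    healthy = [endpoint for endpoint, ok in health.items() if ok]
    healthy_percent = 100 * len(healthy) / len(health)
    if healthy_percent < panic_threshold_percent:
        # Panic mode: ignore health checks and spread traffic over all endpoints.
        return list(health)
    return healthy


status = {"vm-1": True, "vm-2": False, "vm-3": False, "vm-4": False}
print(routable_endpoints(status, 50))  # 25% healthy is below 50%: all four endpoints
print(routable_endpoints(status, 20))  # 25% healthy is above 20%: only vm-1
```

With a threshold of 0, this sketch never enters panic mode, which corresponds to not using it.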
Locality aware routing
By default, the load balancer evenly distributes traffic between all endpoints of the backend's target groups. If the application is running in multiple availability zones, you can configure the L7 load balancer to send requests to endpoints in the availability zone where the load balancer accepted the request. If no backends are running in this availability zone, the load balancer will send the request to another zone.
If strict locality is enabled, the load balancer will respond with an error (503 Service Unavailable) if no application backends are running in the availability zone that accepted the request.
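A minimal sketch of this zone preference is shown below. The zone names, the `candidate_endpoints` function, and the exception that stands in for the 503 response are hypothetical; how the load balancer picks among the remaining zones is not specified here, so the sketch simply returns all endpoints from other zones.

```python
class NoLocalEndpoints(Exception):
    """Stands in for a 503 Service Unavailable response in this sketch."""


def candidate_endpoints(endpoints_by_zone: dict[str, list[str]],
                        ingress_zone: str,
                        strict_locality: bool = False) -> list[str]:
    """Prefer endpoints in the zone that accepted the request."""
    local = endpoints_by_zone.get(ingress_zone, [])
    if local:
        return list(local)
    if strict_locality:
        raise NoLocalEndpoints(f"no endpoints in {ingress_zone}")
    # Fall back to endpoints running in the other zones.
    return [ep for zone, eps in endpoints_by_zone.items()
            if zone != ingress_zone for ep in eps]


zones = {"zone-a": ["a1", "a2"], "zone-b": ["b1"], "zone-c": []}
print(candidate_endpoints(zones, "zone-a"))  # local endpoints only
print(candidate_endpoints(zones, "zone-c"))  # falls back to other zones
```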
Health checks
You can enable health checks for backends consisting of target groups. The load balancer will send health check requests to the endpoints at certain intervals and wait for a response during a given timeout.
Health checks of the HTTP, gRPC, and Stream types are supported. They match the backend group types. However, the type of a health check does not have to be the same as the group type.
The following health check settings are supported:
- Timeout: Response waiting time.
- Interval: Amount of time between health check requests.
- Resource health indicators: The threshold amount of successful or failed results. If a threshold is exceeded, it indicates that the check passed or failed, respectively.
- HTTP health check settings:
  - Domain name for the Host header (HTTP/1.1) or the :authority pseudo-header (HTTP/2).
  - Path in the URI of a request to the endpoint.
  - HTTP/2 flag.
- Settings of gRPC health checks:
  - Name of the service being checked.
- Settings of Stream health checks (TCP):
  - Request body.
  - Substring in the response that indicates that the health check was successful.

  If the request body or response body is not specified, a successful connection to the backend is checked.
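The threshold-based health indicators can be pictured as a small state machine. The sketch below assumes that thresholds count consecutive results and that an endpoint starts out healthy; neither detail is stated above, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    healthy_threshold: int     # successes needed to mark an endpoint healthy
    unhealthy_threshold: int   # failures needed to mark an endpoint unhealthy
    healthy: bool = True       # assumption: endpoints start as healthy
    _streak: int = 0           # consecutive results that contradict the current state

    def record(self, success: bool) -> bool:
        """Register one health check result and return the resulting state."""
        if success == self.healthy:
            self._streak = 0
        else:
            self._streak += 1
            limit = self.unhealthy_threshold if self.healthy else self.healthy_threshold
            if self._streak >= limit:
                self.healthy = not self.healthy
                self._streak = 0
        return self.healthy


tracker = HealthTracker(healthy_threshold=2, unhealthy_threshold=3)
checks = [False, False, True, False, False, False, True, True]
print([tracker.record(result) for result in checks])
# [True, True, True, True, True, False, False, True]
```

In the run above, two failures followed by a success do not change the state; only three failures in a row mark the endpoint as unhealthy, and two successes in a row bring it back.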
Note that if the backend is configured to use TLS with the target group endpoints, health checks also use TLS. For example:
- If the type of a health check is HTTP, it will be made over HTTPS.
- For Stream health checks, a TLS connection will be established and the check results will be returned through this connection.