Creating an Apache Kafka® cluster
A Managed Service for Apache Kafka® cluster is one or more broker hosts where topics and their partitions are located. Producers and consumers can work with these topics by connecting to Managed Service for Apache Kafka® cluster hosts.
Note
- The number of broker hosts you can create along with a Managed Service for Apache Kafka® cluster depends on the selected disk type and host class.
- Available disk types depend on the selected host class.
Warning
If you create a cluster with more than one host, three dedicated ZooKeeper hosts will be added to the cluster. For more information, see Relationship between resources in Managed Service for Apache Kafka®.
Creating Managed Service for Apache Kafka® clusters
Prior to creating a Managed Service for Apache Kafka® cluster, calculate the minimum storage size for topics.
- In the management console, go to the appropriate folder.
- In the list of services, select Managed Service for Kafka.
- Click Create cluster.
- Under Basic parameters:
  - Enter a name and description for the Managed Service for Apache Kafka® cluster. The cluster name must be unique within the folder.
  - Select the environment where you want to create the cluster (you cannot change the environment once the cluster is created):
    - `PRODUCTION`: For stable versions of your apps.
    - `PRESTABLE`: For testing purposes. The prestable environment is similar to the production environment and is likewise covered by the SLA, but it is the first to get new functionalities, improvements, and bug fixes. In the prestable environment, you can test compatibility of new versions with your application.
  - Select the Apache Kafka® version.
  - To manage data schemas using Managed Schema Registry, enable the Schema registry setting.

    Warning

    Once enabled, the Schema registry setting cannot be disabled.
- Under Host class, select the platform, host type, and host class.

  The host class defines the technical specifications of the VMs that Apache Kafka® brokers will be deployed on. All available options are listed under Host classes.

  Changing the host class for a Managed Service for Apache Kafka® cluster changes the characteristics of all instances already created.
- Under Storage:
  - Select the disk type.

    The selected type determines the increments in which you can change the disk size:
    - Network HDD and SSD storage: In 1 GB increments.
    - Local SSD storage:
      - For Intel Cascade Lake: In 100 GB increments.
      - For Intel Ice Lake: In 368 GB increments.
    - Non-replicated SSD storage: In 93 GB increments.

    You cannot change the disk type for a Managed Service for Apache Kafka® cluster once you create it.
  - Select the storage size to use for data.
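Because the disk size can only change in the increments listed above, a requested size is effectively rounded up to the next multiple of the increment for the chosen disk type. A minimal sketch of that rounding (the helper function is hypothetical; only the increment values come from the list above):

```shell
#!/usr/bin/env bash
# Round a requested size up to the next multiple of the disk type's increment.
round_up_gb() {  # usage: round_up_gb <requested_size_gb> <increment_gb>
  local size=$1 inc=$2
  echo $(( (size + inc - 1) / inc * inc ))
}

round_up_gb 100 93    # non-replicated SSD: prints 186
round_up_gb 250 100   # local SSD (Cascade Lake): prints 300
round_up_gb 400 368   # local SSD (Ice Lake): prints 736
```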
- Under Network settings:
  - Select one or more availability zones to host the Apache Kafka® brokers. If you create a Managed Service for Apache Kafka® cluster with a single availability zone, you will not be able to increase the number of zones and brokers later.
  - Select a network.
  - Select subnets in each availability zone for this network. To create a new subnet, click Create new next to the availability zone in question.

    Note

    For a Managed Service for Apache Kafka® cluster with multiple broker hosts, you need to specify subnets in each availability zone even if you plan to host brokers only in some of them. These subnets are required to host three ZooKeeper hosts, one in each availability zone. For more information, see Resource relationships in the service.
  - Select security groups for the cluster's network traffic.
  - To access broker hosts from the internet, select Public access. In this case, you can only connect to them over an SSL connection. For more information, see Connecting to topics in a cluster.
- Under Hosts:
  - Specify the number of Apache Kafka® broker hosts to be located in each of the selected availability zones.

    When choosing the number of hosts, keep in mind that:
    - Replication is possible only if the Managed Service for Apache Kafka® cluster has at least two hosts.
    - If you selected `local-ssd` or `network-ssd-nonreplicated` under Storage, you need to add at least three hosts to the cluster.
    - To make your cluster fault-tolerant, you need to meet certain conditions.
    - Adding more than one host to the cluster automatically adds three ZooKeeper hosts.
  - (Optional) Select groups of dedicated hosts to host the Managed Service for Apache Kafka® cluster.

    Alert

    You cannot edit this setting after you create a cluster. The use of dedicated hosts significantly affects cluster pricing.
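The host-count constraints above can be expressed as a small pre-flight check. A sketch, assuming only the rules stated on this page (the `check_hosts` helper itself is made up; the disk type names match those used elsewhere in this guide):

```shell
#!/usr/bin/env bash
# Pre-flight check of the broker host count against the constraints above.
check_hosts() {  # usage: check_hosts <disk_type> <broker_hosts>
  local disk_type=$1 hosts=$2
  case "$disk_type" in
    local-ssd|network-ssd-nonreplicated)
      if [ "$hosts" -lt 3 ]; then
        echo "error: $disk_type requires at least 3 broker hosts"
        return 1
      fi ;;
  esac
  [ "$hosts" -lt 2 ] && echo "warning: replication needs at least 2 hosts"
  [ "$hosts" -gt 1 ] && echo "note: 3 ZooKeeper hosts will be added"
  return 0
}

check_hosts network-ssd-nonreplicated 2   # fails: needs at least 3 hosts
check_hosts network-ssd 3                 # passes; ZooKeeper hosts will be added
```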
- If you specify more than one broker host, under ZooKeeper host class, specify the characteristics of the ZooKeeper hosts to place in each of the selected availability zones.
- Configure additional Managed Service for Apache Kafka® cluster settings, if required:
  - Maintenance window: Settings for the maintenance window:
    - To allow maintenance at any time, select arbitrary (default).
    - To specify the preferred maintenance start time, select by schedule and specify the desired day of the week and UTC hour. For example, you can choose a time when the cluster is least loaded.

    Maintenance operations are carried out on both enabled and disabled clusters. They may include updating the DBMS, applying patches, and so on.
  - Data Transfer access: Enable this option to allow access to the cluster from Yandex Data Transfer in Serverless mode.

    This will enable you to connect to Yandex Data Transfer running in Kubernetes via a special network. It will also speed up other operations, such as transfer launch and deactivation.
  - Deletion protection: Manages cluster protection from accidental deletion by a user.

    Cluster deletion protection will not prevent a manual connection to a cluster to delete data.
  - Schema registry: Enable this option to manage data schemas using Managed Schema Registry.

    Warning

    Once enabled, the Schema registry setting cannot be disabled.
- Configure the Apache Kafka® settings, if required.
- Click Create.
- Wait until the Managed Service for Apache Kafka® cluster is ready: its status on the Managed Service for Apache Kafka® dashboard will change to `Running` and its state to `Alive`. This may take some time.
If you do not have the Yandex Cloud command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the `--folder-name` or `--folder-id` parameter.
- View the description of the CLI command for creating a Managed Service for Apache Kafka® cluster:

  yc managed-kafka cluster create --help
- Specify the Managed Service for Apache Kafka® cluster parameters in the create command (the example shows only some of the parameters):

  yc managed-kafka cluster create \
    --name <cluster_name> \
    --environment <environment> \
    --version <version> \
    --network-name <network_name> \
    --subnet-ids <subnet_IDs> \
    --brokers-count <number_of_brokers_per_zone> \
    --resource-preset <host_class> \
    --disk-type <disk_type> \
    --disk-size <storage_size_GB> \
    --assign-public-ip <public_access> \
    --security-group-ids <list_of_security_group_IDs> \
    --deletion-protection=<deletion_protection>
  Where:

  - `--environment`: Cluster environment, `prestable` or `production`.
  - `--version`: Apache Kafka® version. Acceptable values: 2.8, 3.0, 3.1, or 3.2.
  - `--disk-type`: Storage type, `local-ssd` or `local-hdd`.
  - `--deletion-protection`: Cluster protection from accidental deletion by a user, `true` or `false`.

  Tip

  You can also configure the Apache Kafka® settings here, if required.

  Cluster deletion protection will not prevent a manual connection to a cluster to delete data.
- To set up a maintenance window (including for disabled Managed Service for Apache Kafka® clusters), provide the required value in the `--maintenance-window` parameter when creating your cluster:

  yc managed-kafka cluster create \
    ...
    --maintenance-window type=<maintenance_type>,day=<day_of_week>,hour=<hour_of_day>
  Where `type` is the maintenance type:

  - `anytime` (default): Any time.
  - `weekly`: On a schedule. If setting this value, also specify the day of week and the hour:
    - `day`: Day of week in `DDD` format: `MON`, `TUE`, `WED`, `THU`, `FRI`, `SAT`, or `SUN`.
    - `hour`: Hour (UTC) in `HH` format: `1` to `24`.
- To allow access to the cluster from Yandex Data Transfer in Serverless mode, provide the `--datatransfer-access` parameter.

  This will enable you to connect to Yandex Data Transfer running in Kubernetes via a special network. It will also speed up other operations, such as transfer launch and deactivation.
- To create a Managed Service for Apache Kafka® cluster hosted on groups of dedicated hosts, specify the host group IDs as a comma-separated list in the `--host-group-ids` parameter when creating the cluster:

  yc managed-kafka cluster create \
    ...
    --host-group-ids=<IDs_of_groups_of_dedicated_hosts>

  Alert

  You cannot edit this setting after you create a cluster. The use of dedicated hosts significantly affects cluster pricing.
Terraform

For more information about the provider resources, see the Terraform provider documentation.

If you change the configuration files, Terraform automatically detects which part of your configuration is already deployed and what should be added or removed.

If you don't have Terraform yet, install it and configure the Yandex Cloud provider.
To create a Managed Service for Apache Kafka® cluster:
- In the configuration file, describe the parameters of the resources you want to create:

  - Managed Service for Apache Kafka® cluster: Description of the cluster and its hosts. You can also configure the Apache Kafka® settings here, if required.
  - Network: Description of the cloud network where the cluster will be located. If you already have a suitable network, you don't need to describe it again.
  - Subnets: Description of the subnets to connect the cluster hosts to. If you already have suitable subnets, you don't need to describe them again.

  Here is an example of the configuration file structure:
  resource "yandex_mdb_kafka_cluster" "<cluster_name>" {
    environment         = "<environment>"
    name                = "<cluster_name>"
    network_id          = "<network_ID>"
    subnet_ids          = ["<list_of_subnet_IDs>"]
    security_group_ids  = ["<list_of_cluster_security_group_IDs>"]
    deletion_protection = <deletion_protection>

    config {
      assign_public_ip = "<public_access>"
      brokers_count    = <number_of_brokers>
      version          = "<version>"
      schema_registry  = "<data_schema_management>"

      kafka {
        resources {
          disk_size          = <storage_size_GB>
          disk_type_id       = "<disk_type>"
          resource_preset_id = "<host_class>"
        }
        kafka_config {}
      }

      zones = [ "<availability_zones>" ]
    }
  }

  resource "yandex_vpc_network" "<network_name>" {
    name = "<network_name>"
  }

  resource "yandex_vpc_subnet" "<subnet_name>" {
    name           = "<subnet_name>"
    zone           = "<availability_zone>"
    network_id     = "<network_ID>"
    v4_cidr_blocks = ["<range>"]
  }
  Where:

  - `environment`: Cluster environment, `PRESTABLE` or `PRODUCTION`.
  - `deletion_protection`: Cluster deletion protection, `true` or `false`.
  - `assign_public_ip`: Public access to the cluster, `true` or `false`.
  - `version`: Apache Kafka® version: 2.8, 3.0, 3.1, or 3.2.
  - `schema_registry`: Data schema management, `true` or `false`.

  Cluster deletion protection will not prevent a manual connection to a cluster to delete data.
  To set up the maintenance window (for disabled clusters as well), add the `maintenance_window` section to the cluster description:

  resource "yandex_mdb_kafka_cluster" "<cluster_name>" {
    ...
    maintenance_window {
      type = <maintenance_type>
      day  = <day_of_week>
      hour = <hour>
    }
    ...
  }
  Where:

  - `type`: Maintenance type. The possible values include:
    - `anytime`: Any time.
    - `weekly`: On a schedule.
  - `day`: Day of the week for the `weekly` type, in `DDD` format, e.g., `MON`.
  - `hour`: Hour of the day for the `weekly` type, in `HH` format, e.g., `21`.
- Make sure the settings are correct.

  - Using the command line, navigate to the folder that contains the up-to-date Terraform configuration files with an infrastructure plan.
  - Run this command:

    terraform validate

    If there are any errors in the configuration files, Terraform will point them out.
- Create a Managed Service for Apache Kafka® cluster.

  - Run this command to view the planned changes:

    terraform plan

    If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step; no resources are updated.
- If you are happy with the planned changes, apply them:

  - Run this command:

    terraform apply

  - Confirm the update of resources.
  - Wait for the operation to complete.

  After this, all required resources will be created in the specified folder, and the FQDNs of the Managed Service for Apache Kafka® cluster hosts will be displayed in the terminal. You can check the new resources and their configuration using the management console.
For more information, see the Terraform provider documentation.
Time limits

The Terraform provider limits the time for all Managed Service for Apache Kafka® cluster operations to complete to 60 minutes. Operations exceeding this timeout are interrupted.

Add the `timeouts` block to the cluster description, for example:
resource "yandex_mdb_kafka_cluster" "<cluster_name>" {
...
timeouts {
create = "1h30m" # 1 hour 30 minutes
update = "2h" # 2 hours
delete = "30m" # 30 minutes
}
}
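The timeout values are Go-style duration strings. As a quick sanity check against the 60-minute default, such a string can be converted to minutes; a minimal sketch handling only the `h`/`m` forms shown above (the helper name is made up):

```shell
#!/usr/bin/env bash
# Convert a Terraform duration like "1h30m", "2h", or "30m" to minutes.
to_minutes() {
  local d=$1 h=0 m=0
  case "$d" in
    *h*m) h=${d%h*}; m=${d#*h}; m=${m%m} ;;  # e.g. "1h30m"
    *h)   h=${d%h} ;;                         # e.g. "2h"
    *m)   m=${d%m} ;;                         # e.g. "30m"
  esac
  echo $(( h * 60 + m ))
}

to_minutes 1h30m   # prints 90
to_minutes 2h      # prints 120
to_minutes 30m     # prints 30
```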
To create a Managed Service for Apache Kafka® cluster, use the create REST API method for the Cluster resource or the ClusterService/Create gRPC API call and provide the following in the request:

- The ID of the folder to place the Managed Service for Apache Kafka® cluster in, in the `folderId` parameter.
- The Managed Service for Apache Kafka® cluster name, in the `name` parameter.
- The security group IDs, in the `securityGroupIds` parameter.
- The maintenance window settings (including those for disabled Managed Service for Apache Kafka® clusters), in the `maintenanceWindow` parameter.
- The Managed Service for Apache Kafka® cluster deletion protection setting, in the `deletionProtection` parameter.

  Cluster deletion protection will not prevent a manual connection to a cluster to delete data.
To manage data schemas using Managed Schema Registry, set the `configSpec.schemaRegistry` parameter to `true`. You cannot edit this setting after you create a Managed Service for Apache Kafka® cluster.

To allow access to the cluster from Yandex Data Transfer in Serverless mode, set the `configSpec.access.dataTransfer` parameter to `true`.

This will enable you to connect to Yandex Data Transfer running in Kubernetes via a special network. It will also speed up other operations, such as transfer launch and deactivation.

To create a Managed Service for Apache Kafka® cluster deployed on groups of dedicated hosts, provide a list of host group IDs in the `hostGroupIds` parameter.
Alert
You cannot edit this setting after you create a cluster. The use of dedicated hosts significantly affects cluster pricing.
Warning
If you specified security group IDs when creating a Managed Service for Apache Kafka® cluster, you may also need to configure security groups to connect to the cluster.
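Putting the parameters above together, a request body for the REST create method might be assembled as follows. This is only a sketch: it includes just the fields mentioned on this page with placeholder values, and the nested layout of `maintenanceWindow` is an assumption, not a confirmed schema.

```shell
#!/usr/bin/env bash
# Assemble a sketch of a request body for the Cluster create REST method.
# Field values are placeholders; the maintenanceWindow layout is assumed.
cat > create-cluster.json <<'EOF'
{
  "folderId": "<folder_ID>",
  "name": "<cluster_name>",
  "securityGroupIds": ["<security_group_ID>"],
  "deletionProtection": true,
  "configSpec": {
    "schemaRegistry": true,
    "access": { "dataTransfer": true }
  },
  "hostGroupIds": ["<dedicated_host_group_ID>"]
}
EOF
grep -c '"schemaRegistry": true' create-cluster.json   # prints 1
```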
Importing clusters to Terraform

Using import, you can bring existing clusters under Terraform management.

- In the Terraform configuration file, specify the cluster you want to import:

  resource "yandex_mdb_kafka_cluster" "<cluster_name>" {}

- Run this command to import the cluster:

  terraform import yandex_mdb_kafka_cluster.<cluster_name> <cluster_ID>

To learn more about importing clusters, see the Terraform provider documentation.
Examples
Creating a single-host cluster
Create a Managed Service for Apache Kafka® cluster with the following test characteristics:
- Name: `mykf`
- Environment: `production`
- Apache Kafka® version: 3.2
- Network: `default`
- Subnet ID: `b0rcctk2rvtr8efcch64`
- Security group: `enp6saqnq4ie244g67sb`
- Number of `s2.micro` hosts in the `ru-central1-a` availability zone: 1
- Number of brokers: 1
- Network SSD storage (`network-ssd`): 10 GB
- Public access: Allowed
- Protection against accidental Managed Service for Apache Kafka® cluster deletion: Enabled
Run the following command:
yc managed-kafka cluster create \
--name mykf \
--environment production \
--version 3.2 \
--network-name default \
--subnet-ids b0rcctk2rvtr8efcch64 \
--zone-ids ru-central1-a \
--brokers-count 1 \
--resource-preset s2.micro \
--disk-size 10 \
--disk-type network-ssd \
--assign-public-ip \
--security-group-ids enp6saqnq4ie244g67sb \
--deletion-protection=true
Create a Managed Service for Apache Kafka® cluster with the following test characteristics:

- Cloud ID: `b1gq90dgh25bebiu75o`
- Folder ID: `b1gia87mbaomkfvsleds`
- Name: `mykf`
- Environment: `PRODUCTION`
- Apache Kafka® version: 3.2
- Network and subnet: `mynet`, `mysubnet`
- Security group: `mykf-sg` (allows ingress connections to the Managed Service for Apache Kafka® cluster on port `9091`)
- Number of `s2.micro` hosts in the `ru-central1-a` availability zone: 1
- Number of brokers: 1
- Network SSD storage (`network-ssd`): 10 GB
- Public access: Allowed
- Protection against accidental Managed Service for Apache Kafka® cluster deletion: Enabled
The configuration file for this Managed Service for Apache Kafka® cluster is as follows:
resource "yandex_mdb_kafka_cluster" "mykf" {
environment = "PRODUCTION"
name = "mykf"
network_id = yandex_vpc_network.mynet.id
subnet_ids = [ yandex_vpc_subnet.mysubnet.id ]
security_group_ids = [ yandex_vpc_security_group.mykf-sg.id ]
deletion_protection = true
config {
assign_public_ip = true
brokers_count = 1
version = "3.2"
kafka {
resources {
disk_size = 10
disk_type_id = "network-ssd"
resource_preset_id = "s2.micro"
}
kafka_config {}
}
zones = [
"ru-central1-a"
]
}
}
resource "yandex_vpc_network" "mynet" {
name = "mynet"
}
resource "yandex_vpc_subnet" "mysubnet" {
name = "mysubnet"
zone = "ru-central1-a"
network_id = yandex_vpc_network.mynet.id
v4_cidr_blocks = ["10.5.0.0/24"]
}
resource "yandex_vpc_security_group" "mykf-sg" {
name = "mykf-sg"
network_id = yandex_vpc_network.mynet.id
ingress {
description = "Kafka"
port = 9091
protocol = "TCP"
v4_cidr_blocks = [ "0.0.0.0/0" ]
}
}