Creating clusters
A cluster in Managed Service for Apache Kafka® is one or more broker hosts where topics and their partitions are located. Producers and consumers can work with these topics by connecting to cluster hosts.
The number of broker hosts that you can create together with an Apache Kafka® cluster depends on the selected storage type:
- With local SSD or non-replicated SSD storage, you can create a cluster with three or more broker hosts (a minimum of three broker hosts is required for fault tolerance).
- With network HDD or network SSD storage, you can add any number of broker hosts within the current quota.
After creating a cluster, you can add extra broker hosts to it, provided the folder has enough available resources.
Warning
If you create a cluster with more than one host, three dedicated ZooKeeper hosts will be added to it. For more information, see Resource relationships in Managed Service for Apache Kafka®.
How to create a Managed Service for Apache Kafka® cluster
- Go to the folder page and select Managed Service for Apache Kafka®.
- Click Create cluster.
- Under Basic parameters:
  - Enter a name for the cluster and, if necessary, a description. The cluster name must be unique within the folder.
  - Select the environment where you want to create the cluster (you cannot change the environment once the cluster is created):
    - PRODUCTION: for stable versions of your apps.
    - PRESTABLE: for testing, including testing of Managed Service for Apache Kafka® itself. The Prestable environment receives new features, improvements, and bug fixes first, but not every update ensures backward compatibility.
  - Select the Apache Kafka® version.
  - To manage topics via the Apache Kafka® Admin API:
    Alert
    Once you create a cluster, you cannot change the Manage topics via the API setting.
    - Enable Manage topics via the API.
    - After creating the cluster, create an administrator account.
  - To manage data schemas using Managed Schema Registry, enable the Data Schema Registry setting.
    Warning
    You cannot edit the Data Schema Registry setting after the cluster is created.
- Under Host class, select the platform, host type, and host class.
  The host class defines the technical specifications of the VMs where the Apache Kafka® brokers will be deployed. All available options are listed in Host classes.
  By changing the host class for a cluster, you also change the characteristics of all its existing instances.
- Under Storage:
  - Select the storage type.
    The selected type determines the increment in which you can change the storage size:
    - Local SSD storage for Intel Cascade Lake: in increments of 100 GB.
    - Local SSD storage for Intel Ice Lake: in increments of 368 GB.
    - Non-replicated SSD storage: in increments of 93 GB.
  - Select the size of storage to be used for data.
Under Network settings:
-
Select one or more availability zones where the Apache Kafka® brokers will reside.
-
Select the network.
-
Select subnets in each availability zone for this network. To create a new subnet, click Create new subnet next to the desired availability zone.
Note
For a cluster with multiple broker hosts, you need to specify subnets in each availability zone even if you plan to host brokers only in some of them. These subnets are required to host three ZooKeeper hosts — one in each availability zone. For more information, see Resource relationships in Managed Service for Apache Kafka®.
-
Select security groups to control the cluster's network traffic.
-
To access broker hosts from the internet, select Public access. In this case, you can only connect to them over an SSL connection. For more information, see Connecting to topics in an Apache Kafka® cluster.
Warning
You can't request public access after creating a cluster.
-
-
Under Hosts:
-
Specify the number of Apache Kafka® broker hosts to be located in each of the selected availability zones.
When choosing the number of hosts, keep in mind that:
- The Apache Kafka® cluster hosts will be evenly deployed in the selected availability zones. Decide on the number of zones and hosts per zone based on the required fault tolerance model and cluster load.
- Replication is possible if there are at least two hosts in the cluster.
- If you selected
local-ssd
ornetwork-ssd-nonreplicated
under Storage, you need to add at least 3 hosts to the cluster. - Adding more than one host to the cluster automatically adds three ZooKeeper hosts.
- (Optional) Select groups of dedicated hosts to host the cluster on.
Alert
You cannot edit this setting after you create a cluster. The use of dedicated hosts significantly affects cluster pricing.
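For illustration, here is a minimal sketch of such a fault-tolerant layout using the CLI command covered in the next section: one broker in each of three availability zones (three brokers in total, which also satisfies the local-ssd minimum), with three ZooKeeper hosts added automatically. All values in angle brackets are placeholders.

yc managed-kafka cluster create \
  --name <cluster name> \
  --environment production \
  --version 2.8 \
  --network-name <network name> \
  --zone-ids ru-central1-a,ru-central1-b,ru-central1-c \
  --brokers-count 1 \
  --resource-preset <host class> \
  --disk-type local-ssd \
  --disk-size <storage size in GB>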
- If you specify two or more broker hosts, under ZooKeeper host class, specify the characteristics of the ZooKeeper hosts to be located in each of the selected availability zones.
- If necessary, configure additional cluster settings:
  - Deletion protection: Enable this option to protect the cluster from accidental deletion by users of your cloud.
    Deletion protection does not prevent a user from connecting to the cluster manually and deleting its data.
- If necessary, configure the Apache Kafka® settings.
- Click Create cluster.
- Wait until the cluster is ready: its status on the Managed Service for Apache Kafka® dashboard changes to Running and its state to Alive. This may take some time.
If you don't have the Yandex Cloud command line interface yet, install and initialize it.
By default, the CLI uses the folder specified in the CLI profile. You can specify a different folder using the --folder-name or --folder-id parameter.
- View the description of the CLI command for creating a cluster:
  yc managed-kafka cluster create --help
- Specify the cluster parameters in the create command (the example below lists only some of the supported parameters):
  yc managed-kafka cluster create \
    --name <cluster name> \
    --environment <prestable or production> \
    --version <version: 2.1, 2.6, or 2.8> \
    --network-name <network name> \
    --brokers-count <number of brokers in the zone> \
    --resource-preset <host class> \
    --disk-type <network-hdd | network-ssd | local-ssd | network-ssd-nonreplicated> \
    --disk-size <storage size in GB> \
    --assign-public-ip <public access> \
    --security-group-ids <list of security group IDs> \
    --deletion-protection=<protect cluster from deletion: true or false>
  Deletion protection does not prevent a user from connecting to the cluster manually and deleting its data.
  If necessary, you can also configure the Apache Kafka® settings here.
- To manage topics via the Apache Kafka® Admin API:
  Alert
  Once you create a cluster, you cannot change the Manage topics via the API setting.
  - When creating the cluster, set the --unmanaged-topics parameter to true:
    yc managed-kafka cluster create \
      ...
      --unmanaged-topics true
  - After creating the cluster, create an administrator account.
- To create a cluster hosted on groups of dedicated hosts, list their IDs separated by commas in the --host-group-ids parameter when creating the cluster:
  yc managed-kafka cluster create \
    ...
    --host-group-ids <IDs of dedicated host groups>
  Alert
  You cannot edit this setting after you create a cluster. The use of dedicated hosts significantly affects cluster pricing.
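To find the IDs to pass in --host-group-ids, you can first list your dedicated host groups. A minimal sketch, assuming dedicated host groups in your cloud are managed via the yc compute host-group command group:

yc compute host-group list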
With Terraform, you can quickly create a cloud infrastructure in Yandex Cloud and manage it using configuration files. These files store the infrastructure description in HashiCorp Configuration Language (HCL). Terraform and its providers are distributed under the Mozilla Public License.
For more information about the provider resources, see the documentation on the Terraform site or the mirror site.
If you change the configuration files, Terraform automatically determines which parts of your configuration are already deployed and what should be added or removed.
If you don't have Terraform, install it and configure the provider.
To create a cluster:
- In the configuration file, describe the parameters of the resources you want to create:
  - Apache Kafka® cluster: description of the cluster and its hosts. If necessary, you can also configure the Apache Kafka® settings here.
  - Network: description of the cloud network where the cluster will be located. If you already have a suitable network, you don't have to describe it again.
  - Subnets: description of the subnets to connect the cluster hosts to. If you already have suitable subnets, you don't have to describe them again.
  Example configuration file structure:

  terraform {
    required_providers {
      yandex = {
        source = "yandex-cloud/yandex"
      }
    }
  }

  provider "yandex" {
    token     = "<OAuth or static key of service account>"
    cloud_id  = "<cloud ID>"
    folder_id = "<folder ID>"
    zone      = "<availability zone>"
  }

  resource "yandex_mdb_kafka_cluster" "<cluster name>" {
    environment         = "<environment: PRESTABLE or PRODUCTION>"
    name                = "<cluster name>"
    network_id          = "<network ID>"
    security_group_ids  = ["<list of cluster security group IDs>"]
    deletion_protection = <protect cluster from deletion: true or false>

    config {
      assign_public_ip = "<public access to the cluster: true or false>"
      brokers_count    = <number of brokers>
      version          = "<Apache Kafka version: 2.1, 2.6, or 2.8>"
      schema_registry  = "<data schema management: true or false>"
      kafka {
        resources {
          disk_size          = <storage size in GB>
          disk_type_id       = "<storage type: network-ssd, network-hdd, network-ssd-nonreplicated, or local-ssd>"
          resource_preset_id = "<host class>"
        }
      }
      zones = [ "<availability zones>" ]
    }
  }

  resource "yandex_vpc_network" "<network name>" {
    name = "<network name>"
  }

  resource "yandex_vpc_subnet" "<subnet name>" {
    name           = "<subnet name>"
    zone           = "<availability zone>"
    network_id     = "<network ID>"
    v4_cidr_blocks = ["<range>"]
  }

  Deletion protection does not prevent a user from connecting to the cluster manually and deleting its data.
- Make sure the settings are correct:
  - Using the command line, navigate to the folder that contains the up-to-date Terraform configuration files with the infrastructure plan.
  - Run the command:
    terraform validate
    If there are errors in the configuration files, Terraform will point them out.
- Create a cluster:
  - Run the command to view the planned changes:
    terraform plan
    If the resource configuration descriptions are correct, the terminal will display a list of the resources to create or modify and their parameters. This is a test step; no resources are actually changed.
  - If you are happy with the planned changes, apply them:
    - Run the command:
      terraform apply
    - Confirm the update of resources.
    - Wait for the operation to complete.
  After this, all the required resources will be created in the specified folder, and the terminal will display the IP addresses of the VMs. You can check resource availability and settings in the management console.
- For more information, see the Terraform provider documentation.
Use the create API method and pass the following in the request:
- The ID of the folder to host the cluster, in the folderId parameter.
- The cluster name, in the name parameter.
- The security group IDs, in the securityGroupIds parameter.
- The cluster deletion protection setting, in the deletionProtection parameter.
  Deletion protection does not prevent a user from connecting to the cluster manually and deleting its data.
To manage topics via the Apache Kafka® Admin API:
Alert
Once you create a cluster, you cannot change the Manage topics via the API setting.
- Pass true in the unmanagedTopics parameter.
- After creating the cluster, create an administrator account.
To manage data schemas using Managed Schema Registry, pass true in the configSpec.schemaRegistry parameter.
Warning
You cannot edit the Data Schema Registry setting after the cluster is created.
To create a cluster hosted on groups of dedicated hosts, pass their IDs in the hostGroupIds parameter.
Alert
You cannot edit this setting after you create a cluster. The use of dedicated hosts significantly affects cluster pricing.
Warning
If you specified security group IDs when creating a cluster, you may also need to re-configure security groups to connect to the cluster.
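For reference, here is a minimal sketch of such a request over the REST API. The parameter names follow this section, but treat the endpoint URL, the exact body shape (for example, that unmanagedTopics lives under configSpec and that diskSize is given in bytes), and the IAM token handling as assumptions to verify against the API reference:

curl -X POST \
  -H "Authorization: Bearer <IAM token>" \
  -H "Content-Type: application/json" \
  -d '{
        "folderId": "<folder ID>",
        "name": "<cluster name>",
        "environment": "PRODUCTION",
        "configSpec": {
          "version": "2.8",
          "brokersCount": 1,
          "zoneId": ["<availability zone>"],
          "unmanagedTopics": true,
          "schemaRegistry": true,
          "kafka": {
            "resources": {
              "resourcePresetId": "<host class>",
              "diskTypeId": "network-ssd",
              "diskSize": "<storage size in bytes>"
            }
          }
        },
        "subnetId": ["<subnet IDs>"],
        "securityGroupIds": ["<security group IDs>"],
        "deletionProtection": true
      }' \
  https://mdb.api.cloud.yandex.net/managed-kafka/v1/clusters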
Examples
Creating a single-host cluster
Let's say we need to create a Managed Service for Apache Kafka® cluster with the following characteristics:
- With the name mykf.
- In the production environment.
- With Apache Kafka® version 2.6.
- In the default network.
- In the enp6saqnq4ie244g67sb security group.
- With one s2.micro host in the ru-central1-c availability zone.
- With one broker.
- With 10 GB of network SSD storage (network-ssd).
- With public access.
- With protection against accidental cluster deletion.
Run the command:
yc managed-kafka cluster create \
--name mykf \
--environment production \
--version 2.6 \
--network-name default \
--zone-ids ru-central1-c \
--brokers-count 1 \
--resource-preset s2.micro \
--disk-size 10 \
--disk-type network-ssd \
--assign-public-ip \
--security-group-ids enp6saqnq4ie244g67sb \
--deletion-protection=true
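When the operation completes, you can verify the result from the CLI by fetching the cluster by name. A small sketch; it assumes your CLI profile points to the folder where mykf was created:

yc managed-kafka cluster get mykf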
Let's say we need to create a Managed Service for Apache Kafka® cluster with the following characteristics:
- In the cloud with the ID b1gq90dgh25bebiu75o.
- In the folder with the ID b1gia87mbaomkfvsleds.
- With the name mykf.
- In the PRODUCTION environment.
- With Apache Kafka® version 2.6.
- In the new mynet network with the mysubnet subnet.
- In the new mykf-sg security group, allowing connections to the cluster from the internet via port 9091.
- With one s2.micro host in the ru-central1-c availability zone.
- With one broker.
- With 10 GB of network SSD storage (network-ssd).
- With public access.
- With protection against accidental cluster deletion.
The configuration file for the cluster looks like this:
terraform {
required_providers {
yandex = {
source = "yandex-cloud/yandex"
}
}
}
provider "yandex" {
token = "<OAuth or static key of service account>"
cloud_id = "b1gq90dgh25bebiu75o"
folder_id = "b1gia87mbaomkfvsleds"
zone = "ru-central1-c"
}
resource "yandex_mdb_kafka_cluster" "mykf" {
environment = "PRODUCTION"
name = "mykf"
network_id = yandex_vpc_network.mynet.id
security_group_ids = [ yandex_vpc_security_group.mykf-sg.id ]
deletion_protection = true
config {
assign_public_ip = true
brokers_count = 1
version = "2.6"
kafka {
resources {
disk_size = 10
disk_type_id = "network-ssd"
resource_preset_id = "s2.micro"
}
}
zones = [
"ru-central1-c"
]
}
}
resource "yandex_vpc_network" "mynet" {
name = "mynet"
}
resource "yandex_vpc_subnet" "mysubnet" {
name = "mysubnet"
zone = "ru-central1-c"
network_id = yandex_vpc_network.mynet.id
v4_cidr_blocks = ["10.5.0.0/24"]
}
resource "yandex_vpc_security_group" "mykf-sg" {
name = "mykf-sg"
network_id = yandex_vpc_network.mynet.id
ingress {
description = "Kafka"
port = 9091
protocol = "TCP"
v4_cidr_blocks = [ "0.0.0.0/0" ]
}
}