© 2022 Yandex.Cloud LLC
Yandex Managed Service for Apache Kafka®

Creating clusters

Written by Yandex.Cloud
  • How to create a Managed Service for Apache Kafka® cluster
  • Examples
    • Creating a single-host cluster

A cluster in Managed Service for Apache Kafka® is one or more broker hosts where topics and their partitions are located. Producers and consumers can work with these topics by connecting to cluster hosts.

The number of broker hosts that can be created together with an Apache Kafka® cluster depends on the selected type of storage:

  • With local SSD or non-replicated SSD storage, you can create a cluster with three or more broker hosts (a minimum of three broker hosts is required for fault tolerance).
  • With network HDD or network SSD storage, you can add any number of broker hosts within the current quota.

After creating a cluster, you can add extra broker hosts to it if there are enough available folder resources.

Warning

If you create a cluster with more than one host, three dedicated ZooKeeper hosts will be added to the cluster. For more information, see Relationship between resources in Managed Service for Apache Kafka®.

How to create a Managed Service for Apache Kafka® cluster

Management console
CLI
Terraform
API
  1. Go to the folder page and select Managed Service for Apache Kafka®.

  2. Click Create cluster.

  3. Under Basic parameters:

    1. Enter a name for the cluster and, if necessary, a description. The cluster name must be unique within the folder.

    2. Select the environment where you want to create the cluster (you can't change the environment once the cluster is created):

      • PRODUCTION: For stable versions of your apps.
      • PRESTABLE: For testing, including the Managed Service for Apache Kafka® service itself. The Prestable environment is first updated with new features, improvements, and bug fixes. However, not every update ensures backward compatibility.
    3. Select the Apache Kafka® version.

    4. To manage topics via the Apache Kafka® Admin API:

      Alert

      Once you create a cluster, you cannot change the Manage topics via the API setting.

      1. Enable Manage topics via the API.
      2. After creating a cluster, create an administrator account.
    5. To manage data schemas using Managed Schema Registry, enable the Data Schema Registry setting.

      Warning

      You cannot edit the Data Schema Registry setting after a cluster is created.

  4. Under Host class, select the platform, host type, and host class.

    The host class defines the technical specifications of the VMs where the Apache Kafka® brokers will be deployed. All available options are listed in Host classes.

    By changing the host class for a cluster, you also change the characteristics of all the existing instances.

  5. Under Storage:

    • Select the type of storage.

      The selected type determines the increments in which you can change the storage size:

      • Local SSD storage for Intel Cascade Lake: In increments of 100 GB.
      • Local SSD storage for Intel Ice Lake: In increments of 368 GB.
      • Non-replicated SSD storage: In increments of 93 GB.
    • Select the size of storage to be used for data.

  6. Under Network settings:

    1. Select one or more availability zones where the Apache Kafka® brokers will reside.

    2. Select the network.

    3. Select subnets in each availability zone for this network. To create a new subnet, click Create new subnet next to the desired availability zone.

      Note

      For a cluster with multiple broker hosts, you need to specify subnets in each availability zone even if you plan to host brokers only in some of them. These subnets are required to host three ZooKeeper hosts — one in each availability zone. For more information, see Resource relationships in Managed Service for Apache Kafka®.

    4. Select security groups to control the cluster's network traffic.

    5. To access broker hosts from the internet, select Public access. In this case, you can only connect to them over an SSL connection. For more information, see Connecting to topics in an Apache Kafka® cluster.

      Warning

      You can't request public access after creating a cluster.

  7. Under Hosts:

    1. Specify the number of Apache Kafka® broker hosts to be located in each of the selected availability zones.

      When choosing the number of hosts, keep in mind that:

      • The Apache Kafka® cluster hosts will be evenly deployed in the selected availability zones. Decide on the number of zones and hosts per zone based on the required fault tolerance model and cluster load.
      • Replication is possible if there are at least two hosts in the cluster.
      • If you selected local-ssd or network-ssd-nonreplicated under Storage, you need to add at least three hosts to the cluster.
      • Adding more than one host to the cluster automatically adds three ZooKeeper hosts.
    2. (Optional) Select the groups of dedicated hosts to deploy the cluster on.

      Alert

      You cannot edit this setting after you create a cluster. The use of dedicated hosts significantly affects cluster pricing.

  8. If you specify two or more broker hosts, then under Host class ZooKeeper, specify the characteristics of the ZooKeeper hosts to be located in each of the selected availability zones.

  9. If necessary, configure additional cluster settings:

    Deletion protection: Enable this option to protect a cluster from accidental deletion by your cloud's users.

    Cluster deletion protection will not prevent a manual connection to a cluster to delete data.

  10. If necessary, configure the Apache Kafka® settings.

  11. Click Create cluster.

  12. Wait until the cluster is ready: its status on the Managed Service for Apache Kafka® dashboard changes to Running and its state to Alive. This may take some time.
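You can also check the cluster state from the terminal; a minimal sketch assuming the Yandex Cloud CLI is installed and the cluster is named my-cluster (a hypothetical name):

```shell
# Hypothetical cluster name; substitute your own.
# Once the cluster is ready, the output should report status RUNNING
# and health ALIVE.
yc managed-kafka cluster get my-cluster --format json
```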

If you don't have the Yandex Cloud command line interface yet, install and initialize it.

The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.

  1. View a description of the CLI create cluster command:

    yc managed-kafka cluster create --help
    
  2. Specify the cluster parameters in the create command (only some of the supported parameters are given in the example):

    yc managed-kafka cluster create \
       --name <cluster name> \
       --environment <prestable or production> \
       --version <version: 2.1, 2.6, or 2.8> \
       --network-name <network name> \
       --brokers-count <number of brokers in the zone> \
       --resource-preset <host class> \
       --disk-type <network-hdd | network-ssd | local-ssd | network-ssd-nonreplicated> \
       --disk-size <storage size in GB> \
       --assign-public-ip <public access> \
       --security-group-ids <list of security group IDs> \
       --deletion-protection=<protect cluster from deletion: true or false>
    

    Cluster deletion protection will not prevent a manual connection to a cluster to delete data.

    If necessary, you can also configure the Apache Kafka® settings here.

  3. To manage topics via the Apache Kafka® Admin API:

    Alert

    Once you create a cluster, you cannot change the Manage topics via the API setting.

    1. When creating a cluster, set the --unmanaged-topics parameter to true:

      yc managed-kafka cluster create \
         ...
         --unmanaged-topics true
      
    2. After creating a cluster, create an administrator account.
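    As a sketch, the administrator account can also be created with the CLI; the account name admin and the --permission syntax below are assumptions, so verify the exact flags with yc managed-kafka user create --help:

    ```shell
    # Sketch: create an admin account for managing topics via the Admin API.
    # The --permission syntax is an assumption; check --help before use.
    yc managed-kafka user create admin \
       --cluster-name <cluster name> \
       --password <password> \
       --permission topic=*,role=admin
    ```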

  4. To create a cluster hosted using groups of dedicated hosts, list their IDs separated by commas in the --host-group-ids parameter when creating a cluster:

    yc managed-kafka cluster create \
       ...
       --host-group-ids <IDs of dedicated host groups>
    

    Alert

    You cannot edit this setting after you create a cluster. The use of dedicated hosts significantly affects cluster pricing.

With Terraform, you can quickly create a cloud infrastructure in Yandex Cloud and manage it using configuration files, which store the infrastructure description in HashiCorp Configuration Language (HCL). Terraform and its providers are distributed under the Mozilla Public License.

For more information about the provider resources, see the documentation on the Terraform site or mirror site.

If you change the configuration files, Terraform automatically determines which part of your configuration is already deployed and what should be added or removed.

If you don't have Terraform, install it and configure the provider.

To create a cluster:

  1. In the configuration file, describe the parameters of resources that you want to create:

    • Apache Kafka® cluster: Description of a cluster and its hosts. If necessary, you can also configure the Apache Kafka® settings here.

    • Network: Description of the cloud network where a cluster will be located. If you already have a suitable network, you don't have to describe it again.

    • Subnets: Description of the subnets to connect the cluster hosts to. If you already have suitable subnets, you don't have to describe them again.

    Example configuration file structure:

    terraform {
      required_providers {
        yandex = {
          source = "yandex-cloud/yandex"
        }
      }
    }
    
    provider "yandex" {
      token     = "<OAuth or static key of service account>"
      cloud_id  = "<cloud ID>"
      folder_id = "<folder ID>"
      zone      = "<availability zone>"
    }
    
    resource "yandex_mdb_kafka_cluster" "<cluster name>" {
      environment         = "<environment: PRESTABLE or PRODUCTION>"
      name                = "<cluster name>"
      network_id          = "<network ID>"
      security_group_ids  = ["<list of cluster security group IDs>"]
      deletion_protection = <protect cluster from deletion: true or false>
    
      config {
        assign_public_ip = "<public access to the cluster: true or false>"
        brokers_count    = <number of brokers>
        version          = "<Apache Kafka version: 2.1, 2.6, or 2.8>"
        schema_registry  = "<data schema management: true or false>"
        kafka {
          resources {
            disk_size          = <storage size in GB>
            disk_type_id       = "<storage type: network-ssd, network-hdd, network-ssd-nonreplicated, or local-ssd>"
            resource_preset_id = "<host class>"
          }
        }
    
        zones = [
          "<availability zones>"
        ]
      }
    }
    
    resource "yandex_vpc_network" "<network name>" {
      name = "<network name>"
    }
    
    resource "yandex_vpc_subnet" "<subnet name>" {
      name           = "<subnet name>"
      zone           = "<availability zone>"
      network_id     = "<network ID>"
      v4_cidr_blocks = ["<range>"]
    }
    

    Cluster deletion protection will not prevent a manual connection to a cluster to delete data.

  2. Make sure the settings are correct.

    1. Using the command line, navigate to the folder that contains the up-to-date Terraform configuration files with an infrastructure plan.

    2. Run the command:

      terraform validate
      

      If there are errors in the configuration files, Terraform will point to them.

  3. Create a cluster.

    1. Run the command to view planned changes:

      terraform plan
      

      If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.

    2. If you are happy with the planned changes, apply them:

      1. Run the command:

        terraform apply
        
      2. Confirm the update of resources.

      3. Wait for the operation to complete.

    After this, all the necessary resources will be created in the specified folder and the IP addresses of the VMs will be displayed in the terminal. You can check resource availability and their settings in the management console.

For more information, see the Terraform provider documentation.

Use the create API method and pass the following information in the request:

  • The ID of the folder where the cluster should be placed in the folderId parameter.

  • The cluster name in the name parameter.

  • Security group IDs in the parameter securityGroupIds.

  • Cluster deletion protection settings in the deletionProtection parameter.

    Cluster deletion protection will not prevent a manual connection to a cluster to delete data.
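The request can be sketched as an HTTP call. The endpoint below follows the service's REST reference, and the body fields mirror the parameters listed above; note that a complete request also requires a configSpec block (Apache Kafka® version, availability zones, and host resources), which is elided here:

```shell
# Sketch of the Cluster.create REST call; requires a valid IAM token.
# Placeholders must be replaced, and configSpec added, for a real request.
curl -X POST \
  -H "Authorization: Bearer <IAM token>" \
  -H "Content-Type: application/json" \
  -d '{
        "folderId": "<folder ID>",
        "name": "<cluster name>",
        "securityGroupIds": ["<security group ID>"],
        "deletionProtection": true
      }' \
  "https://mdb.api.cloud.yandex.net/managed-kafka/v1/clusters"
```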

To manage topics via the Apache Kafka® Admin API:

Alert

Once you create a cluster, you cannot change the Manage topics via the API setting.

  1. Pass true for the unmanagedTopics parameter.
  2. After creating a cluster, create an administrator account.

To manage data schemas using Managed Schema Registry, pass the true value for the configSpec.schemaRegistry parameter.

Warning

You cannot edit the Data Schema Registry setting after a cluster is created.

To create a cluster hosted using groups of dedicated hosts, pass their IDs in the hostGroupIds parameter.

Alert

You cannot edit this setting after you create a cluster. The use of dedicated hosts significantly affects cluster pricing.

Warning

If you specified security group IDs when creating a cluster, you may also need to re-configure security groups to connect to the cluster.

Examples

Creating a single-host cluster

CLI
Terraform

Let's say we need to create a Managed Service for Apache Kafka® cluster with the following characteristics:

  • With the name mykf.
  • In the production environment.
  • With Apache Kafka® version 2.6.
  • In the default network.
  • In the security group enp6saqnq4ie244g67sb.
  • With one s2.micro host in the ru-central1-c availability zone.
  • With one broker.
  • With 10 GB of SSD network storage (network-ssd).
  • With public access.
  • With accidental cluster deletion protection.

Run the command:

yc managed-kafka cluster create \
   --name mykf \
   --environment production \
   --version 2.6 \
   --network-name default \
   --zone-ids ru-central1-c \
   --brokers-count 1 \
   --resource-preset s2.micro \
   --disk-size 10 \
   --disk-type network-ssd \
   --assign-public-ip \
   --security-group-ids enp6saqnq4ie244g67sb \
   --deletion-protection=true
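After the command completes, you can confirm that the new cluster appears in the folder (a sketch; the output columns may vary by CLI version):

```shell
# List Managed Service for Apache Kafka® clusters in the current folder.
yc managed-kafka cluster list
```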

Let's say we need to create a Managed Service for Apache Kafka® cluster with the following characteristics:

  • In the cloud with the ID b1gq90dgh25bebiu75o.
  • In the folder with the ID b1gia87mbaomkfvsleds.
  • With the name mykf.
  • In the PRODUCTION environment.
  • With Apache Kafka® version 2.6.
  • In the new mynet network with the subnet mysubnet.
  • In the new security group mykf-sg allowing connection to the cluster from the Internet via port 9091.
  • With one s2.micro host in the ru-central1-c availability zone.
  • With one broker.
  • With 10 GB of SSD network storage (network-ssd).
  • With public access.
  • With accidental cluster deletion protection.

The configuration file for the cluster looks like this:

terraform {
  required_providers {
    yandex = {
      source = "yandex-cloud/yandex"
    }
  }
}

provider "yandex" {
  token     = "<OAuth or static key of service account>"
  cloud_id  = "b1gq90dgh25bebiu75o"
  folder_id = "b1gia87mbaomkfvsleds"
  zone      = "ru-central1-c"
}

resource "yandex_mdb_kafka_cluster" "mykf" {
  environment         = "PRODUCTION"
  name                = "mykf"
  network_id          = yandex_vpc_network.mynet.id
  security_group_ids  = [ yandex_vpc_security_group.mykf-sg.id ]
  deletion_protection = true

  config {
    assign_public_ip = true
    brokers_count    = 1
    version          = "2.6"
    kafka {
      resources {
        disk_size          = 10
        disk_type_id       = "network-ssd"
        resource_preset_id = "s2.micro"
      }
    }

    zones = [
      "ru-central1-c"
    ]
  }
}

resource "yandex_vpc_network" "mynet" {
  name = "mynet"
}

resource "yandex_vpc_subnet" "mysubnet" {
  name           = "mysubnet"
  zone           = "ru-central1-c"
  network_id     = yandex_vpc_network.mynet.id
  v4_cidr_blocks = ["10.5.0.0/24"]
}

resource "yandex_vpc_security_group" "mykf-sg" {
  name       = "mykf-sg"
  network_id = yandex_vpc_network.mynet.id

  ingress {
    description    = "Kafka"
    port           = 9091
    protocol       = "TCP"
    v4_cidr_blocks = [ "0.0.0.0/0" ]
  }
}
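The configuration can then be applied with the usual Terraform workflow; terraform init is needed on the first run to download the provider:

```shell
# Standard workflow for the configuration above.
terraform init      # download the yandex-cloud/yandex provider
terraform validate  # check the configuration files for errors
terraform plan      # preview the resources to be created
terraform apply     # create the cluster, network, subnet, and security group
```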

