© 2023 Yandex.Cloud LLC

Creating an Apache Kafka® cluster

Written by
Yandex Cloud
  • How to create a Managed Service for Apache Kafka® cluster
  • Examples
    • Creating a single-host cluster

A cluster in Managed Service for Apache Kafka® is one or more broker hosts where topics and their partitions are located. Producers and consumers can work with these topics by connecting to cluster hosts.

Note

  • The number of broker hosts you can create together with an Apache Kafka® cluster depends on the selected disk type and host class.
  • Available disk types depend on the selected host class.

Warning

If you create a cluster with more than one host, three dedicated ZooKeeper hosts will be added to the cluster. For more information, see Relationship between resources in Managed Service for Apache Kafka®.
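As a quick sanity check on host counts, the total can be tallied as follows (the broker count here is a made-up example; the plus-three rule comes from the warning above):

```shell
# Total cluster hosts: brokers, plus three ZooKeeper hosts when there is more than one broker.
brokers=4   # assumed example value
total=$(( brokers + (brokers > 1 ? 3 : 0) ))
echo "$total hosts"   # → 7 hosts
```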

How to create a Managed Service for Apache Kafka® cluster

Prior to creating a cluster, calculate the minimum storage size for topics.
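This guide does not fix a sizing formula, but a rough lower bound can be sketched from the expected message size, write rate, retention period, and replication factor. All figures below are illustrative assumptions, not official guidance:

```shell
# Rough minimum storage estimate (illustrative model, not an official formula):
# 2 KiB average message × 1,000 messages/s × 1 day of retention × replication factor 3.
min_storage_gb=$(( 2048 * 1000 * 86400 * 3 / 1024 / 1024 / 1024 ))
echo "${min_storage_gb} GB"   # → 494 GB
```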

Management console
CLI
Terraform
API
  1. In the management console, go to the desired folder.

  2. In the list of services, select Managed Service for Apache Kafka®.

  3. Click Create cluster.

  4. Under Basic parameters:

    1. Enter a name for the cluster and, if necessary, a description. The cluster name must be unique within the folder.

    2. Select the environment where you want to create the cluster (you can't change the environment once the cluster is created):

      • PRODUCTION: For stable versions of your apps.
      • PRESTABLE: For testing, including the Managed Service for Apache Kafka® service itself. The Prestable environment is first updated with new features, improvements, and bug fixes. However, not every update ensures backward compatibility.
    3. Select the Apache Kafka® version.

    4. To manage topics via the Apache Kafka® Admin API:

      1. Enable Manage topics via the API.
      2. After creating a cluster, create an admin user.

      Alert

      Once you create a cluster, you cannot change the Manage topics via the API setting.

    5. To manage data schemas using Managed Schema Registry, enable the Data Schema Registry setting.

      Warning

      You cannot edit the Data Schema Registry setting after a cluster is created.

  5. Under Host class, select the platform, host type, and host class.

    The host class defines the technical capabilities of the virtual machines that Apache Kafka® brokers are deployed on. All available options are listed under Host classes.

    By changing the host class for a cluster, you also change the characteristics of all the existing instances.

  6. Under Storage:

    • Select the disk type.

      The selected type determines the increments in which you can change your storage size:

      • Local SSD storage for Intel Broadwell and Intel Cascade Lake: In increments of 100 GB.
      • Local SSD storage for Intel Ice Lake: In increments of 368 GB.
      • Non-replicated SSD storage: In increments of 93 GB.

      You can't change the disk type for Managed Service for Apache Kafka® clusters after creation.

    • Select the size of storage to be used for data.

  7. Under Network settings:

    1. Select one or more availability zones to host Apache Kafka® brokers. If you create a cluster with a single availability zone, you will not be able to increase the number of zones and brokers later.

    2. Select the network.

    3. Select subnets in each availability zone for this network. To create a new subnet, click Create new subnet next to the desired availability zone.

      Note

      For a cluster with multiple broker hosts, you need to specify subnets in each availability zone even if you plan to host brokers only in some of them. These subnets are required to host three ZooKeeper hosts — one in each availability zone. For more information, see Resource relationships in Managed Service for Apache Kafka®.

    4. Select security groups to control the cluster's network traffic.

    5. To access broker hosts from the internet, select Public access. In this case, you can only connect to them over an SSL connection. You cannot enable public access after the cluster is created. For more information, see Connecting to topics in an Apache Kafka® cluster.

  8. Under Hosts:

    1. Specify the number of Apache Kafka® broker hosts to be located in each of the selected availability zones.

      When choosing the number of hosts, keep in mind that:

      • The Apache Kafka® cluster hosts will be evenly deployed in the selected availability zones. Decide on the number of zones and hosts per zone based on the required fault tolerance model and cluster load.
      • Replication is possible if there are at least two hosts in the cluster.
      • If you selected local-ssd or network-ssd-nonreplicated under Storage, you need to add at least 3 hosts to the cluster.
      • Adding more than one host to the cluster automatically adds three ZooKeeper hosts.
    2. (Optional) Select groups of dedicated hosts to host the cluster on.

      Alert

      You cannot edit this setting after you create a cluster. The use of dedicated hosts significantly affects cluster pricing.

  9. If you specify two or more broker hosts, then under ZooKeeper host class, specify the characteristics of the ZooKeeper hosts to place in each of the selected availability zones.

  10. If necessary, configure additional cluster settings:

    • Maintenance window: Settings for the maintenance window:

      • To enable maintenance at any time, select arbitrary (default).
      • To specify the preferred maintenance start time, select by schedule and specify the desired day of the week and UTC hour. For example, you can choose a time when cluster load is lightest.

      Maintenance operations are carried out on both enabled and disabled clusters. They may include updating the Apache Kafka® version, applying patches, and so on.

    • Access from Data Transfer: Enable this option to allow access to the cluster from Yandex Data Transfer in Serverless mode.

      This will enable you to connect to Yandex Data Transfer running in Kubernetes via a special network. It will also cause other operations to run faster, such as transfer launch and deactivation.

    • Deletion protection: Manages cluster protection from accidental deletion by a user.

      Cluster deletion protection will not prevent a manual connection to a cluster to delete data.

  11. If necessary, configure the Apache Kafka® settings.

  12. Click Create cluster.

  13. Wait until the cluster is ready: its status on the Managed Service for Apache Kafka® dashboard changes to Running and its state to Alive. This may take some time.
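Since storage changes only in the increments listed in step 6, a desired capacity may need rounding up to the nearest valid size. A minimal sketch, assuming a 500 GB target on non-replicated SSD storage (93 GB increments per the list above):

```shell
# Round a desired capacity up to the nearest valid storage increment.
target_gb=500   # assumed requirement
step_gb=93      # increment for network-ssd-nonreplicated storage
size_gb=$(( (target_gb + step_gb - 1) / step_gb * step_gb ))
echo "${size_gb} GB"   # → 558 GB
```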

If you don't have the Yandex Cloud command line interface yet, install and initialize it.

The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.

  1. View a description of the CLI's create cluster command:

    yc managed-kafka cluster create --help
    
  2. Specify the cluster parameters in the create command (only some of the supported parameters are given in the example):

    yc managed-kafka cluster create \
      --name <cluster name> \
      --environment <environment: prestable or production> \
      --version <Apache Kafka® version: 2.8, 3.0, 3.1, or 3.2> \
      --network-name <network name> \
      --brokers-count <number of brokers in zone> \
      --resource-preset <host class> \
      --disk-type <network-hdd | network-ssd | local-ssd | network-ssd-nonreplicated> \
      --disk-size <storage size, GB> \
      --assign-public-ip <public access> \
      --security-group-ids <security group ID list> \
      --deletion-protection=<cluster deletion protection: true or false>
    

    Tip

    If necessary, you can also configure the Apache Kafka® settings here.

    Cluster deletion protection will not prevent a manual connection to a cluster to delete data.

  3. To set up a maintenance window (including windows for disabled clusters), pass the required value in the --maintenance-window parameter when creating your cluster:

    yc managed-kafka cluster create \
      ...
      --maintenance-window type=<maintenance type: anytime or weekly>,`
                          `day=<day of week for weekly>,`
                          `hour=<hour for weekly>
    

    Where:

    • type: Maintenance type:
      • anytime: Any time.
      • weekly: On a schedule.
    • day: Day of the week in DDD format for weekly. For example, MON.
    • hour: Hour in HH format for weekly. For example, 21.
  4. To manage topics via the Apache Kafka® Admin API:

    1. When creating a cluster, set the --unmanaged-topics parameter to true:

      yc managed-kafka cluster create \
        ...
        --unmanaged-topics true
      

      You cannot edit this setting after you create a cluster.

    2. After creating a cluster, create an admin user.

  5. To allow access to the cluster from Yandex Data Transfer in Serverless mode, pass the --datatransfer-access parameter.

    This will enable you to connect to Yandex Data Transfer running in Kubernetes via a special network. It will also cause other operations to run faster, such as transfer launch and deactivation.

  6. To create a cluster hosted on groups of dedicated hosts, specify the host IDs as a comma-separated list in the --host-group-ids parameter when creating the cluster:

    yc managed-kafka cluster create \
      ...
      --host-group-ids <IDs of dedicated host groups>
    

    Alert

    You cannot edit this setting after you create a cluster. The use of dedicated hosts significantly affects cluster pricing.
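For readability, the comma-separated --maintenance-window value from step 3 can be assembled in a shell variable first; the day and hour below are example values:

```shell
# Assemble the maintenance window argument for a weekly window on Monday at 21:00 UTC.
mw="type=weekly,day=MON,hour=21"
echo "$mw"   # → type=weekly,day=MON,hour=21
# Then pass it as: yc managed-kafka cluster create ... --maintenance-window "$mw"
```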

With Terraform, you can quickly create a cloud infrastructure in Yandex Cloud and manage it using configuration files, which store the infrastructure description in HashiCorp Configuration Language (HCL). Terraform and its providers are distributed under the Mozilla Public License.

For more information about the provider resources, see the documentation on the Terraform site or mirror site.

If you change the configuration files, Terraform automatically determines which part of your configuration is already deployed and what should be added or removed.

If you don't have Terraform, install it and configure the provider.

To create a cluster:

  1. In the configuration file, describe the parameters of resources that you want to create:

    • Apache Kafka® cluster: Description of a cluster and its hosts. If necessary, you can also configure the Apache Kafka® settings here.

    • Network: Description of the cloud network where a cluster will be located. If you already have a suitable network, you don't have to describe it again.

    • Subnets: Description of the subnets to connect the cluster hosts to. If you already have suitable subnets, you don't have to describe them again.

    Example configuration file structure:

    terraform {
      required_providers {
        yandex = {
         source = "yandex-cloud/yandex"
        }
      }
    }
    
    provider "yandex" {
      token     = "<OAuth or static key of service account>"
      cloud_id  = "<cloud ID>"
      folder_id = "<folder ID>"
      zone      = "<availability zone>"
    }
    
    resource "yandex_mdb_kafka_cluster" "<cluster name>" {
      environment         = "<environment: PRESTABLE or PRODUCTION>"
      name                = "<cluster name>"
      network_id          = "<network ID>"
      security_group_ids  = ["<list of cluster security group IDs>"]
      deletion_protection = <cluster deletion protection: true or false>
    
      config {
        assign_public_ip = "<cluster public access: true or false>"
        brokers_count    = <number of brokers>
        version          = "<Apache Kafka® version: 2.8, 3.0, 3.1, or 3.2>"
        schema_registry  = "<data schema management: true or false>"
        kafka {
          resources {
            disk_size          = <storage size, GB>
            disk_type_id       = "<disk type>"
            resource_preset_id = "<host class>"
          }
        }
    
        zones = [
          "<availability zones>"
        ]
      }
    }
    
    resource "yandex_vpc_network" "<network name>" {
      name = "<network name>"
    }
    
    resource "yandex_vpc_subnet" "<subnet name>" {
      name           = "<subnet name>"
      zone           = "<availability zone>"
      network_id     = "<network ID>"
      v4_cidr_blocks = ["<range>"]
    }
    

    Cluster deletion protection will not prevent a manual connection to a cluster to delete data.

    To set up the maintenance window (for example, for disabled clusters), add the maintenance_window section to the cluster description:

    resource "yandex_mdb_kafka_cluster" "<cluster name>" {
      ...
      maintenance_window {
        type = <maintenance type: ANYTIME or WEEKLY>
        day  = <day of the week for the WEEKLY type>
        hour = <hour of the day for the WEEKLY type>
      }
      ...
    }
    

    Where:

    • type: Maintenance type:
      • ANYTIME: At any time.
      • WEEKLY: On a schedule.
    • day: Day of the week for the WEEKLY type, in DDD format. For example, MON.
    • hour: Hour of the day for the WEEKLY type, in HH format. For example, 21.
  2. Make sure the settings are correct.

    1. Using the command line, navigate to the folder that contains the up-to-date Terraform configuration files with an infrastructure plan.

    2. Run the command:

      terraform validate
      

      If there are errors in the configuration files, Terraform will point to them.

  3. Create a cluster.

    1. Run the command to view planned changes:

      terraform plan
      

      If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.

    2. If you are happy with the planned changes, apply them:

      1. Run the command:

        terraform apply
        
      2. Confirm the update of resources.

      3. Wait for the operation to complete.

    After this, all the necessary resources will be created in the specified folder and the IP addresses of the VMs will be displayed in the terminal. You can check that the resources are there with the correct settings, using the management console.

For more information, see the Terraform provider documentation.

Warning

The Terraform provider limits the amount of time for all Managed Service for Apache Kafka® cluster operations to complete to 60 minutes.

Operations exceeding the set timeout are interrupted.

How do I change these limits?

Add the timeouts block to the cluster description, for example:

resource "yandex_mdb_kafka_cluster" "<cluster name>" {
  ...
  timeouts {
    create = "1h30m" # 1 hour 30 minutes
    update = "2h"    # 2 hours
    delete = "30m"   # 30 minutes
  }
}

Use the create API method and pass the following information in the request:

  • In the folderId parameter, the ID of the folder where the cluster should be placed.

  • The cluster name, in the name parameter.

  • Security group IDs, in the securityGroupIds parameter.

  • Settings for the maintenance window (including for disabled clusters) in the maintenanceWindow parameter.

  • Cluster deletion protection settings in the deletionProtection parameter.

    Cluster deletion protection will not prevent a manual connection to a cluster to delete data.

To manage topics via the Apache Kafka® Admin API:

  1. Pass true for the unmanagedTopics parameter. You cannot edit this setting after you create a cluster.
  2. After creating a cluster, create an admin user.

To manage data schemas using Managed Schema Registry, pass the true value for the configSpec.schemaRegistry parameter. You cannot edit this setting after you create a cluster.

To allow access to the cluster from Yandex Data Transfer in Serverless mode, pass true for the configSpec.access.dataTransfer parameter.

This will enable you to connect to Yandex Data Transfer running in Kubernetes via a special network. It will also cause other operations to run faster, such as transfer launch and deactivation.

To create a cluster deployed on groups of dedicated hosts, pass a list of host IDs in the hostGroupIds parameter.

Alert

You cannot edit this setting after you create a cluster. The use of dedicated hosts significantly affects cluster pricing.

Warning

If you specified security group IDs when creating a cluster, you may also need to configure security groups to connect to the cluster.
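Putting the parameters above together, a create request body might look as follows. This is a hedged sketch: folderId, name, securityGroupIds, maintenanceWindow, deletionProtection, unmanagedTopics, configSpec.schemaRegistry, and configSpec.access.dataTransfer are the parameters named in this article; the exact nesting and any other field names should be checked against the API reference:

```json
{
  "folderId": "<folder ID>",
  "name": "<cluster name>",
  "securityGroupIds": ["<security group ID>"],
  "deletionProtection": true,
  "maintenanceWindow": {
    "weeklyMaintenanceWindow": { "day": "MON", "hour": 21 }
  },
  "configSpec": {
    "unmanagedTopics": true,
    "schemaRegistry": true,
    "access": { "dataTransfer": true }
  }
}
```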

Examples

Creating a single-host cluster

CLI
Terraform

Create a Managed Service for Apache Kafka® cluster with test characteristics:

  • With the name mykf.
  • In the production environment.
  • With Apache Kafka® version 3.2.
  • In the default network.
  • In the security group enp6saqnq4ie244g67sb.
  • With one s2.micro host in the ru-central1-a availability zone.
  • With one broker.
  • With a network SSD storage (network-ssd) of 10 GB.
  • With public access.
  • With protection against accidental cluster deletion.

Run the following command:

yc managed-kafka cluster create \
  --name mykf \
  --environment production \
  --version 3.2 \
  --network-name default \
  --zone-ids ru-central1-a \
  --brokers-count 1 \
  --resource-preset s2.micro \
  --disk-size 10 \
  --disk-type network-ssd \
  --assign-public-ip \
  --security-group-ids enp6saqnq4ie244g67sb \
  --deletion-protection=true

Create a Managed Service for Apache Kafka® cluster with test characteristics:

  • In the cloud with the ID b1gq90dgh25bebiu75o.
  • In the folder with the ID b1gia87mbaomkfvsleds.
  • With the name mykf.
  • In the PRODUCTION environment.
  • With Apache Kafka® version 3.2.
  • In the new mynet network with the subnet mysubnet.
  • In the new security group mykf-sg, allowing connections to the cluster from the internet on port 9091.
  • With one s2.micro host in the ru-central1-a availability zone.
  • With one broker.
  • With a network SSD storage (network-ssd) of 10 GB.
  • With public access.
  • With protection against accidental cluster deletion.

The configuration file for the cluster looks like this:

terraform {
  required_providers {
    yandex = {
      source = "yandex-cloud/yandex"
    }
  }
}

provider "yandex" {
  token     = "<OAuth or static key of service account>"
  cloud_id  = "b1gq90dgh25bebiu75o"
  folder_id = "b1gia87mbaomkfvsleds"
  zone      = "ru-central1-a"
}

resource "yandex_mdb_kafka_cluster" "mykf" {
  environment         = "PRODUCTION"
  name                = "mykf"
  network_id          = yandex_vpc_network.mynet.id
  security_group_ids  = [ yandex_vpc_security_group.mykf-sg.id ]
  deletion_protection = true

  config {
    assign_public_ip = true
    brokers_count    = 1
    version          = "3.2"
    kafka {
      resources {
        disk_size          = 10
        disk_type_id       = "network-ssd"
        resource_preset_id = "s2.micro"
      }
    }

    zones = [
      "ru-central1-a"
    ]
  }
}

resource "yandex_vpc_network" "mynet" {
  name = "mynet"
}

resource "yandex_vpc_subnet" "mysubnet" {
  name           = "mysubnet"
  zone           = "ru-central1-a"
  network_id     = yandex_vpc_network.mynet.id
  v4_cidr_blocks = ["10.5.0.0/24"]
}

resource "yandex_vpc_security_group" "mykf-sg" {
  name       = "mykf-sg"
  network_id = yandex_vpc_network.mynet.id

  ingress {
    description    = "Kafka"
    port           = 9091
    protocol       = "TCP"
    v4_cidr_blocks = [ "0.0.0.0/0" ]
  }
}
