Creating ClickHouse clusters

A ClickHouse cluster is one or more database hosts between which replication can be configured.

Important

When creating a ClickHouse cluster with two or more hosts, Managed Service for ClickHouse automatically creates a cluster of three ZooKeeper hosts for managing replication and fault tolerance. These hosts count toward the cloud's resource quotas and are included in the cost of the cluster. Read more about replication for ClickHouse.

The number of hosts that can be created with a ClickHouse cluster depends on the storage option selected:

  • When using network drives, you can request any number of hosts (from one to the current quota limit).

  • When using local SSD storage, the cluster must be created with at least two replicas (a minimum of two replicas is required to ensure fault tolerance). If the folder still has enough resources after the cluster is created, you can add more replicas.
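
The host-count rule above can be checked before a cluster is requested. The following is a minimal shell sketch, not part of the service tooling. The function names min_hosts_for_disk_type and check_host_count are hypothetical, the sketch assumes that "SSDs" in the rule means local-ssd storage, and the upper limit is not modeled because it depends on the folder quota.

```shell
#!/bin/sh
# Hypothetical helper: minimum number of ClickHouse hosts for a storage type.
# Assumption: local SSD storage needs at least two replicas,
# while network storage allows a single host.
min_hosts_for_disk_type() {
    case "$1" in
        local-ssd) echo 2 ;;
        network-hdd|network-ssd) echo 1 ;;
        *) echo "unknown disk type: $1" >&2; return 1 ;;
    esac
}

# Returns 0 when the requested host count satisfies the minimum.
check_host_count() {
    disk_type=$1
    hosts=$2
    min=$(min_hosts_for_disk_type "$disk_type") || return 1
    [ "$hosts" -ge "$min" ]
}

if check_host_count local-ssd 1; then
    echo "host count ok"
else
    echo "local-ssd needs at least 2 hosts"
fi
```

The check only covers the lower bound. Whether the requested number of hosts fits the quota still has to be confirmed in the folder itself.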

  1. In the management console, select the folder where you want to create a DB cluster.

  2. Select Managed Service for ClickHouse.

  3. Click Create cluster.

  4. Enter the cluster name in the Cluster name field. The cluster name must be unique within the folder.

  5. Select the environment where you want to create the cluster (you can't change the environment once the cluster is created):

    • production: For stable versions of your apps.
    • prestable: For testing, including the Managed Service for ClickHouse service itself. The prestable environment is updated more often, which means that known problems are fixed sooner, but this may cause backward incompatible changes.
  6. Select the host class that defines the technical specifications of the VMs where the DB hosts will be deployed. All available options are listed in Host classes. When you change the host class for the cluster, the characteristics of all existing instances change, too.

  7. Under Storage size:

    • Select the type of storage, either a more flexible network type (network-hdd or network-ssd) or faster local SSD storage (local-ssd). The size of the local storage can only be changed in 100 GB increments.
    • Select the size to be used for data and backups. For more information about how backups take up storage space, see Backups.
  8. Under Database, specify the DB attributes:

    • DB name.
    • Username.
    • User password. At least 8 characters.
  9. Under Hosts, specify the parameters for the database hosts created with the cluster. If you use local SSD storage, you must add at least two hosts. To change the settings of an added host, hover over the host line and click the icon on that line.

  10. Click Create cluster.
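
Two of the form rules in the steps above (local storage sized in 100 GB increments, and a password of at least 8 characters) are easy to verify in a script before any request is sent. This is a sketch under those stated rules only. The function names valid_disk_size and valid_password are hypothetical and are not part of any Yandex.Cloud tool.

```shell
#!/bin/sh
# Hypothetical pre-flight checks that mirror the console form rules.

# Accepts a storage type and a size in GB.
# For local-ssd the size must be a multiple of 100.
valid_disk_size() {
    disk_type=$1
    size_gb=$2
    case "$size_gb" in
        ''|0*|*[!0-9]*) return 1 ;;   # empty, leading zero, or not a number
    esac
    if [ "$disk_type" = "local-ssd" ]; then
        [ $((size_gb % 100)) -eq 0 ]
    fi
}

# The user password must be at least 8 characters long.
valid_password() {
    [ "${#1}" -ge 8 ]
}

valid_disk_size local-ssd 200 && echo "disk size ok"
valid_password "user1user1" && echo "password ok"
```

The service may enforce further limits (for example, minimum and maximum sizes per host class) that this sketch does not know about.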

If you don't have the Yandex.Cloud command line interface yet, install it.

The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.

To create a cluster:

  1. Check whether the folder has any subnets for the cluster hosts:

    $ yc vpc subnet list
    

    If there are no subnets in the folder, create the necessary subnets in VPC.

  2. View a description of the CLI's create cluster command:

    $ yc managed-clickhouse cluster create --help
    
  3. Specify the cluster parameters in the create command (the example shows only mandatory flags):

    $ yc managed-clickhouse cluster create \
       --cluster-name <cluster name> \
       --environment <prestable or production> \
       --network-name <network name> \
       --host type=<clickhouse or zookeeper>,zone-id=<availability zone>,subnet-id=<subnet ID> \
       --clickhouse-resource-preset <host class> \
       --clickhouse-disk-type <network-hdd | network-ssd | local-ssd> \
       --clickhouse-disk-size <storage size in GB> \
       --user name=<username>,password=<user password> \
       --database name=<DB name>
    

    The subnet ID subnet-id should be specified if the selected availability zone contains two or more subnets.
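
Because subnet-id is only needed in some cases, a script that assembles the command may want to add it conditionally. The sketch below builds the value of a single --host flag. The helper name host_spec is hypothetical, and only the type, zone-id, and subnet-id keys shown in the command above are used.

```shell
#!/bin/sh
# Hypothetical helper: builds the value for one --host flag.
# Arguments: host type, availability zone, optional subnet ID.
# subnet-id is appended only when a subnet ID is supplied.
host_spec() {
    host_type=$1
    zone=$2
    subnet=${3:-}
    spec="type=${host_type},zone-id=${zone}"
    if [ -n "$subnet" ]; then
        spec="${spec},subnet-id=${subnet}"
    fi
    printf '%s\n' "$spec"
}

# Zone with a single subnet: subnet-id is omitted.
host_spec clickhouse ru-central1-c
# Zone with several subnets: subnet-id is given explicitly.
host_spec clickhouse ru-central1-c b0rcctk2rvtr8efcch64
```

The result can be passed to the create command as --host "$(host_spec clickhouse ru-central1-c)".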

With Terraform, you can quickly create a cloud infrastructure in Yandex.Cloud. The infrastructure components are identified through configuration files that specify the required cloud resources and their parameters.

If you don't have Terraform yet, install it and configure the provider.

To create a cluster:

  1. In the configuration file, describe the parameters of resources that you want to create:

    • Database cluster: Description of the cluster and its hosts.
    • Network: Description of the cloud network where the cluster will be located. If you already have a suitable network, you don't need to describe it again.
    • Subnets: Description of the subnets to connect the cluster hosts to. If you already have suitable subnets, you don't need to describe them again.

    Sample configuration file structure:

    resource "yandex_mdb_clickhouse_cluster" "<cluster name>" {
      name        = "<cluster name>"
      environment = "<environment>"
      network_id  = "<network ID>"
    
      clickhouse {
        resources {
          resource_preset_id = "<host class>"
          disk_type_id       = "<storage type>"
          disk_size          = "<storage size, GB>"
        }
      }
    
      database {
        name = "<DB name>"
      }
    
      user {
        name     = "<DB username>"
        password = "<password>"
        permission {
          database_name = "<name of the DB where the user is created>"
        }
      }
    
      host {
        type      = "CLICKHOUSE"
        zone      = "<availability zone>"
        subnet_id = "<subnet ID>"
      }
    }
    
    resource "yandex_vpc_network" "<network name>" {}
    
    resource "yandex_vpc_subnet" "<subnet name>" {
      zone           = "<availability zone>"
      network_id     = "<network ID>"
      v4_cidr_blocks = ["<range>"]
    }
    

    For more information about resources that you can create using Terraform, see the provider's documentation.

  2. Make sure that the configuration files are correct.

    1. In the command line, go to the folder where you created the configuration file.

    2. Run the check using the command:

      terraform plan
      

    If the configuration is correct, the terminal displays a list of the resources to be created and their parameters. If the configuration contains errors, Terraform points them out. This is a verification step: no resources are created.

  3. Create a cluster.

    1. If there are no errors in the configuration, run the command:

      terraform apply
      
    2. Confirm the creation of resources.

    After this, all the necessary resources will be created in the specified folder and the IP addresses of the VMs will be displayed in the terminal. You can check that the resources exist and review their settings in the management console.

Examples

Creating a single-host cluster

To create a cluster with a single host, pass the --host parameter exactly once.

Let's say we need to create a ClickHouse cluster with the following characteristics:

  • Named mych.
  • In the production environment.
  • In the default network.
  • With a single ClickHouse host of the s1.nano class in the b0rcctk2rvtr8efcch64 subnet and the ru-central1-c availability zone.
  • With 20 GB of network SSD storage (network-ssd).
  • With one user, user1, with the password user1user1.
  • With one database, db1.

Run the command:

$ yc managed-clickhouse cluster create \
     --cluster-name mych \
     --environment production \
     --network-name default \
     --clickhouse-resource-preset s1.nano \
     --host type=clickhouse,zone-id=ru-central1-c,subnet-id=b0rcctk2rvtr8efcch64 \
     --clickhouse-disk-size 20 \
     --clickhouse-disk-type network-ssd \
     --user name=user1,password=user1user1 \
     --database name=db1

Creating a single-host cluster

Let's say we need to create a ClickHouse cluster and a network for it with the following characteristics:

  • Named mych.
  • In the PRESTABLE environment.
  • In the cloud with ID b1gq90dgh25иuebiu75o.
  • In a folder named myfolder.
  • In a new network named mynet.
  • With a single s2.micro class host in a new subnet named mysubnet and in the ru-central1-c availability zone. The mysubnet subnet will have the 10.5.0.0/24 range.
  • With 32 GB of fast network storage.
  • With the database name my_db.
  • With the username user1 and password user1user1.

The configuration file for the cluster looks like this:

provider "yandex" {
  token     = "<OAuth or static key of service account>"
  cloud_id  = "b1gq90dgh25иuebiu75o"
  folder_id = "${data.yandex_resourcemanager_folder.myfolder.id}"
  zone      = "ru-central1-c"
}

resource "yandex_mdb_clickhouse_cluster" "mych" {
  name        = "mych"
  environment = "PRESTABLE"
  network_id  = "${yandex_vpc_network.mynet.id}"

  clickhouse {
    resources {
      resource_preset_id = "s2.micro"
      disk_type_id       = "network-ssd"
      disk_size          = 32
    }
  }

  database {
    name = "my_db"
  }

  user {
    name     = "user1"
    password = "user1user1"
    permission {
      database_name = "my_db"
    }
  }

  host {
    type      = "CLICKHOUSE"
    zone      = "ru-central1-c"
    subnet_id = "${yandex_vpc_subnet.mysubnet.id}"
  }
}

resource "yandex_vpc_network" "mynet" {
  name = "mynet"
}

resource "yandex_vpc_subnet" "mysubnet" {
  name           = "mysubnet"
  zone           = "ru-central1-c"
  network_id     = "${yandex_vpc_network.mynet.id}"
  v4_cidr_blocks = ["10.5.0.0/24"]
}