© 2023 Yandex.Cloud LLC

Connecting to a ClickHouse database

Written by
Yandex Cloud
  • Before you begin
  • Connecting to a host

In Yandex Cloud, ClickHouse clusters are deployed and maintained using Managed Service for ClickHouse.

Before you begin

  1. Create a new Managed Service for ClickHouse cluster and enable public access to its hosts. You can also use an existing cluster with publicly available hosts.
  2. Configure cluster security groups.
  3. Open the DataSphere project:

    1. Select the desired project in your community or on the DataSphere homepage in the Recent projects tab.

    2. Click Open project in JupyterLab and wait for it to load.
    3. Open the notebook tab.
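
Before moving on, it can help to confirm from a notebook cell that the security groups actually let traffic through. The sketch below is one possible check, not part of the official procedure: `port_is_open` is a hypothetical helper, port 8443 is the HTTPS port used later in this article, and port 9440 is assumed to be the secure native-protocol port that clickhouse-driver uses by default when `secure=True`.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Hypothetical usage; replace the placeholder with a real host FQDN:
# for port in (8443, 9440):
#     print(port, port_is_open("<FQDN of ClickHouse host>", port))
```

A `False` result for either port usually points to a missing security group rule or to a host without public access, although a wrong host name produces the same result.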

Connecting to a host

To connect to Managed Service for ClickHouse cluster hosts:

  1. Get an SSL certificate. To do this, run the following command in a notebook cell:

    #!:bash
    mkdir -p ~/.clickhouse-client
    wget "https://storage.yandexcloud.net/cloud-certs/CA.pem" -O ~/.clickhouse-client/root.crt && \
    chmod 0600 ~/.clickhouse-client/root.crt
    
  2. Establish a connection to the database. To do this, run one of the following code samples in a notebook cell, using either the requests library or the clickhouse-driver library.

    Using the requests library

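    import requests

    # HTTPS interface of the ClickHouse host (port 8443).
    url = 'https://{host}:8443/'.format(host='<FQDN of ClickHouse host>')
    # Passing the query through params lets requests URL-encode it.
    params = {
        'database': '<DB name>',
        'query': 'SELECT version()',
    }
    # Credentials are sent in headers rather than in the URL.
    auth = {
        'X-ClickHouse-User': '<DB username>',
        'X-ClickHouse-Key': '<DB user password>',
    }
    # CA certificate downloaded in the previous step.
    cacert = '/home/jupyter/.clickhouse-client/root.crt'
    rs = requests.get(url, params=params, headers=auth, verify=cacert)
    rs.raise_for_status()
    print(rs.text)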
    

    If the connection and the test query succeed, the ClickHouse version is displayed:

    21.3.13.9
    
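    Using the clickhouse-driver library

    from clickhouse_driver import Client

    client = Client(host='<FQDN of ClickHouse host>',
                    user='<DB username>',
                    password='<DB user password>',
                    database='<DB name>',
                    secure=True,  # TLS connection to the host
                    # CA certificate downloaded in the previous step
                    ca_certs='/home/jupyter/.clickhouse-client/root.crt')
    client.execute('SELECT version()')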
    

    If the connection and the test query succeed, the ClickHouse version is displayed:

    [('21.3.13.9',)]
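
The two variants return results in different shapes: clickhouse-driver returns a list of tuples, while the HTTP interface returns plain text. The sketch below shows one way to turn that text into rows, assuming the response uses ClickHouse's default TabSeparated output format. `parse_tsv` is a hypothetical helper; it does not undo TabSeparated escaping (such as `\t` or `\n` inside values), and it leaves every field as a string.

```python
from typing import List


def parse_tsv(text: str) -> List[List[str]]:
    """Split a TabSeparated response body into rows of string fields."""
    if not text:
        # An empty body means the query returned no rows.
        return []
    # Each row ends with a newline; drop only the final one before splitting.
    body = text[:-1] if text.endswith("\n") else text
    return [line.split("\t") for line in body.split("\n")]


# Example with a literal response body instead of a live request:
rows = parse_tsv("21.3.13.9\n")
print(rows)
```

With the requests sample above, the same helper could be applied as `parse_tsv(rs.text)`.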
    

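Both samples contain the database password as a literal string. One way to keep it out of the notebook is to read the connection settings from environment variables, sketched below. This assumes the values are exposed to the notebook as environment variables (for example, through DataSphere secrets); the names `CH_HOST`, `CH_DB`, `CH_USER`, and `CH_PASSWORD` and the helper `load_clickhouse_settings` are hypothetical.

```python
import os

# Hypothetical variable names; use whatever names your secrets are stored under.
REQUIRED = ("CH_HOST", "CH_DB", "CH_USER", "CH_PASSWORD")


def load_clickhouse_settings(env=os.environ):
    """Collect connection settings from environment variables."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        # Fail early with the full list of missing names.
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {
        "host": env["CH_HOST"],
        "database": env["CH_DB"],
        "user": env["CH_USER"],
        "password": env["CH_PASSWORD"],
    }
```

The dictionary keys match the keyword arguments used in the clickhouse-driver sample, so the result could be passed on as `Client(**settings, secure=True)`.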