© 2022 Yandex.Cloud LLC

Connecting to a ClickHouse database

Written by
Yandex Cloud
  • Before you start
  • Connecting to a host

In the Yandex Cloud infrastructure, ClickHouse server clusters are deployed and supported using Managed Service for ClickHouse.

To use a Managed Service for ClickHouse cluster host as a data source for DataSphere:

  1. Create a Managed Service for ClickHouse cluster with public access enabled for its hosts. You can also use an existing cluster with publicly accessible hosts.
  2. Configure the cluster security groups.
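
Once public access and security groups are configured, a quick reachability check from a notebook cell can help tell network problems apart from authentication problems. The sketch below is illustrative only: `is_port_open` is a hypothetical helper name, port 8443 is the HTTPS port used later in this article, and port 9440 is assumed to be the secure native port that clickhouse-driver uses when `secure=True`.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (replace the placeholder with a real host name):
# for port in (8443, 9440):
#     print(port, is_port_open('<FQDN of ClickHouse host>', port))
```

A `False` result most likely points to missing public access or a security group rule that does not allow the port, but this check only tests TCP reachability, not TLS or credentials.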

Before you start

If a project is already open, switch to the tab with a notebook.

If not, open the project:

  1. In the management console, open the DataSphere section in the folder where you work with DataSphere projects.
  2. Go to the Projects tab.
  3. Select the project you want to open and click the icon next to it.
  4. Choose Open and wait for the project to open.

Connecting to a host

To connect to Managed Service for ClickHouse cluster hosts:

  1. Get an SSL certificate. To do this, run the following command in a notebook cell:

    #!:bash
    mkdir -p ~/.clickhouse-client
    wget "https://storage.yandexcloud.net/cloud-certs/CA.pem" -O ~/.clickhouse-client/root.crt && \
    chmod 0600 ~/.clickhouse-client/root.crt
    
  2. Establish a connection to the database. To do this, run one of the following code samples in a notebook cell:

    Using the requests library:

    import requests

    # Pass the database and query as parameters so that requests URL-encodes them.
    url = 'https://{host}:8443/'.format(host='<FQDN of ClickHouse host>')
    params = {
        'database': '<DB name>',
        'query': 'SELECT version()',
    }
    # Credentials are sent in the ClickHouse HTTP authentication headers.
    auth = {
        'X-ClickHouse-User': '<DB username>',
        'X-ClickHouse-Key': '<DB user password>',
    }
    # Certificate downloaded in step 1.
    cacert = '/home/jupyter/.clickhouse-client/root.crt'
    rs = requests.get(url, params=params, headers=auth, verify=cacert)
    rs.raise_for_status()
    print(rs.text)

    If the connection and the test query succeed, the ClickHouse version is displayed:

    21.3.13.9

    Using the clickhouse-driver library:

    from clickhouse_driver import Client

    # secure=True enables TLS for the connection.
    client = Client(host='<FQDN of ClickHouse host>',
                    user='<DB username>',
                    password='<DB user password>',
                    database='<DB name>',
                    secure=True)
    client.execute('SELECT version()')

    If the connection and the test query succeed, the ClickHouse version is displayed:

    [('21.3.13.9',)]
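
The samples above hard-code the username and password in the notebook. One possible alternative, sketched below, reads them from environment variables instead. The helper name `build_clickhouse_request` and the variable names `CH_USER` and `CH_PASSWORD` are hypothetical, and the sketch assumes that the credentials have already been made available to the notebook as environment variables (for example, through project secrets).

```python
import os


def build_clickhouse_request(host, database, query, env=os.environ):
    """Assemble the URL, query parameters, and auth headers for the HTTPS interface.

    CH_USER and CH_PASSWORD are hypothetical variable names chosen for this sketch.
    """
    url = f"https://{host}:8443/"
    # requests URL-encodes these values when they are passed as params.
    params = {"database": database, "query": query}
    headers = {
        "X-ClickHouse-User": env["CH_USER"],
        "X-ClickHouse-Key": env["CH_PASSWORD"],
    }
    return url, params, headers
```

The returned values can then be passed to the same call as in the requests sample: `requests.get(url, params=params, headers=headers, verify=cacert)`. A missing variable raises `KeyError`, which surfaces a configuration problem before any request is sent.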
    
