Configuring access to Object Storage from a ClickHouse® cluster
Managed Service for ClickHouse® supports using Yandex Object Storage to:
- Enable ML models, data format schemas, and your own geobase.
- Process data stored in Object Storage, provided it is in one of the supported ClickHouse® formats.
To access Object Storage bucket data from a cluster, set up password-free access to the bucket using a service account:
- Connect a service account to a cluster.
- Set up access rights for the service account.
- Get a link to the bucket object, which you can use to perform operations with the cluster data.
See Examples of working with objects.
Connecting a service account to a cluster
- When creating or updating a cluster, either select an existing service account or create a new one.
- Make sure that this account is assigned the correct roles from the storage.* role group. If needed, assign it the necessary roles, e.g., storage.viewer and storage.uploader.
Tip
To link Managed Service for ClickHouse® clusters to Object Storage, we recommend using service accounts created specifically for this purpose. This lets you work with any bucket, including those for which public access is undesirable or impossible.
Setting up access rights
- In the management console, select the folder where the bucket is located. If there is no bucket, create one and populate it with the required data.
- Select Object Storage.
- Set up the bucket ACL or object ACL:
  - In the list of buckets or objects, select the required bucket or object and click the menu icon next to it.
  - Click Bucket ACL or Object ACL.
  - In the Select a user drop-down list, specify the service account connected to the cluster.
  - Click Add.
  - Set the required permissions for the service account from the drop-down list.
  - Click Save.
Note
If necessary, revoke access from one or more users by clicking Cancel in the appropriate line.
Getting a link to an object
To work with an Object Storage object from Managed Service for ClickHouse®, you need a link to that object in the bucket.
A link such as https://storage.yandexcloud.net/<bucket_name>/<object_name>?X-Amz-Algorithm=... must be shortened to https://storage.yandexcloud.net/<bucket_name>/<object_name>. To do this, delete the query string: the question mark and all parameters after it.
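The same clean-up can be done programmatically. The following is a minimal Python sketch that uses only the standard library; the function name strip_query_string and the sample bucket name, object name, and query parameters are hypothetical and serve only as an illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query_string(link: str) -> str:
    """Return the link without its query string and fragment."""
    parts = urlsplit(link)
    # Keep the scheme, host, and path; drop the query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Hypothetical signed link; the bucket and object names are placeholders.
signed = (
    "https://storage.yandexcloud.net/my-bucket/table.tsv"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Expires=3600"
)
print(strip_query_string(signed))
```

A link that has no query string passes through unchanged, so the function can be applied to any object link before it is used in a query.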
Examples of working with objects
You can use object links in the https://storage.yandexcloud.net/<bucket_name>/<object_name> format to work with geotags and schemas or to use the s3 table function and the S3 table engine.
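The S3 table engine is similar to the File engine, except that the data is stored in S3-compatible storage such as Object Storage rather than in a local file. You read and write the data with standard SELECT and INSERT queries.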
The s3 table function provides the same functionality as the S3 table engine, but you don't need to create a table before using it.
For example, if the Object Storage bucket contains a table.tsv file that stores table data in TSV format, you can create a table or function that works with this file. First set up password-free access and obtain a link to the table.tsv file.
- Create a table:
  CREATE TABLE test (n Int32) ENGINE = S3('https://storage.yandexcloud.net/<bucket_name>/table.tsv', 'TSV');
- Run test queries to the table:
  INSERT INTO test VALUES (1);
  SELECT * FROM test;

  ┌─n─┐
  │ 1 │
  └───┘
- Insert data:
  INSERT INTO FUNCTION s3('https://storage.yandexcloud.net/<bucket_name>/table.tsv', 'TSV', 'n Int32') VALUES (1);
- Run a test query:
  SELECT * FROM s3('https://storage.yandexcloud.net/<bucket_name>/table.tsv', 'TSV', 'n Int32');

  ┌─n─┐
  │ 1 │
  └───┘
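If the same object is queried from scripts, the s3 statements shown above can be generated from the bucket and object names. The sketch below only builds the SQL text; it does not connect to a cluster. The helper names (object_link, s3_select, s3_insert) and the bucket name my-bucket are hypothetical, and the quote check is a simple guard that assumes otherwise trusted input.

```python
ENDPOINT = "https://storage.yandexcloud.net"


def _check(*args: str) -> None:
    # Naive guard: refuse values that would break out of the SQL string literal.
    for arg in args:
        if "'" in arg:
            raise ValueError(f"single quote not allowed in {arg!r}")


def object_link(bucket: str, name: str) -> str:
    """Build a link in the <endpoint>/<bucket_name>/<object_name> form."""
    return f"{ENDPOINT}/{bucket}/{name}"


def s3_select(link: str, fmt: str, structure: str) -> str:
    """Return a SELECT statement that reads the object via the s3 table function."""
    _check(link, fmt, structure)
    return f"SELECT * FROM s3('{link}', '{fmt}', '{structure}');"


def s3_insert(link: str, fmt: str, structure: str, values: str) -> str:
    """Return an INSERT statement that writes rows via the s3 table function."""
    _check(link, fmt, structure)
    return f"INSERT INTO FUNCTION s3('{link}', '{fmt}', '{structure}') VALUES {values};"


link = object_link("my-bucket", "table.tsv")  # hypothetical bucket name
print(s3_insert(link, "TSV", "n Int32", "(1)"))
print(s3_select(link, "TSV", "n Int32"))
```

The generated text can then be passed to whichever ClickHouse® client you already use.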
ClickHouse® is a registered trademark of ClickHouse, Inc.