Specifics of working with endpoints

Written by

Updated at January 16, 2024

ClickHouse®
Greenplum®
MongoDB
MySQL
PostgreSQL
- Data transfer rate
Yandex Data Streams
Oracle

Yandex Data Transfer has some performance limitations and specifics depending on the endpoint types.

ClickHouse®

Snapshot and Snapshot and increment transfers (in the copy step) from ClickHouse® to ClickHouse® do not support VIEW objects. In source endpoints of the ClickHouse® type, VIEW objects must be on the "List of excluded tables" if the "List of included tables" is empty or not specified. If the "List of included tables" is not empty, it must not contain VIEW objects.

The source supports MATERIALIZED VIEW objects but handles them as regular tables. This means that in transfers from ClickHouse® to ClickHouse®, MATERIALIZED VIEW items are transferred as tables rather than MATERIALIZED VIEW objects.

If replication is enabled on a ClickHouse® target, the engines for recreating tables are selected depending on the source type:

When transferring data from row-oriented database management systems, the ReplicatedReplacingMergeTree and ReplacingMergeTree engines are used.
When transferring data from ClickHouse®, the ReplicatedMergeTree engines are used.

Greenplum®

Transfers from Greenplum® to Greenplum® and from Greenplum® to PostgreSQL do not support moving a schema in the current Yandex Data Transfer version. If there are user-defined table data types in these transfers, create these data types in the target database manually before starting a transfer. To manually transfer a schema, use pg_dump.

The source treats FOREIGN TABLE and EXTERNAL TABLE as a regular view and uses the general algorithm for VIEW objects when handling them.

Data stored in a MATERIALIZED VIEW is not transferred. To transfer MATERIALIZED VIEW data, create an ordinary VIEW that refers to the MATERIALIZED VIEW to be transferred.

MongoDB

By default, the service does not shard collections transferred to a sharded cluster. For more information, see Preparing for the transfer.

Transfers to MongoDB do not migrate indexes. When a transfer changes its status to Replicating, manually create an index for each sharded collection:

db.<collection_name>.createIndex(<index_properties>)

For more information about the createIndex() function, see the MongoDB documentation.

MySQL

Types of data in MySQL target tables may have the unsigned property:

unsigned smallint: Values greater than 2^31 are too large to be stored on the target.
unsigned bigint: Values greater than 2^63 are too large to be stored on the target.

A primary key in MySQL cannot be a string of unlimited length.

The MySQL target ignores the source schema name and creates tables in the schema whose name is specified in the endpoint settings.

PostgreSQL

When transferring data of the TIMESTAMP WITHOUT TIME ZONE type, the time zone set in the timezone parameter of the PostgreSQL source database applies.

Example

The column with the TIMESTAMP WITHOUT TIME ZONE data type contains the 1970-01-01 00:00:00 value. The way the value will be transferred depends on the timezone parameter in the database:

If the parameter is set to Etc/UTC, the time will be transferred as 1970-01-01 00:00:00+00.
If the parameter is set to Europe/Moscow, the time will be transferred as 1970-01-01 00:00:00+03.

Data stored in a MATERIALIZED VIEW is not transferred. To transfer MATERIALIZED VIEW data, create an ordinary VIEW that refers to the MATERIALIZED VIEW to be transferred.

If the definition of the VIEW to be transferred contains an invocation of the VOLATILE function, the transfer reads data from this VIEW with the READ UNCOMMITTED isolation level. No consistency between the VIEW data and the data of other objects being transferred is guaranteed. Reading data from a MATERIALIZED VIEW in the VIEW definition are equivalent to invoking the VOLATILE function.

The source treats FOREIGN TABLE as a regular view and uses the general algorithm for views when handling them.

If the source of a transfer from PostgreSQL to PostgreSQL has a non-empty "List of included tables" specified, user-defined data types in these tables are not transferred. If this is the case, transfer your custom data types manually.

When transferring partitioned tables, take the following into account:

For tables partitioned with declarative partitioning:
- The user needs access to the master table and all its partitions on the source.
- The transfer is performed based on the as is principle: all partitions and the master table will be created on the target.
- At the copying stage, partitions are transferred to the target independently of each other. To speed up their transfer, configure parallel copy.
- At the replication stage, data will automatically be placed into the required partitions.
- If new partitions are created on the source after the transfer has entered the replication stage, you need to transfer them to the target manually.
- The user can only transfer a part of the partitions to the target. To do this, the user must add these partitions to the List of included tables or close access to unnecessary partitions on the source.
For tables partitioned with the inheritance method:
- The user needs access to the parent table and all child tables.
- At the copying stage, data from the child tables is not duplicated in the parent table. To transfer data from the child tables, they must be explicitly specified in the list of tables to be transferred.
- At the copying stage, the child tables are transferred to the target independently of each other. To speed up their transfer, configure parallel copy.
- At the replication stage, data will automatically be placed into the required child tables or the parent table if inheritance is not used for partitioning.
- If the child tables are created on the source after the transfer has entered the replication stage, you need to transfer them to the target manually.
When migrating a database from PostgreSQL to another DBMS, the user can enable the Merge inherited tables option in the source endpoint. In this case:
- Only the parent table will be transferred to the target, and it will contain the data of those child tables which were explicitly specified in the list of tables to be transferred.
- The user can still speed up the transfer because child tables from the source are concurrently copied to the common table on the target. To speed up the transfer, enable parallel copy.

Data transfer rate

Average rate of transferring a table to a single stream for transfers from PostgreSQL to:

PostgreSQL: 21 Mbps.
Another DBMS: 3.5 Mbps.

You can speed up data transfers using sharded copying.

Yandex Data Streams

By default, a separate table is created for every partition when data is transferred from Data Streams to ClickHouse®. For all data to be entered in a single table, specify conversion rules in the advanced endpoint settings for the source.

Oracle

The source ignores VIEW and MATERIALIZED VIEW objects in transfers of any type.

ClickHouse® is a registered trademark of ClickHouse, Inc.

Specifics of working with endpoints

ClickHouse®ClickHouse®

Greenplum®Greenplum®

MongoDBMongoDB

MySQLMySQL

PostgreSQLPostgreSQL

Data transfer rateData transfer rate

Yandex Data StreamsYandex Data Streams

OracleOracle

Was the article helpful?