Relationships between resources in Data Transfer
Yandex Data Transfer moves data between DBMSs, object storage, and message brokers. This shortens the migration period and minimizes downtime when you switch to a new database.
Yandex Data Transfer is configured through the standard Yandex Cloud interfaces.
The service can also maintain a permanent replica of a database. It transfers the database schema from the source to the target automatically.
Endpoint
An endpoint is a configuration used to connect to the source or target service. In addition to connection settings, an endpoint may specify which data takes part in the transfer and how that data is processed during the transfer.
The following can be the data source or target:
Service | Source | Target |
---|---|---|
ClickHouse database (your own or in Managed Service for ClickHouse) | Yes | Yes |
MongoDB database (your own or in Managed Service for MongoDB) | Yes | Yes |
MySQL database (your own or in Managed Service for MySQL) | Yes | Yes |
PostgreSQL database (your own or in Managed Service for PostgreSQL) | Yes | Yes |
Apache Kafka® topic (your own or in Managed Service for Apache Kafka®) | Yes | Yes |
Yandex Data Streams data stream | Yes | Yes |
YDB database in Managed Service for YDB | - | Yes |
Yandex Object Storage bucket | - | Yes |
Transfer
A transfer is the process of transmitting data between the source service and the target service. A transfer must be in the same folder as the endpoints it uses.
Transfer types
The following types of transfers are available:
- Copy: Copies a snapshot of the source to the target.
- Replicate: Continuously receives changes from the source and applies them to the target. Initial data synchronization is not performed.
- Copy and replicate: Transfers the current state of the source to the target and keeps it up-to-date.
For more information about the differences between transfer types, see Transfer lifecycle.
Compatibility of sources and targets
Different services can act as a source and as a target. The possible source and target combinations are:
Source \ Target | Apache Kafka® | PostgreSQL | MySQL | MongoDB | Managed Service for YDB | ClickHouse | Object Storage | Yandex Data Streams |
---|---|---|---|---|---|---|---|---|
PostgreSQL | Replicate¹ | Copy, replicate | - | - | Copy¹, replicate¹ | Copy¹, replicate¹ | Copy¹ | Replicate¹ |
MySQL | Replicate¹ | - | Copy, replicate | - | Copy¹, replicate¹ | Copy¹, replicate¹ | Copy¹ | Replicate¹ |
MySQL | - | - | Copy, replicate | - | Copy¹, replicate¹ | Copy¹, replicate¹ | Copy¹ | - |
MongoDB | - | - | - | Copy¹, replicate¹ | - | - | Copy¹ | - |
Oracle | - | Copy¹, replicate¹ | - | - | - | Copy¹, replicate¹ | - | - |
ClickHouse | - | - | - | - | - | Copy¹ | - | - |
Yandex Data Streams | - | - | - | - | Replicate¹ | Copy¹, replicate¹ | Replicate¹ | - |
Apache Kafka® | - | - | - | - | Replicate¹ | Replicate¹ | Replicate¹ | - |
¹ This feature is in the Preview stage.
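The compatibility table can be treated as a lookup from a source and target pair to the set of available transfer types. The sketch below transcribes only a few rows of the table for illustration; the names `TransferType`, `COMPATIBILITY`, and `is_supported` are hypothetical and are not part of any Data Transfer API.

```python
from enum import Enum


class TransferType(Enum):
    COPY = "copy"
    REPLICATE = "replicate"


COPY = TransferType.COPY
REPLICATE = TransferType.REPLICATE

# Excerpt of the compatibility table: (source, target) -> available types.
# Only a few rows are transcribed here. Pairs that are not listed are
# treated as unsupported, which is an assumption of this sketch.
COMPATIBILITY = {
    ("PostgreSQL", "PostgreSQL"): {COPY, REPLICATE},
    ("PostgreSQL", "ClickHouse"): {COPY, REPLICATE},
    ("PostgreSQL", "Object Storage"): {COPY},
    ("PostgreSQL", "Apache Kafka"): {REPLICATE},
    ("MongoDB", "MongoDB"): {COPY, REPLICATE},
    ("ClickHouse", "ClickHouse"): {COPY},
    ("Apache Kafka", "ClickHouse"): {REPLICATE},
}


def is_supported(source, target, transfer_type):
    """Return True if the table excerpt lists this type for the pair."""
    return transfer_type in COMPATIBILITY.get((source, target), set())


if __name__ == "__main__":
    print(is_supported("PostgreSQL", "ClickHouse", REPLICATE))
    print(is_supported("ClickHouse", "ClickHouse", REPLICATE))
```

A check like this is only as current as the transcribed rows, so the table itself remains the reference.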
Specifics of the service's work with sources and targets
ClickHouse
If replication is enabled on a ClickHouse target, the engines for recreating tables are selected depending on the source type:
- When transferring data from row-based DBMSs, the ReplicatedReplacingMergeTree and ReplacingMergeTree engines are used.
- When transferring data from ClickHouse, the ReplicatedMergeTree engine is used.
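The engine selection rule above can be written as a small function. This is a minimal sketch, not the service's implementation: the function name is hypothetical, and the set of sources covered by the first rule (`PostgreSQL` and `MySQL` here) is an assumption made for illustration.

```python
# Sources assumed to fall under the first rule in the list above.
# The exact set is an assumption of this sketch.
ASSUMED_FIRST_RULE_SOURCES = {"PostgreSQL", "MySQL"}


def clickhouse_target_engines(source_type):
    """Return the table engines named in the text for a replicated target."""
    if source_type == "ClickHouse":
        return ("ReplicatedMergeTree",)
    if source_type in ASSUMED_FIRST_RULE_SOURCES:
        return ("ReplicatedReplacingMergeTree", "ReplacingMergeTree")
    # The text does not say what happens for other source types.
    raise ValueError("no engine rule documented for source: " + source_type)


if __name__ == "__main__":
    print(clickhouse_target_engines("PostgreSQL"))
    print(clickhouse_target_engines("ClickHouse"))
```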
Bandwidth
Data can be copied at up to 15 MB/s. Copying a 100 GB database usually takes 2-3 hours. The exact time depends on the target settings.
During replication, throughput can reach 20,000-30,000 transactions per second.
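The copy figures can be checked with simple arithmetic. The sketch below estimates copy time from database size and sustained speed; it assumes decimal units (1 GB = 1000 MB) and a constant speed, and the function name is hypothetical.

```python
def estimate_copy_hours(size_gb, speed_mb_per_s=15.0):
    """Estimate copy time in hours at a constant speed.

    Assumes decimal units (1 GB = 1000 MB), which is a simplification.
    """
    if speed_mb_per_s <= 0:
        raise ValueError("speed must be positive")
    seconds = size_gb * 1000 / speed_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # At the 15 MB/s peak, 100 GB takes a little under 2 hours.
    print(round(estimate_copy_hours(100), 2))
    # At a sustained 10 MB/s, the same database takes closer to 3 hours.
    print(round(estimate_copy_hours(100, 10), 2))
```

Under these assumptions, the 2-3 hour range for 100 GB corresponds to a sustained speed of roughly 9-14 MB/s, slightly below the stated peak.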