Yandex Cloud
© 2023 Yandex.Cloud LLC
Yandex Data Transfer

Specifics of working with endpoints

Written by
Yandex Cloud
  • ClickHouse
  • Greenplum®
  • MongoDB
  • PostgreSQL
  • Yandex Data Streams
  • Oracle

Yandex Data Transfer has some performance limitations and specifics depending on the endpoint types.

ClickHouse

Transfers of the "Snapshot" and "Snapshot and increment" types (at the copying stage) from ClickHouse to ClickHouse don't support VIEW objects. In a ClickHouse source endpoint, every VIEW must be added to the "List of excluded tables" if the "List of included tables" is empty or omitted. If the "List of included tables" is non-empty, it must not contain VIEW objects.

The source supports MATERIALIZED VIEW objects but handles them as regular tables. This means that in transfers from ClickHouse to ClickHouse, MATERIALIZED VIEW objects are transferred as tables and not as MATERIALIZED VIEW objects.
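The VIEW rule above can be sketched as a small helper that builds the exclusion list. This is a minimal illustration, not part of the service: it assumes the rows were fetched beforehand from the ClickHouse `system.tables` table, that entries are written as `<database>.<table>`, and the database and table names are hypothetical.

```python
from typing import Iterable, List, Tuple

# Rows are assumed to come from a query such as:
#   SELECT database, name, engine FROM system.tables WHERE database = 'analytics'
TableRow = Tuple[str, str, str]


def views_to_exclude(rows: Iterable[TableRow]) -> List[str]:
    """Return "<database>.<table>" names for plain VIEW objects only."""
    # ClickHouse reports the engine 'View' for a VIEW and 'MaterializedView'
    # for a MATERIALIZED VIEW. Only plain views are collected here, because
    # materialized views are transferred as regular tables.
    return sorted(f"{db}.{name}" for db, name, engine in rows if engine == "View")


# Hypothetical sample rows for illustration.
rows = [
    ("analytics", "events", "MergeTree"),
    ("analytics", "events_daily", "View"),
    ("analytics", "events_mv", "MaterializedView"),
]
print(views_to_exclude(rows))  # ['analytics.events_daily']
```

The resulting names are candidates for the "List of excluded tables" when the "List of included tables" is left empty.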

If replication is enabled on a ClickHouse target, the engines for recreating tables are selected depending on the source type:

  • When transferring data from row-oriented database management systems, the ReplicatedReplacingMergeTree and ReplacingMergeTree engines are used.
  • When transferring data from ClickHouse, the ReplicatedMergeTree engines are used.

Greenplum®

Transfers from Greenplum® to Greenplum® and from Greenplum® to PostgreSQL don't support moving a schema in the current Yandex Data Transfer version. If there are user-defined table data types in these transfers, create these data types in the target database manually before starting a transfer. To manually transfer a schema, use pg_dump.
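As a rough sketch of the manual schema transfer, the helper below assembles a `pg_dump` command line that exports only the schema. The host, user, database, and file names are hypothetical, and the `--no-owner` and `--no-privileges` flags are an optional assumption, not a service requirement.

```python
import shlex
from typing import List


def pg_dump_schema_command(host: str, port: int, user: str,
                           dbname: str, out_file: str) -> List[str]:
    """Build a pg_dump command line that exports the schema without data."""
    return [
        "pg_dump",
        "--host", host,
        "--port", str(port),
        "--username", user,
        "--schema-only",    # DDL only, no table data
        "--no-owner",       # optional: skip ownership commands
        "--no-privileges",  # optional: skip GRANT/REVOKE statements
        "--file", out_file,
        dbname,             # database name as the positional argument
    ]


# Hypothetical connection parameters for illustration.
cmd = pg_dump_schema_command(
    "gp-master.example.net", 5432, "transfer_user", "sales", "schema.sql"
)
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a host where `pg_dump` is installed. The resulting file can then be reviewed and applied to the target database before the transfer is started.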

The source treats FOREIGN TABLE and EXTERNAL TABLE objects as regular views and handles them using the general algorithm for VIEW objects.

The source never transfers data from a MATERIALIZED VIEW, even during transfers from Greenplum® to a different database.

MongoDB

By default, the service does not shard collections transferred to a sharded cluster. For more information, see Preparing for the transfer.

Transfers to MongoDB do not migrate indexes. When a transfer changes its status to Replicating, manually create an index for each sharded collection:

db.<collection name>.createIndex(<index properties>)

For more information about the createIndex() function, see the MongoDB documentation.
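When many collections are involved, the `createIndex()` calls can be generated from a description of the indexes. The sketch below only renders the shell commands and does not connect to MongoDB. The collection names, index keys, and options are hypothetical and should be taken from the source cluster.

```python
import json
from typing import Dict, List


def create_index_commands(indexes: Dict[str, List[dict]]) -> List[str]:
    """Render one createIndex() shell command per index specification."""
    commands = []
    for collection, specs in indexes.items():
        for spec in specs:
            keys = json.dumps(spec["keys"])
            options = json.dumps(spec.get("options", {}))
            # db.getCollection() also works for names that are not valid
            # JavaScript identifiers.
            commands.append(
                f"db.getCollection({json.dumps(collection)})"
                f".createIndex({keys}, {options})"
            )
    return commands


# Hypothetical index description for illustration.
commands = create_index_commands({
    "orders": [{"keys": {"customer_id": 1}, "options": {"name": "customer_id_1"}}],
})
for command in commands:
    print(command)
```

The printed commands can be reviewed and then run in the MongoDB shell against the target cluster once the transfer is in the Replicating status.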

PostgreSQL

The source never transfers data from a MATERIALIZED VIEW, even during transfers from PostgreSQL to a different database. In transfers from PostgreSQL to PostgreSQL, a MATERIALIZED VIEW is treated as a regular view and handled using the general algorithm for VIEW.

The source treats a FOREIGN TABLE as a regular view and handles it using the general algorithm for VIEW objects.

If the source of a transfer from PostgreSQL to PostgreSQL has a non-empty "List of included tables" specified, user-defined data types that are present in these tables aren't transferred. If this is the case, please transfer your custom data types manually.

When transferring partitioned tables, take the following into account:

  • For tables partitioned with declarative partitioning:

    • The user needs access to the master table and all its partitions on the source.
    • The transfer is performed on an as-is basis: all partitions and the master table will be created on the target.
    • At the copying stage, partitions are transferred to the target independently of each other. This enables the user to speed up the transfer by enabling sharding in the transfer settings.
    • At the replication stage, data will automatically be placed into the required partitions.
    • If new partitions are created on the source after the transfer has entered the replication stage, you need to transfer them to the target manually.
    • The user can transfer only some of the partitions to the target. To do this, the user must add these partitions to the List of included tables or revoke access to the unneeded partitions on the source.
  • For tables partitioned with the inheritance method:

    • The user needs access to the parent table and all child tables.
    • At the copying stage, data from the child tables is not duplicated in the parent table. To transfer data from the child tables, they must be explicitly specified in the list of tables to be transferred.
    • At the copying stage, the child tables are transferred to the target independently of each other. This enables the user to speed up the transfer by enabling sharding in the transfer settings.
    • At the replication stage, data will automatically be placed into the required child tables or the parent table if inheritance is not used for partitioning.
    • If new child tables are created on the source after the transfer has entered the replication stage, you need to transfer them to the target manually.

    When migrating a database from PostgreSQL to another DBMS, the user can enable the Merge inherited tables option in the source endpoint. In this case:

    • Only the parent table will be transferred to the target, and it will contain the data of those child tables which were explicitly specified in the list of tables to be transferred.
    • The user can still speed up the transfer by enabling sharding in the transfer settings, because child tables from the source are concurrently copied to the common table on the target.
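Because partitions or child tables created on the source during the replication stage have to be transferred manually, it can help to compare the two sides periodically. The sketch below is a minimal illustration that assumes the table names were collected beforehand on each side, for example from the `pg_inherits` system catalog. The table names are hypothetical.

```python
from typing import Iterable, List

# Names are assumed to come from a query such as:
#   SELECT inhrelid::regclass
#   FROM pg_inherits
#   WHERE inhparent = 'public.measurements'::regclass
# run separately on the source and on the target.


def partitions_to_transfer_manually(source: Iterable[str],
                                    target: Iterable[str]) -> List[str]:
    """Return partitions (or child tables) present on the source only."""
    return sorted(set(source) - set(target))


# Hypothetical partition lists for illustration.
source_partitions = [
    "public.measurements_2023_01",
    "public.measurements_2023_02",
    "public.measurements_2023_03",
]
target_partitions = [
    "public.measurements_2023_01",
    "public.measurements_2023_02",
]
print(partitions_to_transfer_manually(source_partitions, target_partitions))
# ['public.measurements_2023_03']
```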

Yandex Data Streams

By default, a separate table is created for every partition when data is transferred from Data Streams to ClickHouse. To write all data to a single table, specify conversion rules in the advanced settings of the source endpoint.

Oracle

The source ignores VIEW and MATERIALIZED VIEW objects in transfers of any type.
