Troubleshooting in Data Transfer
This section describes typical problems that may arise during transfer activation or operation, and the relevant resolution methods.
- General
- API errors
- Network
- ClickHouse
- MongoDB
- MySQL
- PostgreSQL
- Yandex Managed Service for YDB
- Yandex Data Streams
Key points about preparing sources and targets for a transfer are described in the relevant section.
Limitations on sources and targets while a transfer is running are described in Working with databases during the transfer.
General
Long transfer activation
Solution: if a transfer being activated remains in the Creating status for a long time, this is not an error. It takes time to create the Yandex Compute Cloud resources allocated individually to each transfer. For some sources, the database schema is also extracted, which takes additional time.
Duplicate rows in a target database
Possible causes:
- The target database contains pre-transfer data.
  Solution: delete the data from the target database before activating the transfer, or set the Drop cleanup policy in the target endpoint.
- There are no primary keys in the target database tables.
  Solution: make sure the tables in the target database have primary keys (a sketch of adding one follows below).
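If the duplicates come from a table without a primary key, you can add one in the target database before reactivating the transfer. A minimal sketch in SQL, assuming a hypothetical table orders with a unique, non-null id column:

-- Hypothetical table and column names; adjust to your schema.
-- The column (or column set) must be unique and contain no NULLs.
ALTER TABLE orders ADD CONSTRAINT orders_pk PRIMARY KEY (id);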
Insufficient resources
Error message:
Warn(Activate): Snapshot loading failed:
snapshot tasks failed: main uploader failed:
errors detected on secondary workers: secondary worker #3 failed:
pod instance restarted 5h41m39.092446028s ago
Solution: contact technical support or your account manager to request more transfer resources.
Required user permissions missing
Error message:
Warn(Activate): failed to load schema using pg_dump:
unable to execute pg_dump to get sequence data:
Unable to exec pg_dump: exit status 1;
stderr: pg_dump: error: query failed: ERROR: permission denied for
Solution: prepare the source and reactivate the transfer.
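The exact privileges are listed in the source preparation instructions. As a rough sketch, a PostgreSQL source user typically needs at least read access to the schema and sequences being dumped; the schema and user names below are placeholders:

-- Placeholder schema and user names; grant per the source preparation guide.
GRANT USAGE ON SCHEMA public TO transfer_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO transfer_user;
GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO transfer_user;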
API errors
Error example:
{"code": 13, "message": "internal"}
Solution: contact technical support or your account manager and provide the request_id. If you use curl for API calls, add the -v flag to facilitate error diagnostics.
Network
Overlapping IP address ranges
Error message:
YC: unable to resolve instance group:
unable to resolve net topology: subnet address space collision detected:
subnet <ID of subnet 1> [<IP range of subnet 1>]
collides with subnet <ID of subnet 2> [<IP range of subnet 2>]
The error occurs if the source and target hosts are in different subnets within Yandex Cloud but have overlapping IP address ranges.
Solution: create a new target cluster and make sure the IP address ranges of the subnets hosting the source and target cluster hosts involved in the transfer do not overlap.
No server connection
There is no connection because the specified subnet has no preconfigured egress NAT gateway.
Error message:
Can't connect to server: Can't ping server:
dial tcp <address of an endpoint's host>:<port>: connect: connection timed out
A transfer fails if one of its endpoints is of the on_premise type and the other endpoint points to a subnet with no egress NAT gateway.
Solution: disable the endpoint setting that points to this subnet and reactivate the transfer.
Blocking a transfer's IP address
Solution: allow connections to the transfer through addresses and ranges used by Data Transfer.
ClickHouse
No new tables are added
No new tables are added to Snapshot and increment transfers.
Solution:
- Deactivate and reactivate the transfer.
- Create a table in the target database manually.
- Create a separate transfer of the Snapshot type and add only the newly created tables to it. Deactivating the original Snapshot and increment transfer is not required.
MongoDB
Collection key size exceeds 5 MB
Error message:
Warn(replication): Usage of bulk objects in 'database <DB name>'
breaks change event log, transfer is stopping.
Reason: (Location<item number>) Tried to create string longer than 16MB.
If the collection key size exceeds 5 MB, transfers of the Increment type crash due to MongoDB internal limits on the size of user objects.
Solution: exclude any collections that exceed MongoDB limits from the transfer and reactivate it.
Collection object size exceeds 16 MB
Error message:
Warn(replication): Usage of bulk objects in 'collection '<DB name>.<collection name>''
breaks change event log, transfer is stopping.
Reason: (BSONObjectTooLarge) BSONObj size: <object size> (<object size in hex>) is invalid.
Size must be between 0 and 16793600(16MB).
If the collection object size exceeds 16 MB, transfers of Increment type crash due to MongoDB internal limits on user object size.
Solution: exclude any collections that exceed MongoDB limits from the transfer and reactivate it.
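To find the offending documents before excluding an entire collection, you can check BSON sizes with an aggregation. A sketch for mongosh, assuming MongoDB 4.4 or later (required for the $bsonSize operator) and a placeholder collection name:

// Placeholder collection name; lists documents larger than ~15 MB.
db.my_collection.aggregate([
  { $project: { _id: 1, size: { $bsonSize: "$$ROOT" } } },
  { $match: { size: { $gt: 15 * 1024 * 1024 } } },
  { $sort: { size: -1 } }
])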
No table found
Error message:
Unable to find any tables
No collections could be retrieved from the database. The user may be missing permissions for the database used in the transfer.
Solution: grant the user the transfer uses to connect to the source the readWrite permission for the database to be transferred (see the sketch below).
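A minimal sketch of granting the role in mongosh, assuming the user is defined in the admin database; the user and database names are placeholders:

// Run as an administrator; grants read/write access to the transferred database.
db.getSiblingDB("admin").grantRolesToUser(
  "transfer_user",
  [ { role: "readWrite", db: "my_database" } ]
)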
Error when transferring a sharded cluster
Solution: in the Runtime environment → Yandex Cloud → Sharded copying parameters transfer parameter, specify the number of instances equal to the number of collections being transferred.
MySQL
A single transaction's log size exceeds 4 GB
Error message:
Last binlog file <file name:file size> is more than 4GB
If the log size of a single transaction exceeds 4 GB, activation of Increment or Snapshot and increment transfers fails due to MySQL internal limits on the size of a single transaction's log.
Solution: reactivate the transfer.
No new tables are added
No new tables are added to Snapshot and increment transfers.
Solution:
- Deactivate and reactivate the transfer.
- Create a table in the target database manually.
- Create a separate transfer of the Snapshot type and add only the newly created tables to it. Deactivating the original Snapshot and increment transfer is not required.
Error when transferring from AWS RDS for MySQL
Snapshot and increment and Increment transfers from an Amazon RDS for MySQL source may fail with a table loading error.
Error example:
Failed to execute LoadSnapshot:
Cannot load table "name": unable to read rows and push by chunks:
unable to push changes: unable to execute per table push:
error: err: sql: transaction has already been committed or rolled back
rollback err: sql: transaction has already been committed or rolled back
The error is caused by the short storage time of the MySQL binary log files in Amazon RDS.
Solution:
Increase the storage time of the binary log using the command:
call mysql.rds_set_configuration('binlog retention hours', <hours>);
The maximum storage time is 168 hours (7 days). The default value is NULL (binary log files are not saved). For more information, see the Amazon RDS documentation.
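For example, to keep the binary log for 24 hours and then check the setting (assuming the standard Amazon RDS helper procedures):

call mysql.rds_set_configuration('binlog retention hours', 24);
-- Verify the current value.
call mysql.rds_show_configuration;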
PostgreSQL
Stopping a transfer's master transaction session
Error message:
Cannot set transaction snapshot:
ERROR: invalid snapshot identifier: "<snapshot ID>" (SQLSTATE 22023).
Possible cause:
- The source is running a cron job or other application that periodically terminates sessions that run too long.
- Someone manually terminated the master transaction.
- The source's CPU resources are insufficient to complete a query.
Solution: disable this cron job and add more CPU resources to the source. After you make the changes, reactivate the transfer.
Exceeding the connection time-to-live quota
Yandex Managed Service for PostgreSQL has a connection time-to-live quota of 12 hours.
Solution: if transferring a database requires more time, contact technical support to have your quota increased.
VIEW transfer error
Error message:
[ERROR] "... _view": no key columns found
New data cannot be replicated from views. In PostgreSQL to PostgreSQL transfers, only the views listed in the Table filter → Included tables source endpoint parameter are transferred.
Solution: manually exclude all views from the transfer, list them in the Table filter → Included tables source endpoint parameter, and then reactivate the transfer.
Error when adding a table entry by constraint
Solution: prepare the source as described in Preparing for the transfer.
Error when transferring all schema tables
Error message:
Unable to apply DDL of type 'TABLE', name '<schema>'.'<table>', error:
ERROR: schema "<schema name>" does not exist (SQLSTATE 3F000)
A transfer fails if the tables of a schema are listed in the <schema>.* format. This is due to the specifics of pg_dump, which is used for the schema transfer: when you specify all tables of a schema as <schema>.* in the Table filter → Included tables source endpoint parameter, the PostgreSQL types from this schema are not extracted, even if the schema contains tables that depend on these types.
Solution: Create PostgreSQL types in the target database manually.
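A minimal sketch, assuming the transferred tables depend on a hypothetical enum type order_status in the schema shop; recreate the type in the target exactly as it is defined in the source:

CREATE SCHEMA IF NOT EXISTS shop;
-- Hypothetical type; copy the real definition from the source database.
CREATE TYPE shop.order_status AS ENUM ('new', 'paid', 'shipped');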
Low transfer speed
Snapshot or Snapshot and increment transfers from PostgreSQL to PostgreSQL may slow down.
Possible causes:
- Write protocol.
  Under normal operation, a transfer is performed via the fast copy protocol. However, if a batch write conflict occurs, it switches to slow line-by-line writes. The more write conflicts, the lower the transfer speed.
  Solution: set the target endpoint's cleanup policy to Drop and exclude other writing processes.
- Parallel table reading.
  Parallel reading is only available for tables with a primary key in serial mode.
  Solution: specify the number of instances and processes in the Runtime environment → Yandex Cloud → Sharded copying parameters transfer parameter and reactivate the transfer.
Can't transfer child tables
Child tables are either not transferred or transferred without data if a table is partitioned.
Solution: set the following source endpoint parameters:
- Select Merge inherited tables in advanced settings.
- In the Included tables field, specify all child tables to transfer data from.
- Make sure the user has access to the child tables.
To speed up a child table transfer using parallel reading, specify the number of instances and processes in the Runtime environment → Yandex Cloud → Sharded copying parameters transfer parameter.
Insufficient replication slots in a source database
Error message:
Warn(Activate): failed to create a replication slot "<transfer ID>" at source:
failed to create a replication slot:
failed to create a replication slot:
ERROR: all replication slots are in use
(SQLSTATE 53400)
Solution: increase the number of replication slots in the source database (10 by default).
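On a self-managed PostgreSQL source you can check the limit and raise it as sketched below; raising max_replication_slots takes effect only after a server restart. In a managed PostgreSQL cluster, the setting is typically changed through the cluster configuration rather than ALTER SYSTEM.

-- Current limit and the number of slots in use.
SHOW max_replication_slots;
SELECT count(*) FROM pg_replication_slots;
-- Raise the limit (requires a restart to take effect).
ALTER SYSTEM SET max_replication_slots = 20;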
No data transfer after changing a source endpoint
After tables were added to the List of included tables in the source endpoint parameters, the data transfer restarted and then stopped.
Solution:
- Create the tables in the target database manually (a sketch of the table definition follows this list):
  1. Create the new tables with a Primary key and without a Foreign key in the target database.
  2. Add the new tables to the List of included tables in the source endpoint parameters.
  3. Transfer a dump with the historical data to the target database.
  4. If there are errors in the logs, fix them depending on the specific error.
  5. If there are no errors but the logs are empty, contact technical support or your account manager to get a dump of goroutines. This may help resume replication without a restart.
- Deactivate and reactivate the transfer.
- Create a separate transfer of the Snapshot type for the new tables. Deactivating the original transfer is not required.
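A minimal sketch of step 1 in the first option above: a hypothetical new table with a primary key and deliberately no foreign keys.

-- Hypothetical table; match the column definitions of the source table.
CREATE TABLE orders (
    id         bigint PRIMARY KEY,                -- primary key is required
    user_id    bigint NOT NULL,                   -- no FOREIGN KEY constraints
    created_at timestamptz NOT NULL DEFAULT now(),
    amount     numeric(12, 2)
);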
Transfer error when changing a master host
The error occurs in Increment or Snapshot and increment transfers when required Write-Ahead Log (WAL) segments are missing. This happens when the logical replication lag between the current master and a replica exceeds the allowable WAL size on the other hosts, so, when switching from the master to the replica, the replication slot can't sync to the WAL on the new master.
Solution: set a limit in the additional target endpoint parameter Maximum WAL size for the replication slot and reactivate the transfer.
Error when transferring nested transactions
Transfers from PostgreSQL versions below 14 don't support tables with committed transactions that are nested more than 1024 times with replication changes at each nesting level. The nesting depth depends on the number of nested begin; .. commit; statements.
Solution:
- Use PostgreSQL 14 or higher.
- Exclude transactions with this nesting depth from the transfer.
Error transferring tables with deferred constraints
The error arises in Increment or Snapshot and increment transfers because Data Transfer doesn't support updates to tables and transactions with DEFERRABLE constraints. To learn more about deferred constraints, see the PostgreSQL documentation.
Solution: change the constraint type in such tables to IMMEDIATE and activate the transfer again.
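A sketch of finding deferrable constraints and switching one to immediate checking; the table and constraint names are placeholders, and ALTER CONSTRAINT applies to foreign key constraints (PostgreSQL 9.4 and later):

-- List deferrable constraints.
SELECT conname, conrelid::regclass AS table_name
FROM pg_constraint
WHERE condeferrable;

-- Make a constraint non-deferrable so it is checked immediately.
ALTER TABLE orders ALTER CONSTRAINT orders_user_id_fkey NOT DEFERRABLE;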
Couldn't create a replication slot at the activation step
At the start of a transfer, one or more replication slots are created in the source database, and database objects are locked at that point. If an object is locked by another transaction, a lock conflict occurs and the transfer terminates with an error.
Solution:
- Get the PID of the process that competes with the transfer for locks:

  /* Get the PID of the transfer */
  SELECT active_pid
  FROM pg_replication_slots
  WHERE slot_name = '<transfer ID>';

  /* Find the PID of the locking process */
  SELECT pid, pg_blocking_pids(pid) AS blocked_by
  FROM pg_stat_activity
  WHERE cardinality(pg_blocking_pids(pid)) > 0;

         pid      |         blocked_by
  ----------------+-----------------------------
   <transfer PID> | {<locking transaction PID>}
  (1 row)

- Look up the locking query:

  SELECT query, usename
  FROM pg_stat_activity
  WHERE pid = <locking transaction PID>;

- (Optional) Stop the transaction with the command:

  SELECT pg_terminate_backend(<locking transaction PID>);
Excessive WAL size increase
In the PostgreSQL source database, the size of the Write-Ahead Log (WAL) may grow to dozens of gigabytes because of queries that run for over five minutes: such queries block the WAL and prevent it from moving forward and being reset.
You can see the WAL size has gone up when:
- The space used by the source database has increased.
- The Read buffer size chart in Data Transfer monitoring is rising.
Solution:
- Find the sessions with queries running for more than five minutes:

  SELECT now() - xact_start, query, pid
  FROM pg_stat_activity
  WHERE (now() - xact_start) > '5 minutes'::interval
    AND state != 'idle'
  ORDER BY 1 DESC;

- Terminate the sessions found (see the sketch below). Make sure to avoid such queries going forward.
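A sketch of terminating a session found by the previous query, using its pid:

-- Try to cancel only the running query first.
SELECT pg_cancel_backend(<pid>);
-- If that is not enough, terminate the whole session.
SELECT pg_terminate_backend(<pid>);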
Yandex Managed Service for YDB
Transfer failure
An Increment or Snapshot and increment transfer is interrupted with an error.
Error message:
/Ydb.PersQueue.V1.PersQueueService/AddReadRule failed: OVERLOADED
Transfers are aborted due to the cloud quota on the number of operations with Managed Service for YDB.
Solution:
- Increase the Number of schema transactions per minute property in the Managed Service for YDB quotas for the cloud with the required database.
- Restart the transfer.
Yandex Data Streams
Transfer failure
An Increment or Snapshot and increment transfer is interrupted with an error.
Error message:
/Ydb.PersQueue.V1.PersQueueService/AddReadRule failed: OVERLOADED
Transfers are aborted due to the cloud quota on the number of operations with Managed Service for YDB.
Solution:
- Increase the Number of schema transactions per minute property in the Managed Service for YDB quotas for the cloud with the required database.
- Restart the transfer.