Migrating databases to Managed Service for PostgreSQL
- Logical replication
- Restore a database from a dump
To migrate your database to Managed Service for PostgreSQL, you need to transfer the data directly, close the old database for writing, and then transfer the load to the database cluster in Yandex.Cloud.
There are two ways to migrate data to a Managed Service for PostgreSQL cluster:
- Logical replication is the recommended method (subscriptions). The subscription mechanism that logical replication is built on lets data be transferred to the Managed Service for PostgreSQL cluster with minimal downtime.
- Restoring a database from a dump using
Below, the DBMS server you are migrating data from is called the source server, and the Managed Service for PostgreSQL cluster that you are migrating to is the destination server.
The instructions assume that you are familiar with basic Linux administration.
Logical replication is supported as of PostgreSQL version 10. In addition to migrating data between the same DBMS versions, logical replication allows you to migrate from PostgreSQL version 10 to 11: after configuring replication from the source server with PostgreSQL 10 to the destination server with PostgreSQL 11, just follow the migration steps.
In Managed Service for PostgreSQL clusters, subscriptions can be used by the database owner (a user created simultaneously with the cluster) and users with the
mdb.admin role for the cluster.
- Configure the source server.
- Export the database schema on the source.
- Create a destination cluster and restore the database schema.
- Create a PostgreSQL publication and subscription.
- Move the PostgreSQL sequence after replication.
- Disable replication and transfer the load.
Configure the data source server
Specify the necessary SSL and WAL settings in the
postgresql.conffile. In Debian and Ubuntu, the default path to this file is
We recommend using SSL for migrating data: this will not only help encrypt data, but also compress it. For more information, see SSL Support and Database Connection Control Functions in the PostgreSQL documentation.
To enable SSL, set the desired value in the configuration:
ssl = on # on, off
The setting can be changed in
postgresql.conf. Find the line with the
wal_levelsetting, comment it out if necessary, and set the value as
wal_level = logical # minimal, replica, or logical
Configure host authentication on the source. To do this, add the cluster hosts in Yandex.Cloud to the file
pg_hba.conf(on Debian and Ubuntu distributions, the default path is
For this the lines that will allow incoming connections to the database from the specified hosts should be added.
If you use SSL:
hostssl all all <host addres> md5 hostssl replication all <host address> md5
If you don't use SSL:
host all all <host address> md5 host replication all <host address> md5
If the source server has a firewall, allow incoming connections from the Managed Service for PostgreSQL cluster hosts. For example, for Ubuntu 18:
sudo ufw allow from <host address> to any port 5432
Restart the database server to apply all these settings:
sudo systemctl restart postgresql
Check the PostgreSQL status after restarting:
sudo systemctl status postgresql
Export the DB schema from the source
pg_dump utility, create a file with the database schema to be applied in the Managed Service for PostgreSQL cluster.
pg_dump -h <DBMS server address> - U <username> -p <port> --schema-only --no-privileges --no-subscriptions-d <database name> -Fd-f/tmp/db_dump
This command excludes all data associated with privileges and roles in order to avoid conflicts with the database settings in Yandex.Cloud. If your database requires additional users, create them.
Create a Managed Service for PostgreSQL cluster and restore the database schema
If you don't have a PostgreSQL cluster in Yandex.Cloud, create one. When creating a cluster, specify the same database name that's on the source server.
Restore the schema in the new cluster:
pg_restore -Fd -v --single-transaction -s --no-privileges \ -h <source address> \ -U <username> \ -p 6432 \ -d <database name> /tmp/db_dump
Create a publication and subscription
For logical replication to work, you need to define a publication (a group of logically replicated tables) on the source server and a subscription (a description of a connection to another database) on the receiving server.
On the source server, create a publication for all database tables. When migrating multiple databases, you need to create a separate publication for each of them.
You need superuser rights to create publications to all tables, but not to transfer the selected tables. For more information about creating publications, see the PostgreSQL documentation.
CREATE PUBLICATION p_data_migration FOR ALL TABLES;
On the Managed Service for PostgreSQL cluster host, create a subscription with a connection string to the publication. For more information about creating subscriptions, see the PostgreSQL documentation.
Request with SSL enabled:
CREATE SUBSCRIPTION s_data_migration CONNECTION 'host=<source server address> port=<port> user=<username> sslmode=verify-full dbname=<database name>' PUBLICATION p_data_migration;
CREATE SUBSCRIPTION s_data_migration CONNECTION 'host=<source server address> port=<port> user=<username> sslmode=disable dbname=<database name>' PUBLICATION p_data_migration;
You can monitor replication status from folders
pg_subscription_rel. You can get the general replication status of the receiving server using
pg_stat_subscriptionand on the source server using
select * from pg_subscription_rel;
First of all, you should monitor replication status on the receiving server with the
srsubstatefield means that synchronization has ended and the databases are ready to be replicated.
Migrate PostgreSQL sequences after replication
To fully complete synchronization between the source and the destination, disable writing new data on the source server and transfer the PostgreSQL sequences to the Managed Service for PostgreSQL cluster:
Export PostgreSQL sequences from the source:
pg_dump -h <DBMS server address> -U <username> -p <port> -d <database name> \ --data-only -t '*.*_seq' > /tmp/seq-data.sql
Pay attention to the pattern used: if the database you're migrating has sequences that don't match the
*.*_seqpattern, you have to enter a different pattern to upload them. For more information about patterns, see the PostgreSQL documentation.
Restore sequences on the Managed Service for PostgreSQL host:
psql -h <DBMS server address> -U <username> -p 6432 -d <database name> \ < /tmp/seq-data.sql
Disable replication and transfer the load
When replication is complete and you move the sequences, remove the subscription on the destination server (in the Managed Service for PostgreSQL cluster):
DROP SUBSCRIPTION s_data_migration;
Afterwards, the load can be transferred to the receiving server. Since transferring sequences is a relatively quick and easily automated process, migrating to Managed Service for PostgreSQL is possible with minimal downtime.
Restore a database from a dump
To migrate data from an existing PostgreSQL database to Managed Service for PostgreSQL, use
pg_restore: create a dump of a working database and restore it to the Yandex.Cloud PostgreSQL cluster.
Before trying to import your data, check whether the DBMS versions of the existing database and your cluster in Yandex.Cloud match. If not, you won't be able to restore the created dump. To migrate from version 10 to 11, you can use logical replication.
- Create a dump of the database you want to migrate.
- Create a virtual machine in Yandex.Cloud and upload the database dump to it (optional).
- Create a cluster Managed Service for PostgreSQL.
- Restore data from the dump to the cluster.
Create a database dump
Use pg_dump to create a database dump.
Before creating a dump, we recommend switching the database to
read-onlyto avoid losing data that might appear while the dump is created. The database dump itself is created using the following command:
$ pg_dump -h <DBMS server address> -U <username> -Fd -d <DB name> -f ~/db_dump
To speed up the process, you can start dumping with multiple processor cores. To do this, set the
-jflag with the number equal to the number of cores available to the DBMS:
$ pg_dump -h <DBMS server address> -U <username> -j 4 -Fd -d <database name> ~/db_dump
Archive the dump:
$ tar -cvzf db_dump.tar.gz ~/db_dump
Fore more information about
pg_dump, see the PostgreSQL documentation.
(optional) Create a virtual machine in Yandex.Cloud and upload a dump to it
Transfer your data to an intermediate VM in Compute Cloud if:
- Your Managed Service for PostgreSQL cluster is not accessible from the internet.
- Your hardware or connection to the cluster in Yandex.Cloud is not very reliable.
The required amount of RAM and processor cores depends on the amount of data to migrate and the required migration speed.
To prepare the virtual machine to restore the dump:
In the management console, create a new virtual machine from the Ubuntu 18.04 image. The VM parameters depend on the size of the database you want to migrate. The minimum configuration (1 core, 2 GB RAM, 10 GB disk space) should be sufficient to migrate a database that's up to 1 GB in size. The bigger the database being migrated, the more RAM and storage space you need (at least twice as large as the size of the database).
The virtual machine must be in the same network and availability zone as the PostgreSQL cluster. Additionally, the VM must be assigned an external IP address so that you can load the dump from outside Yandex.Cloud.
Install the PostgreSQL client and additional utilities for working with the DBMS:
$ sudo apt install postgresql-client-common $ sudo apt install postgresql-client-10 # For PostgreSQL 10 $ sudo apt install postgresql-client-11 # For PostgreSQL 11
Move the DB dump to the VM. For example, you can use
scp ~/db_dump.tar.gz <VM username>@<VM public address>:/tmp/db_dump.tar.gz
Unpack the dump:
tar -xzf /tmp/db_dump.tar.gz
Create a cluster Managed Service for PostgreSQL
Make sure that the computing power and storage size of the cluster are appropriate for the environment,where the existing databases are deployed, and create a cluster.
Restore data in the new environment
Use the pg_restore utility to restore your DB dump.
The version of
pg_restore must match the
pg_dump version, and the major version must be at least as high as on the DB where the dump is deployed. For example,
pg_restore 10 should be used with PostgreSQL 10, but if you need to deploy the dump for PostgreSQL 11, use
pg_restore 11 instead.
Best practice is to restore data with the
--single-transaction flag to avoid an inconsistent state of the database if an error occurs:
pg_restore -Fd \ -v \ -h <pgsql_host_address> \ -U <username> -d <database_name> \ -p 6432 \ /tmp/db_dump \ --single-transaction \ --no-privileges
If you only need to restore a single schema, add the
-n <schema name> flag.
Fore more information about
pg_restore, see the PostgreSQL documentation.