Migrating databases to Managed Service for PostgreSQL
- Logical replication
- Restore a database from a dump
To migrate your database to Managed Service for PostgreSQL, you need to transfer the data directly, close the old database for writing, and then transfer the load to the database cluster in Yandex.Cloud.
There are two ways to migrate data to a Managed Service for PostgreSQL cluster:
- Logical replication is the recommended method (subscriptions). The subscription mechanism that logical replication is built on lets data be transferred to the Managed Service for PostgreSQL cluster with minimal downtime.
- Restoring a database from a dump using
Below, the DBMS server you are transferring data from is called the source server, and the Managed Service for PostgreSQL cluster that you are migrating to is the destination server.
The instructions assume that you are familiar with basic Linux administration.
Logical replication is supported as of PostgreSQL ver. 10. In addition to migrating data between the same DBMS versions, logical replication allows you to migrate from PostgreSQL version 10 to 11: after configuring replication from the source server with PostgreSQL 10 to the destination server with PostgreSQL 11, just follow the migration steps.
In Managed Service for PostgreSQL clusters, subscriptions can be used by the database owner (a user created simultaneously with the cluster) and users with the
mdb.admin role for the cluster.
- Configure the source server.
- Export the database schema on the source.
- Create a destination cluster and restore the database schema.
- Create a PostgreSQL publication and subscription.
- Transfer the PostgreSQL sequence after replication.
- Disable the replication and transfer the load.
Configure the server with the data source
Specify the necessary SSL and WAL settings in the
postgresql.conffile. In Debian and Ubuntu, the default path to this file is
We recommend using SSL for migrating data: this will not only help encrypt data, but also compress it. For more information, see SSL Support and Database Connection Control Functions in the PostgreSQL documentation.
To enable SSL, set the desired value in the configuration:
ssl = on # on, off
The setting can be changed in
postgresql.conf. Find the line with the
wal_levelsetting, comment it out if necessary, and set the value as
wal_level = logical # minimal, replica, or logical
Configure host authentication on the source. To do this, add the cluster hosts in Yandex.Cloud to the file
pg_hba.conf(on Debian and Ubuntu distributions, the default path is
For this the lines that will allow incoming connections to the database from the specified hosts should be added.
If you use SSL:
hostssl all all <host addres> md5 hostssl replication all <host address> md5
If you don't use SSL:
host all all <host address> md5 host replication all <host address> md5
If the source server has a firewall, allow incoming connections from the Managed Service for PostgreSQL cluster hosts. For example, for Ubuntu 18:
sudo ufw allow from <host address> to any port 5432
Restart the database server to apply all these settings:
sudo systemctl restart postgresql
Check the PostgreSQL status after restarting:
sudo systemctl status postgresql
Export the database schema on the source
pg_dump utility, create a file with the database schema to be applied in the Managed Service for PostgreSQL cluster.
pg_dump -h <DBMS server address> - U <username> -p <port> --schema-only --no-privileges --no-subscriptions-d <database name> -Fd-f/tmp/db_dump
This command excludes all data associated with privileges and roles in order to avoid conflicts with the database settings in Yandex.Cloud. If your database requires additional users, create them.
Create a Managed Service for PostgreSQL cluster and restore the database schema
If you don't have a PostgreSQL cluster in the Cloud yet, create a Managed Service for PostgreSQL cluster. When creating a cluster, specify the same database name that's on the source server.
Restore the schema in the new cluster:
pg_restore -Fd -v --single-transaction -s --no-privileges \ -h <source address> \ -u <username> \ -p 6432 \ -d <database name> /tmp/db_dump
Create a publication and subscription
For logical replication to work, you need to define a publication (a group of logically replicated tables) on the source server and a subscription (a description of a connection to another database) on the destination server.
On the source server, create a publication for all database tables. When migrating multiple databases, you need to create a separate publication for each of them.
You need superuser rights to create publications to all tables, but not to transfer the selected tables. For more information about creating publications, see the PostgreSQL documentation.
CREATE PUBLICATION p_data_migration FOR ALL TABLES;
On the Managed Service for PostgreSQL cluster host, create a subscription with a connection string to the publication. For more information about creating subscriptions, see the PostgreSQL documentation.
Request with SSL enabled:
CREATE SUBSCRIPTION s_data_migration CONNECTION 'host=<source server address> port=<port> user=<username> sslmode=verify-full dbname=<database name>' PUBLICATION p_data_migration;
CREATE SUBSCRIPTION s_data_migration CONNECTION 'host=<source server address> port=<port> user=<username> sslmode=disable dbname=<database name>' PUBLICATION p_data_migration;
You can monitor replication status from folders
pg_subscription_rel. You can get the general replication status of the destination server using
pg_stat_subscriptionand on the source server using
select * from pg_subscription_rel;
First of all, you should monitor replication status on the destination server with the
srsubstatefield means that synchronization has ended and the databases are ready to be replicated.
Migrate PostgreSQL sequences after replication
To fully complete synchronization of the source and destination server, disable writing new data on the source server and transfer PostgreSQL sequences to the Managed Service for PostgreSQL cluster:
Export PostgreSQL sequences on the source:
pg_dump -h <DBMS server address> -U <username> -p <port> -d <database name> \ --data-only -t '*.*_seq' > /tmp/seq-data.sql
Pay attention to the pattern used: if the migrating database has sequences that don't match the
*.*_seqpattern, you have to enter a different pattern to upload them. For more information about patterns, see the PostgreSQL documentation.
Restore sequences on the Managed Service for PostgreSQL host:
psql -h <DBMS server address> -U <username> -p 6432 -d <database name> \ < /tmp/seq-data.sql
Disable replication and transfer the load
When replication is complete and you move the sequences, remove the subscription on the destination server (in the Managed Service for PostgreSQL cluster):
DROP SUBSCRIPTION s_data_migration;
Afterwards, the load can be transferred to the destination server. Since transferring sequences is a relatively quick process and easy automated, migrating to Managed Service for PostgreSQL is possible with minimal downtime.
Restore a database from a dump
To transfer data from an existing PostgreSQL database to Managed Service for PostgreSQL, use
pg_restore: create a dump of a working database and restore it in the Yandex.Cloud PostgreSQL cluster.
Before trying to import your data, check whether the DBMS versions of the existing database and your cluster in Yandex.Cloud match. If not, you won't be able to restore the created dump. To migrate from version 10 to 11, you can use logical replication.
- Create a dump of the migrated database.
- Create a virtual machine in Yandex.Cloud and upload the database dump to it (optional).
- Create a Managed Service for PostgreSQL cluster.
- Restore data from the dump to the cluster.
Create a database dump
Use pg_dump to create a database dump.
Before creating a dump, it is recommended to switch the database to
read-onlyto avoid losing data that might appear while the dump is created. The database dump itself is created using the following command:
$ pg_dump -h <DBMS server address> -U <username> -Fd -d <DB name> -f ~/db_dump
To speed up the process, you can start dumping with multiple processor cores. To do this, set the
-jflag with the number equal to the number of cores available to the DBMS:
$ pg_dump -h <DBMS server address> -U <username> -j 4 -Fd -d <database name> ~/db_dump
Archive the dump:
$ tar -cvzf db_dump.tar.gz ~/db_dump
Fore more information about
pg_dump, see the PostgreSQL documentation.
(optional) Create a virtual machine in Yandex.Cloud and upload a dump to it
Transfer data to an intermediate virtual machine in Compute Cloud if:
- Your Managed Service for PostgreSQL cluster is not accessible from the internet.
- Your hardware or connection to the cluster in Yandex.Cloud is not very reliable.
The required amount of RAM and processor cores depends on the amount of data to migrate and the required migration speed.
To prepare the virtual machine to restore the dump:
In the management console, create a new virtual machine from the Ubuntu 18.04 image. The VM parameters depend on the size of the database you want to migrate. The minimum configuration (1 core, 2 GB RAM, 10 GB disk space) should be sufficient to migrate a database that's up to 1 GB in size. The bigger the migrating database, the more RAM and the bigger the storage space you need (at least twice as large as the size of the database).
The virtual machine must be in the same network and availability zone as the PostgreSQL cluster. Additionally, the VM must be assigned an external IP address so that you can load the dump from outside Yandex.Cloud.
Install the PostgreSQL client and additional utilities for working with the DBMS:
$ sudo apt install postgresql-client-common $ sudo apt install postgresql-client-10 # For PostgreSQL 10 $ sudo apt install postgresql-client-11 # For PostgreSQL 11
Move the DB dump to the VM. For example, you can use the
scp ~/db_dump.tar.gz <VM username>@<VM public address>:/tmp/db_dump.tar.gz
Unpack the dump:
tar -xzf /tmp/db_dump.tar.gz
Create a Managed Service for PostgreSQL cluster
Make sure that the computing power and storage size of the cluster are appropriate for the environment, where the existing databases are deployed, and create a cluster.
Restore data in the new environment
Use the pg_restore utility to restore your DB dump.
The version of
pg_restore must match the
pg_dump version, and the major version must be at least as high as on the DB where the dump is deployed. For example,
pg_restore 10 should be used with PostgreSQL 10, but if you need to deploy the dump for PostgreSQL 11, use
pg_restore 11 instead.
Best practice is to restore data with the
--single-transaction flag to avoid an inconsistent state of the database if an error occurs:
pg_restore -Fd \ -v \ -h <pgsql_host_address> \ -U <username> -d <database_name> \ -p 6432 \ /tmp/db_dump \ --single-transaction \ --no-privileges
If you only need to restore a single schema, add the
-n <schema name> flag.
Fore more information about
pg_restore, see the PostgreSQL documentation.