Delivering data from Greenplum® to PostgreSQL
You can migrate a database from Greenplum® to the PostgreSQL cluster using Yandex Data Transfer.
To transfer a database from Greenplum® to PostgreSQL:
If you no longer need these resources, delete them.
Before you begin
We'll create all the required resources for the example in Yandex Cloud. Prepare the infrastructure:
-
Create a source Yandex Managed Service for Greenplum® cluster of any suitable configuration with the
gp-user
admin username and hosts in the public domain. -
Create a Yandex Managed Service for PostgreSQL target cluster in any applicable configuration with publicly available hosts. When creating a cluster, specify:
- Username:
pg-user
. - Database name:
db1
.
- Username:
-
Make sure that the cluster's security groups have been set up correctly and allow connecting to them:
-
If you don't have Terraform, install and configure it.
-
Download the file with provider settings. Place it in a separate working directory and specify the parameter values.
-
Download the greenplum-postgresql.tf configuration file to the same working directory.
This file describes:
- Networks and subnets for hosting the clusters.
- Security groups to connect to clusters.
- Managed Service for Greenplum® source cluster.
- Managed Service for PostgreSQL target cluster.
- Target endpoint.
- Transfer.
-
In the file
greenplum-postgresql.tf
, specify the Greenplum® and PostgreSQL administrator passwords. -
Run the command
terraform init
in the directory with the configuration file. This command initializes the provider specified in the configuration files and enables you to use the provider resources and data sources. -
Make sure the Terraform configuration files are correct using the command:
terraform validate
If there are errors in the configuration files, Terraform will point to them.
-
Create the required infrastructure:
-
Run the command to view planned changes:
terraform plan
If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
-
If you are happy with the planned changes, apply them:
-
Run the command:
terraform apply
-
Confirm the update of resources.
-
Wait for the operation to complete.
-
All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console.
-
Set up the transfer
-
Create a source endpoint of the
Greenplum®
type and specify the cluster connection parameters in it:- Connection type:
Managed Service for Greenplum cluster
. - Cluster:
<Greenplum® source cluster name>
from the drop-down list. - Database:
postgres
. - User:
gp-user
. - Password:
<user password>
. - Auxiliary object schema:
public
.
- Connection type:
-
Create a target endpoint and a transfer.
-
Create a target endpoint of the
PostgreSQL
type and specify the cluster connection parameters in it.- Connection type:
Managed Service for PostgreSQL cluster
. - Cluster:
<PostgreSQL target cluster name>
from the drop-down list. - Database:
db1
. - Password:
pg-user
. - Password:
<user password>
.
- Connection type:
-
Create a transfer of the Snapshot type that will use the created endpoints.
Replication is not available for this endpoint pair, but you can set up regular copying when creating a transfer. To do this, in the Transfer parameters Copy field, select Regular and specify the copy interval. This will activate a transfer automatically after the specified time interval.
Warning
Before configuring regular copy, please make sure that the target endpoint parameters include either a
DROP
or aTRUNCATE
cleanup policy. Otherwise, data on the target will be duplicated.
-
Specify the variables in the file
greenplum-postgresql.tf
:gp_source_endpoint_id
: ID of the source endpoint.transfer_enabled
: Set1
to enable transfer creation.
-
Make sure the Terraform configuration files are correct using the command:
terraform validate
If there are errors in the configuration files, Terraform will point to them.
-
Create the required infrastructure:
-
Run the command to view planned changes:
terraform plan
If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
-
If you are happy with the planned changes, apply them:
-
Run the command:
terraform apply
-
Confirm the update of resources.
-
Wait for the operation to complete.
-
-
Activate the transfer
-
Connect to the Managed Service for Greenplum® cluster, create a table called
x_tab
, and populate it with data:CREATE TABLE x_tab ( id NUMERIC, name CHARACTER(5) ); CREATE INDEX ON x_tab (id); INSERT INTO x_tab (id, name) VALUES (40, 'User1'), (41, 'User2'), (42, 'User3'), (43, 'User4'), (44, 'User5');
-
Activate the transfer and wait for its status to change to Completed.
-
To verify that the data has transferred correctly, connect to the Managed Service for PostgreSQL target cluster and make sure that the columns of the
x_tab
table in thedb1
database match those of the source databasex_tab
table:SELECT id, name FROM db1.public.x_tab;
┌─id─┬─name──┐ │ 40 │ User1 │ │ 41 │ User2 │ │ 42 │ User3 │ │ 43 │ User4 │ │ 44 │ User5 │ └────┴───────┘
Check the copy function upon re-activation
-
In the target endpoint parameters, select either a
DROP
or aTRUNCATE
cleanup policy. -
Delete the row with the
41
ID and edit the row with the42
ID in thex_tab
table:DELETE FROM x_tab WHERE id = 41; UPDATE x_tab SET name = 'Key3' WHERE id = 42;
-
Re-activate the transfer and wait for its status to change to Completed.
-
Make sure that you see the changes in the
x_tab
table on the PostgreSQL target:SELECT id, name FROM db1.public.x_tab;
┌─id─┬─name──┐ │ 42 │ Key3 │ │ 40 │ User1 │ │ 43 │ User4 │ │ 44 │ User5 │ └────┴───────┘
Delete the resources you created
If you no longer need these resources, delete them:
-
Make sure that the transfer's status is Completed and delete it.
-
Delete the clusters:
Manually-
Using Terraform
If you created your resources using Terraform:
-
In the terminal window, change to the directory containing the infrastructure plan.
-
Delete the
greenplum-postgresql.tf
configuration file. -
Make sure the Terraform configuration files are correct using the command:
terraform validate
If there are errors in the configuration files, Terraform will point to them.
-
Confirm the update of resources.
-
Run the command to view planned changes:
terraform plan
If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
-
If you are happy with the planned changes, apply them:
-
Run the command:
terraform apply
-
Confirm the update of resources.
-
Wait for the operation to complete.
-
All the resources described in the
greenplum-postgresql.tf
configuration file will be deleted. -
-