Maintenance in Managed Service for Greenplum®
There are two classes of maintenance operations in Managed Service for Greenplum®:
- Non-routine operations for cluster maintenance
- Routine operations for database maintenance
Non-routine operations
Non-routine operations involve cluster software updates and host recovery after failures. They may change cluster settings and restart the cluster. During these operations, current queries are aborted and incomplete transactions are canceled.
Non-routine operations related to updates are performed during a maintenance window in a specified order. These operations include:
- Installing minor Greenplum® updates. This results in DBMS restart.
- Installing PXF updates. This results in PXF restart.
- Restarting cluster hosts required for cloud infrastructure scheduled maintenance (replacing failed components, installing system updates, performing scheduled hardware maintenance, etc.).
- Installing security updates on cluster hosts. This results in host restart.
Non-routine operations related to cluster recovery can be performed at any time, as required. These operations include:
- Recovering data after a physical host or non-replicated disk fails in the cloud infrastructure.
- Segment rebalancing: resetting preferred segment roles after a host or its segments are restored.
Maintenance window
You can set the preferred maintenance time when creating a cluster or updating its settings:
- The arbitrary option (default) allows performing maintenance at any time.
- The by schedule option allows setting the preferred maintenance start day and time (UTC). For example, you can choose a time when the cluster is least loaded.
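To illustrate the by schedule option, here is a minimal sketch that computes the next preferred maintenance start from a weekday and an hour in UTC. The function name `next_window_start` and its parameters are hypothetical and are not part of the service API; the sketch only shows how such a setting could be interpreted.

```python
from datetime import datetime, timedelta, timezone


def next_window_start(now, weekday, hour):
    """Return the next start of a weekly window (hypothetical helper).

    weekday: 0 = Monday ... 6 = Sunday; hour: 0..23. Both are in UTC.
    """
    now = now.astimezone(timezone.utc)
    # Candidate start on the requested weekday of the current week cycle.
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    # If that moment has already passed, move to the following week.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


# Wednesday, 3 January 2024, 12:00 UTC
now = datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc)
monday_3am = next_window_start(now, weekday=0, hour=3)
wednesday_3pm = next_window_start(now, weekday=2, hour=15)
wednesday_3am = next_window_start(now, weekday=2, hour=3)
```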
Maintenance procedure
Maintenance related to software updates is performed as follows:
- Segment hosts undergo maintenance one by one. The hosts are queued randomly. If a segment host needs to be restarted during maintenance, it becomes unavailable while being restarted.
- Maintenance is performed on the STANDBY master host. If it needs to be restarted during maintenance, it becomes unavailable while being restarted.
- Maintenance is performed on the PRIMARY master host. If it is restarted during maintenance and becomes unavailable, the standby master host will take its role. If you access a cluster using the FQDN of the primary master host, the cluster may become unavailable. To make your application continuously available, access the cluster using a special FQDN always pointing to the primary master host.
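The order described above can be sketched as follows. This is only an illustration of the sequence (segment hosts in random order, then the standby master, then the primary master); the `Host` type, the role names, and the host names are assumptions made for the example and do not reflect the service's internal implementation.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    role: str  # hypothetical labels: "segment", "standby_master", "primary_master"


def maintenance_order(hosts, rng=None):
    """Return hosts in the order in which they undergo maintenance."""
    rng = rng or random.Random()
    segments = [h for h in hosts if h.role == "segment"]
    rng.shuffle(segments)  # segment hosts are queued randomly
    standby = [h for h in hosts if h.role == "standby_master"]
    primary = [h for h in hosts if h.role == "primary_master"]
    # Segment hosts go first, one by one, then the standby master, then the primary.
    return segments + standby + primary


hosts = [
    Host("master-1", "primary_master"),
    Host("master-2", "standby_master"),
    Host("segment-1", "segment"),
    Host("segment-2", "segment"),
    Host("segment-3", "segment"),
]
order = maintenance_order(hosts, random.Random(0))
```

The primary master host is handled last, so the standby master host has already been through maintenance by the time it may need to take over the primary role.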
Routine operations
Routine operations are required to ensure proper database performance. They are run regularly on a certain schedule and do not abort current queries. These operations include:
- System catalog table VACUUM. This operation is run three times a day.
- Custom table VACUUM.
- Statistics collection.
- Backup.
Custom table VACUUM
Custom tables are vacuumed daily. Databases are handled concurrently in two threads. In each database, tables on which VACUUM has not been run yet are handled first. Then the remaining tables are handled, starting with the one on which VACUUM has not been run the longest.
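The ordering rule within a single database can be sketched as a simple sort. The `TableInfo` type and the table names below are hypothetical; the sketch assumes that the time of the last VACUUM is known for each table and does not model handling two databases concurrently.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class TableInfo(NamedTuple):
    name: str
    last_vacuum: Optional[datetime]  # None means VACUUM has never been run


def vacuum_order(tables):
    """Never-vacuumed tables first, then the rest from oldest VACUUM to newest."""
    never = [t for t in tables if t.last_vacuum is None]
    vacuumed = sorted(
        (t for t in tables if t.last_vacuum is not None),
        key=lambda t: t.last_vacuum,
    )
    return never + vacuumed


tables = [
    TableInfo("orders", datetime(2024, 5, 2)),
    TableInfo("events", None),
    TableInfo("users", datetime(2024, 4, 28)),
    TableInfo("logs", None),
]
ordered_names = [t.name for t in vacuum_order(tables)]
```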
Two vacuuming modes are supported:
- Sequential: Tables are handled one by one. The total execution time is limited by a soft timeout: when it is reached, the vacuuming of the current table is completed, and then the process is terminated.
- Concurrent: Tables are handled in two threads. This mode uses a hard timeout: when it is reached, all vacuuming processes are forced to terminate.
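The difference between the soft and hard timeouts can be illustrated with a small simulation. It uses made-up per-table durations instead of running VACUUM, and it assumes that in concurrent mode the next table goes to whichever thread becomes free first; the actual scheduling inside the service is not described here, so treat this as a sketch only.

```python
import heapq


def sequential_soft_timeout(durations, timeout):
    """durations: list of (table, seconds). Return tables that were fully vacuumed."""
    elapsed = 0.0
    done = []
    for table, seconds in durations:
        if elapsed >= timeout:
            break  # soft timeout reached: no new table is started
        elapsed += seconds  # a table that has been started always runs to completion
        done.append(table)
    return done


def concurrent_hard_timeout(durations, timeout, workers=2):
    """Same input, but tables run in `workers` threads and are cut off at the timeout."""
    free_at = [0.0] * workers  # min-heap: time at which each thread becomes free
    heapq.heapify(free_at)
    done = []
    for table, seconds in durations:
        start = heapq.heappop(free_at)
        if start >= timeout:
            break  # every thread is already past the timeout
        finish = start + seconds
        if finish <= timeout:
            done.append(table)
        # Otherwise the process is forcibly terminated when the timeout is reached.
        heapq.heappush(free_at, min(finish, timeout))
    return done


durations = [("a", 50), ("b", 30), ("c", 20), ("d", 40)]
sequential_done = sequential_soft_timeout(durations, timeout=60)
concurrent_done = concurrent_hard_timeout(durations, timeout=60)
```

In this example, sequential mode finishes table "b" at 80 seconds, past the 60-second soft timeout, and then stops. Concurrent mode completes "a", "b", and "c" within the limit, and the vacuuming of "d" is terminated when the hard timeout is reached.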
The default mode is sequential. To switch to concurrent table vacuuming mode, contact technical support.
The VACUUM operation start time and timeout are specified in the settings when creating or updating a cluster.
Statistics collection
Statistics collection (the ANALYZE operation) is performed after the tables are vacuumed. Databases are handled concurrently in two threads. In addition, two threads are run to collect table statistics in each database. As a result, statistics can be collected in four threads.
The analyzedb utility runs the ANALYZE command on every append-optimized (AO) table that has been modified since the utility last collected statistics, as well as on every heap table.
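The selection rule above can be expressed as a short predicate. The `Table` type, its fields, and the table names are hypothetical and serve only to illustrate the rule; this is not how analyzedb tracks table state internally.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    storage: str  # hypothetical labels: "heap" or "ao" (append-optimized)
    modified_since_last_stats: bool


def needs_analyze(table):
    """Heap tables are always analyzed; AO tables only if they have changed."""
    if table.storage == "heap":
        return True
    return table.modified_since_last_stats


tables = [
    Table("customers", "heap", False),
    Table("sales_2023", "ao", False),
    Table("sales_2024", "ao", True),
]
to_analyze = [t.name for t in tables if needs_analyze(t)]
```

Under this rule, an unchanged AO table such as `sales_2023` is skipped, which keeps repeated statistics collection cheap for large, rarely modified tables.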
Statistics collection for each database is limited by a timeout that is specified in the settings when creating or updating a cluster. The total statistics collection time is not limited.
Greenplum® and Greenplum Database® are registered trademarks or trademarks of VMware, Inc. in the United States and/or other countries.