Taking Leaves Offline with Cluster Downtime

Steps for Offline Maintenance

Step 1: Before performing any offline maintenance, SingleStore recommends that you take a backup as a standard precautionary measure. Refer to Back Up and Restore Data.

Step 2: Run the following commands from the master aggregator:

SHOW LEAVES

SHOW AGGREGATORS

SHOW CLUSTER STATUS

EXPLAIN RESTORE REDUNDANCY

EXPLAIN REBALANCE PARTITIONS

From the output of these commands, confirm that the following are true.

  • All leaves are online.

  • All aggregators are online.

  • There are no partitions with the “orphan” role.

  • No REBALANCE or RESTORE REDUNDANCY commands need to be run, that is, the two EXPLAIN commands report no pending operations.
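The checks above can be sketched as a small Python helper. This is an illustrative sketch, not part of the SingleStore tooling: it assumes the command output has already been fetched into lists of dictionaries, and the column names (`State`, `Role`, `Host`, `Port`, `Database`) and the value `online` are assumptions that should be verified against the actual output on your cluster. The function name `cluster_ready` is hypothetical.

```python
def cluster_ready(leaves, aggregators, cluster_status, pending_ops):
    """Return a list of problems; an empty list means the Step 2 checks pass.

    leaves, aggregators, cluster_status: rows (dicts) from SHOW LEAVES,
    SHOW AGGREGATORS, and SHOW CLUSTER STATUS. Column names are assumed.
    pending_ops: combined rows from the two EXPLAIN commands.
    """
    problems = []
    for row in leaves:
        if str(row.get("State", "")).lower() != "online":
            problems.append(f"leaf {row['Host']}:{row['Port']} is {row.get('State')}")
    for row in aggregators:
        if str(row.get("State", "")).lower() != "online":
            problems.append(f"aggregator {row['Host']}:{row['Port']} is {row.get('State')}")
    for row in cluster_status:
        # Flag any partition whose role mentions "orphan".
        if "orphan" in str(row.get("Role") or "").lower():
            problems.append(f"orphan partition in database {row.get('Database')}")
    if pending_ops:
        problems.append(f"{len(pending_ops)} pending rebalance/redundancy operations")
    return problems
```

Proceed to Step 3 only when the returned list is empty; otherwise resolve each reported problem first.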

Step 3: From the master aggregator, run the SNAPSHOT DATABASE command for each database.

Step 4: From the master aggregator, run the _SYNC_SNAPSHOT <databasename>; command for each database.

This command ensures that a snapshot is triggered on the replica and that the snapshot completes successfully there.
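When a cluster holds many databases, the statements for Steps 3 and 4 can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: the helper name `snapshot_statements` is hypothetical, the database names are supplied by you, and names are restricted to letters, digits, and underscores so that no quoting is needed. It only builds the statement text; running the statements from the master aggregator is left to your SQL client.

```python
import re

# Assumption: only plain identifiers are accepted, so no quoting is required.
_PLAIN_NAME = re.compile(r"[A-Za-z0-9_]+")


def snapshot_statements(databases):
    """Build the Step 3 statements for every database, then the Step 4 ones."""
    for name in databases:
        if not _PLAIN_NAME.fullmatch(name):
            raise ValueError(f"unsupported database name: {name!r}")
    statements = [f"SNAPSHOT DATABASE {name};" for name in databases]
    statements += [f"_SYNC_SNAPSHOT {name};" for name in databases]
    return statements


if __name__ == "__main__":
    # Hypothetical database names, for illustration only.
    for stmt in snapshot_statements(["orders", "analytics"]):
        print(stmt)
```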

Step 5: Use sdb-admin stop-node to stop all the SingleStore nodes in the following sequence: first the master aggregator, then child aggregators, and finally the leaves.

If you cannot stop the nodes in the specified order, run SET GLOBAL leaf_failure_detection=OFF; from the master aggregator before stopping the nodes.
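The stop sequence in Step 5 can be sketched as a sort by node role. This is an illustrative sketch only: the node list, the role labels (`master`, `aggregator`, `leaf`), and the helper name `stop_commands` are assumptions, and the `--memsql-id` and `--yes` flags should be confirmed against `sdb-admin stop-node --help` for your Toolbox version. The function returns command strings and does not execute anything.

```python
# Lower rank stops first: master aggregator, child aggregators, then leaves.
STOP_RANK = {"master": 0, "aggregator": 1, "leaf": 2}


def stop_commands(nodes):
    """Return sdb-admin stop commands in the order given in Step 5.

    nodes: list of dicts with hypothetical keys "id" and "role".
    """
    # sorted() is stable, so nodes with the same role keep their input order.
    ordered = sorted(nodes, key=lambda node: STOP_RANK[node["role"]])
    return [f"sdb-admin stop-node --memsql-id {node['id']} --yes" for node in ordered]
```

Reviewing the generated list before running it makes it easier to confirm that no leaf is stopped ahead of an aggregator.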

Step 6: Perform the maintenance tasks on the hosts as required.

Step 7: Use sdb-admin start-node to start all the SingleStore nodes in the following sequence: first, the leaf nodes, then the child aggregators, and finally the master aggregator. Note that the leaf nodes must be fully recovered before the child aggregators are started.

If you turned leaf_failure_detection off in Step 5, run SET GLOBAL leaf_failure_detection=ON; from the master aggregator to turn it back on.
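The start sequence in Step 7, including the requirement that leaves recover before any aggregator starts, can be sketched as follows. This is a hedged illustration: `start_node` and `leaves_recovered` are hypothetical callables that you would supply (for example, a wrapper around sdb-admin start-node and a check of leaf recovery state), and the role labels are assumptions.

```python
START_ORDER = ("leaf", "aggregator", "master")


def start_phases(nodes):
    """Group node ids into phases: leaves, child aggregators, master aggregator."""
    phases = []
    for role in START_ORDER:
        ids = [node["id"] for node in nodes if node["role"] == role]
        if ids:
            phases.append((role, ids))
    return phases


def run_start(nodes, start_node, leaves_recovered):
    """Start nodes phase by phase, refusing to continue past unrecovered leaves."""
    for role, ids in start_phases(nodes):
        for node_id in ids:
            start_node(node_id)
        # Gate: aggregators must not start until the leaves have fully recovered.
        if role == "leaf" and not leaves_recovered():
            raise RuntimeError("leaves have not fully recovered; aggregators not started")
```

A real script would more likely poll `leaves_recovered` with a timeout instead of failing immediately; raising here keeps the sketch short and makes the gate explicit.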

Last modified: March 8, 2024
