Taking Leaves Offline without Cluster Downtime

Occasionally, hosts need to be taken offline for maintenance (upgrading memory, etc.). This can present a challenge if these hosts are home to one or more SingleStore leaf nodes.

By following the steps below, you can detach SingleStore leaf nodes from a SingleStore cluster, take the host offline for maintenance, and attach the leaves back to the cluster following maintenance. This can all be done without downtime to the SingleStore cluster.

Assumptions:

  • The steps below assume the host address (IP address or hostname, whichever the nodes on the host are identified by) will not change during maintenance.

  • The steps below assume the cluster is configured for High Availability (redundancy 2). If both leaves in a paired group of leaves are detached from the cluster, the cluster will become unavailable and downtime will be experienced. For this reason, detach leaves from only one availability group at a time.

Step 1: Check for long-running queries

Before removing leaf nodes, confirm that there are no long-running queries in the cluster by running the following SQL statement:

SELECT * FROM information_schema.PROCESSLIST WHERE COMMAND = 'QUERY' AND STATE = 'executing';

Step 2: Ensure all database partitions are balanced

See Understanding Orphaned Partitions to verify whether any orphaned partitions exist in the cluster. If there are, that topic explains how to resolve them.
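
As an additional check, you can perform a rebalance dry run. The statement below is a minimal sketch (mydb is a placeholder database name): EXPLAIN REBALANCE PARTITIONS reports the partition moves a rebalance would perform, and an empty result set indicates the database is already balanced.

EXPLAIN REBALANCE PARTITIONS ON mydb;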

Step 3: Confirm the leaf node you want to take offline has an online paired leaf on a different host

To confirm this, run sdb-admin show-leaves and check the results. For example, suppose you want to take offline a leaf node running on 172.18.1.5. In the output below, this node’s pair host is 172.18.1.6, and the paired leaf is online:

sdb-admin show-leaves
✓ Successfully ran 'memsqlctl show-leaves'
+------------+------+--------------------+------------+-----------+--------+--------------------+--------------------------------+
|    Host    | Port | Availability Group | Pair Host  | Pair Port | State  | Opened Connections | Average Roundtrip Latency (ms) |
+------------+------+--------------------+------------+-----------+--------+--------------------+--------------------------------+
| 172.18.1.5 | 3306 | 1                  | 172.18.1.6 | 3306      | online | 1                  | 1.538                          |
| 172.18.1.5 | 3307 | 1                  | 172.18.1.6 | 3307      | online | 2                  | 0.765                          |
| 172.18.1.6 | 3306 | 2                  | 172.18.1.5 | 3306      | online | 2                  | 0.898                          |
| 172.18.1.6 | 3307 | 2                  | 172.18.1.5 | 3307      | online | 2                  | 1.491                          |
+------------+------+--------------------+------------+-----------+--------+--------------------+--------------------------------+

Step 4: Detach the SingleStore leaf or aggregator node(s) from the host to be taken offline for maintenance

A SingleStore leaf node is detached from a SingleStore cluster by running the following statement on the master aggregator node:

DETACH LEAF 'host':port;

For more information on this command, see DETACH LEAF.

Note: If both leaves in a paired group of leaves are detached, the cluster will become unavailable and downtime will be experienced. For this reason, detach leaves from only one availability group at a time.
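
Continuing the example from Step 3, taking host 172.18.1.5 offline means detaching both leaves running on it. Both belong to availability group 1, so their pairs on 172.18.1.6 stay online to serve their partitions:

DETACH LEAF '172.18.1.5':3306;
DETACH LEAF '172.18.1.5':3307;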

For host machines running aggregator nodes, use the following syntax to detach an aggregator from a host:

REMOVE AGGREGATOR 'host':port;

For more information on this command, see REMOVE AGGREGATOR.

Step 5: Stop the SingleStore node(s)

Stop the SingleStore node(s) (leaves and aggregators) residing on all hosts that will be taken offline for maintenance.

sdb-admin stop-node --memsql-id <MemSQL_ID>

For more information on this command, see sdb-admin stop-node.
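
If you need to look up a node's MemSQL ID, you can list the cluster's nodes; the output includes each node's MemSQL ID along with its role, host, and port (exact columns may vary by Tools version):

sdb-admin list-nodes

Run sdb-admin stop-node once per leaf on the host being taken offline, passing each node's MemSQL ID.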

Step 6: Take the host offline, perform maintenance, bring the host back online, and confirm SingleStore is running

It is now safe to power down the host and perform maintenance. After performing maintenance, bring the host back online.

Step 7: Start the SingleStore node(s)

Start the SingleStore node(s) (leaves and aggregators) residing on all hosts that were previously taken offline for maintenance and are now back online.

sdb-admin start-node --memsql-id <MemSQL_ID>
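
Run the command once per node, using the MemSQL IDs recorded in Step 5. Recent versions of the Tools also accept an --all flag (an assumption worth verifying against your installed version), which starts every node and is a convenient shortcut when only the maintenance host's nodes are stopped:

sdb-admin start-node --all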

Step 8: Attach the SingleStore leaf or aggregator node(s) back to the host that was taken offline for maintenance

Once maintenance is complete, the host is back online, and SingleStore is running, attach the SingleStore leaf or aggregator node(s) back to the cluster.

A SingleStore leaf node is attached to a SingleStore cluster by using the following command from the master aggregator node:

ATTACH LEAF 'host':port NO REBALANCE;

For more information on this command, see ATTACH LEAF.
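
Continuing the example, attach both leaves on 172.18.1.5 back to the cluster. The NO REBALANCE clause defers data movement so that a single rebalance can be run once all leaves are reattached (Step 9):

ATTACH LEAF '172.18.1.5':3306 NO REBALANCE;
ATTACH LEAF '172.18.1.5':3307 NO REBALANCE;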

Note: If you took multiple leaf nodes offline and are attaching them back to the cluster, you can use ATTACH LEAF ALL to attach all detached leaves with one command:

ATTACH LEAF ALL;

To attach an aggregator node back to a cluster, use the following syntax:

ADD AGGREGATOR user:'password'@'host':port;

For more information on this command, see ADD AGGREGATOR.

Step 9: Rebalance cluster partitions

After attaching the SingleStore leaf node(s) back to the cluster, run the following command on your SingleStore master aggregator node:

REBALANCE ALL DATABASES;

For more information on this command, see the REBALANCE ALL DATABASES topic.

Running REBALANCE ALL DATABASES redistributes data across your cluster. In doing so, a portion of the data in your cluster is relocated to the SingleStore nodes that were attached in Step 8.
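
Rebalancing large databases can take time. As a sketch for monitoring progress (db_name is a placeholder), SHOW REBALANCE STATUS reports the status of the rebalance operations running against a database:

SHOW REBALANCE STATUS ON db_name;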

For answers to common questions, see Taking Leaf Nodes Offline without Cluster Downtime FAQs.
