Cluster Downsizing Steps
Downsizing a cluster can make sense for a few reasons:
- You've determined that you can operate the cluster with fewer resources (which could potentially translate to cost savings)
- Your data volume is lower than projected
- Your data volume is higher than projected, which would justify splitting it across two or more clusters (e.g., migrating from one large cluster to a number of smaller clusters)
Regardless of the rationale, downsizing a cluster by removing leaf nodes is a standard, low-risk cluster operation that can be accomplished using SingleStore Tools.
When a leaf node is removed, its partitions are moved to other nodes in the cluster.
Because the database is not repartitioned, this places more memory and processing stress on the remaining hosts, as the total number of partitions will be sub-optimal for the reduced cluster size.
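To gauge how much extra load each remaining leaf will absorb, it can help to look at the current partition placement before you start. Below is a minimal sketch, assuming a hypothetical database named example_db; the information_schema column names are recalled from memory and may vary by version.
-- Run on the Master Aggregator. Lists every partition of the database
-- together with the leaf (host:port) that currently owns it.
SHOW PARTITIONS ON example_db;

-- Rough count of master partitions per leaf, to see how evenly the data is spread.
SELECT HOST, PORT, COUNT(*) AS master_partitions
FROM information_schema.DISTRIBUTED_PARTITIONS
WHERE DATABASE_NAME = 'example_db' AND ROLE = 'Master'
GROUP BY HOST, PORT;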
Notes on Downsizing
Note
If the cluster downsizing is permanent, SingleStore recommends backing up the databases via the mysqldump
command or by following the Back Up and Restore Data guide, then dropping the existing databases and reimporting the data.
- Ensure that the cluster is sized properly so that the remaining nodes have enough memory to take over the data from the removed nodes. Confirm that the host and leaf configurations are the same across the existing leaf nodes. Sizing configuration (RAM, cores, etc.) and OS tuning (ulimit settings, THP settings, etc.) should match to ensure uniform query performance and cluster health. See System Requirements and Recommendations for more details.
- Ensure that there are no long-running queries. If possible, throttle the application workload so that the number of running queries is as low as possible. A best practice is to halt all running workloads. The following statement lists the queries that are currently executing:
SELECT * FROM information_schema.PROCESSLIST WHERE COMMAND = 'QUERY' AND STATE = 'executing';
- Be aware that removing a leaf node can take time, as its partitions need to be copied to other nodes before the node is removed from the cluster. Do not attempt to kill the running operation. Consider addressing any potential data skew issues prior to downsizing the cluster.
- If you are using high availability (HA), you must maintain an even number of nodes and remove paired nodes from the cluster. You can review which leaf nodes are paired by connecting to the database and running SHOW LEAVES, as shown in the sketch after this list.
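These pre-checks can all be run from a SQL client connected to the Master Aggregator. Here is a rough pre-flight sketch; the status and variable names below are recalled from this version family and may differ on yours.
-- Review leaf pairing and availability groups (the SQL equivalent of sdb-admin show-leaves).
SHOW LEAVES;

-- Look for queries that are still executing before starting the removal.
SELECT ID, USER, HOST, TIME, INFO
FROM information_schema.PROCESSLIST
WHERE COMMAND = 'QUERY' AND STATE = 'executing';

-- Rough memory headroom check; run on each leaf and compare against the maximum_memory setting.
SHOW STATUS EXTENDED LIKE 'Total_server_memory';
SHOW VARIABLES LIKE 'maximum_memory';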
Example Cluster
Caution
As the following steps are intended for use with a cluster that is managed by SingleStore Tools, do not run them on a cluster that is managed by MemSQL Ops.
In the example below, a cluster with one Master Aggregator (MA) and six leaf nodes will be reduced to four leaf nodes.
Since the nodes on 10.0.0.219 and 10.0.2.35 are paired, these two nodes will be deleted.
Below is a sample cluster prior to modification.
Note: The following commands are run from the command line on the Master Aggregator host (10.0.3.34 in this example cluster).
sdb-admin list-nodes
+------------+--------+------------+------+---------------+--------------+---------+----------------+--------------------+
| MemSQL ID | Role | Host | Port | Process State | Connectable? | Version | Recovery State | Availability Group |
+------------+--------+------------+------+---------------+--------------+---------+----------------+--------------------+
| E2686600DF | Master | 10.0.3.34 | 3306 | Running | True | 6.8.3 | Online | |
| 79933F2F05 | Leaf | 10.0.0.107 | 3306 | Running | True | 6.8.3 | Online | 1 |
| 7AF82FFA6C | Leaf | 10.0.0.219 | 3306 | Running | True | 6.8.3 | Online | 1 |
| 0EBA8C7CE4 | Leaf | 10.0.1.234 | 3306 | Running | True | 6.8.3 | Online | 2 |
| C2A9AA5641 | Leaf | 10.0.1.67 | 3306 | Running | True | 6.8.3 | Online | 2 |
| CC425D4F59 | Leaf | 10.0.1.75 | 3306 | Running | True | 6.8.3 | Online | 1 |
| D35993788C | Leaf | 10.0.2.35 | 3306 | Running | True | 6.8.3 | Online | 2 |
+------------+--------+------------+------+---------------+--------------+---------+----------------+--------------------+
Remove Leaf Nodes
- Note: Preserve the output of the sdb-admin show-leaves command so that you can refer to it later.
sdb-admin show-leaves
✓ Successfully ran 'sdb-admin show-leaves'
+------------+------+--------------------+------------+-----------+--------+--------------------+--------------------------------+
| Host       | Port | Availability Group | Pair Host  | Pair Port | State  | Opened Connections | Average Roundtrip Latency (ms) |
+------------+------+--------------------+------------+-----------+--------+--------------------+--------------------------------+
| 10.0.0.107 | 3306 | 1                  | 10.0.1.67  | 3306      | online | 9                  | 0.22                           |
| 10.0.1.67  | 3306 | 2                  | 10.0.0.107 | 3306      | online | 9                  | 0.245                          |
| 10.0.1.75  | 3306 | 1                  | 10.0.1.234 | 3306      | online | 9                  | 0.229                          |
| 10.0.1.234 | 3306 | 2                  | 10.0.1.75  | 3306      | online | 9                  | 0.268                          |
| 10.0.0.219 | 3306 | 1                  | 10.0.2.35  | 3306      | online | 9                  | 0.258                          |
| 10.0.2.35  | 3306 | 2                  | 10.0.0.219 | 3306      | online | 9                  | 0.329                          |
+------------+------+--------------------+------------+-----------+--------+--------------------+--------------------------------+
Of particular note are Host, Availability Group (AG), and Pair Host.
When possible, delete paired hosts so that the availability groups won't need to be reconfigured. As 10.0.0.219 and 10.0.2.35 are paired in the example cluster, it's safe to delete their leaf nodes. This maintains the other pairings and therefore has no effect on HA. The availability groups won't need to be reconfigured since this operation does not affect the pairing between 10.0.0.107 and 10.0.1.67, or between 10.0.1.75 and 10.0.1.234.
- Remove the first leaf node.
sdb-admin remove-leaf --host "10.0.0.219" --port 3306
Toolbox will perform the following actions on the local master aggregator on port 3306:
· Run `REMOVE LEAF 10.0.0.219:3306`
Would you like to continue? [y/N]: y
✓ Removed leaf 10.0.0.219:3306 from cluster
- Remove the second leaf node.
sdb-admin remove-leaf --host "10.0.2.35" --port 3306
Toolbox will perform the following actions on the local master aggregator on port 3306:
· Run `REMOVE LEAF 10.0.2.35:3306`
Would you like to continue? [y/N]: y
✓ Removed leaf 10.0.2.35:3306 from cluster
- Confirm that both leaf nodes have been removed.
sdb-admin show-leaves
✓ Successfully ran 'sdb-admin show-leaves'
+------------+------+--------------------+------------+-----------+--------+--------------------+--------------------------------+
| Host       | Port | Availability Group | Pair Host  | Pair Port | State  | Opened Connections | Average Roundtrip Latency (ms) |
+------------+------+--------------------+------------+-----------+--------+--------------------+--------------------------------+
| 10.0.0.107 | 3306 | 1                  | 10.0.1.67  | 3306      | online | 9                  | 0.204                          |
| 10.0.1.67  | 3306 | 2                  | 10.0.0.107 | 3306      | online | 9                  | 0.259                          |
| 10.0.1.75  | 3306 | 1                  | 10.0.1.234 | 3306      | online | 9                  | 0.226                          |
| 10.0.1.234 | 3306 | 2                  | 10.0.1.75  | 3306      | online | 9                  | 0.256                          |
+------------+------+--------------------+------------+-----------+--------+--------------------+--------------------------------+
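If you prefer to verify the removal from SQL as well, the same pairing information is available on the Master Aggregator. A quick sanity check (SHOW DATABASES EXTENDED output varies by version):
-- Only the four remaining leaves should be listed, each still showing a pair host.
SHOW LEAVES;

-- Each user database should still report itself as online after its partitions have moved.
SHOW DATABASES EXTENDED;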
Delete Leaf Nodes
Provided that an aggregator is not running on the same host, you can now delete the nodes using the delete-node command.
In this example, each leaf node resides on its own host, so deleting the entire node is safe.
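One way to confirm that no aggregator is running on a host before deleting its node is to list the aggregators from SQL. A small sketch using the example cluster's addresses:
-- Run on the Master Aggregator. Only 10.0.3.34 (and any child aggregators) should appear;
-- neither 10.0.0.219 nor 10.0.2.35 should be listed before their nodes are deleted.
SHOW AGGREGATORS;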
- Delete the first leaf node.
sdb-admin delete-node --memsql-id 7AF82FFA6C --stop
Toolbox is about to stop and then delete the following nodes:
- On host 10.0.0.219:
  + 7AF82FFA6C
Would you like to continue? [y/N]: y
Operation completed successfully
- Delete the second leaf node.
sdb-admin delete-node --memsql-id D35993788C --stop
Toolbox is about to stop and then delete the following nodes:
- On host 10.0.2.35:
  + D35993788C
Would you like to continue? [y/N]: y
Operation completed successfully
Availability Groups and Removing Leaf Nodes
Note: If you do not remove paired hosts, you will need to reconfigure the availability groups.
For example, if 10.0.0.107 and 10.0.1.75 are removed, which are not paired, the following two nodes will both reside in availability group 2.
sdb-admin show-leaves
✓ Successfully ran 'sdb-admin show-leaves'
+------------+------+--------------------+-----------+-----------+--------+--------------------+--------------------------------+
| Host | Port | Availability Group | Pair Host | Pair Port | State | Opened Connections | Average Roundtrip Latency (ms) |
+------------+------+--------------------+-----------+-----------+--------+--------------------+--------------------------------+
| 10.0.1.67 | 3306 | 2 | null | null | online | 16 | 0.267 |
| 10.0.1.234 | 3306 | 2 | null | null | online | 16 | 0.229 |
+------------+------+--------------------+-----------+-----------+--------+--------------------+--------------------------------+
The following will occur if an attempt is made to create a new database with the cluster in its current state.
memsql -u root -h localhost -p -e 'create database music';
Enter password:
ERROR 1782 (HY000) at line 1: CREATE DATABASE requires that all availability groups are balanced.
The largest group has 2 leaves, while the smallest has 0. The difference must be at most one.
To remedy this, remove these two leaf nodes and add them back into AG 1 and AG 2 so that they form a pair.
sdb-admin remove-leaf --host "10.0.1.67" --port 3306
Toolbox will perform the following actions on the local master aggregator on port 3306:
· Run `REMOVE LEAF 10.0.1.67:3306`
Would you like to continue? [y/N]: y
✓ Removed leaf 10.0.1.67:3306 from cluster
sdb-admin remove-leaf --host "10.0.1.234" --port 3306Toolbox will perform the following actions on the local master aggregator on port 3306:· Run `REMOVE LEAF 10.0.1.234:3306`Would you like to continue? [y/N]: y✓ Removed leaf 10.0.1.234:3306 from cluster
sdb-admin add-leaf --host 10.0.1.234 --port 3306 --password iamazon --availability-group 1
Toolbox will perform the following actions on host 10.0.3.34:
· Run 'sdb-admin add-leaf --host 10.0.1.234 --port 3306 --user root --password ●●●●●● --availability-group 1'
Would you like to continue? [y/N]: y
✓ Successfully ran 'sdb-admin add-leaf'
Operation completed successfully
sdb-admin add-leaf --host 10.0.1.67 --port 3306 --password ●●●●●● --availability-group 2
Toolbox will perform the following actions on host 10.0.3.34:
· Run 'sdb-admin add-leaf --host 10.0.1.67 --port 3306 --user root --password ●●●●●● --availability-group 2'
Would you like to continue? [y/N]: y
✓ Successfully ran 'sdb-admin add-leaf'
Operation completed successfully
sdb-admin show-leaves
✓ Successfully ran 'sdb-admin show-leaves'
+------------+------+--------------------+------------+-----------+--------+--------------------+--------------------------------+
| Host | Port | Availability Group | Pair Host | Pair Port | State | Opened Connections | Average Roundtrip Latency (ms) |
+------------+------+--------------------+------------+-----------+--------+--------------------+--------------------------------+
| 10.0.1.234 | 3306 | 1 | 10.0.1.67 | 3306 | online | 1 | 0.251 |
| 10.0.1.67 | 3306 | 2 | 10.0.1.234 | 3306 | online | 2 | 0.249 |
+------------+------+--------------------+------------+-----------+--------+--------------------+--------------------------------+
This configuration will now allow a database to be created.
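As a quick confirmation, the statement that previously failed with ERROR 1782 should now succeed. Reusing the example database name from above:
-- This failed earlier because the availability groups were unbalanced; it should now succeed.
CREATE DATABASE music;

-- The new database should appear and report itself as online.
SHOW DATABASES EXTENDED;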
Unregister Hosts
- Unregister the first host.
sdb-toolbox-config unregister-host --host "10.0.0.219"
- Unregister the second host.
sdb-toolbox-config unregister-host --host "10.0.2.35"
Rebalance Partitions
Once all of the designated nodes have been removed, run REBALANCE ALL DATABASES.
On the Master Aggregator host:
- Use the SingleStore Client to connect to SingleStore.
- Run the SNAPSHOT DATABASE command to bring the transaction logs up to date. As a general rule, taking a snapshot is helpful before any maintenance or clustering operation where data is being moved or recovered, as it compresses the log files by combining multiple statements into one.
- Run REBALANCE ALL DATABASES (see the sketch after these steps).
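Below is a minimal sketch of the last two steps from the SQL prompt, assuming a single hypothetical user database named example_db; run SNAPSHOT DATABASE once per database in your cluster.
-- Bring the transaction log up to date before data is moved.
SNAPSHOT DATABASE example_db;

-- Redistribute partitions evenly across the remaining leaf nodes.
REBALANCE ALL DATABASES;

-- Optionally review where the partitions ended up.
SHOW PARTITIONS ON example_db;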