Enabling High Availability
The leaf_failover_fanout global variable defines the placement of replica partitions among leaves in a cluster. It accepts two values: paired and load_balanced. By default, leaf_failover_fanout is set to paired mode.
In paired mode, the auto-rebalance operation only affects the newly attached leaves and their pairs.
In load_balanced mode, the auto-rebalance operation runs a full rebalance of all the databases on all the leaves in a cluster.
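You can check the current fanout mode and change it on the Master Aggregator. The following is a minimal sketch; the SET statement is the same one used later on this page, and changing the mode for existing databases requires a rebalance, as described below.
-- Check the current fanout mode
SELECT @@GLOBAL.leaf_failover_fanout;
-- Switch modes; rebalance existing databases afterwards
SET GLOBAL leaf_failover_fanout = 'load_balanced';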
For small clusters, such as a cluster with four leaf nodes, the paired mode is preferable over the load_balanced mode.
In load_balanced mode on a very small cluster, losing two nodes in different availability groups (AGs) will always take the cluster offline.
Consider the following example:
AG1 has nodes A and B.
Node A has primary partitions 1,2,3,4.
Node B has primary partitions 5,6,7,8.
AG2 has nodes C and D.
Node C has primary partitions 9,10,11,12.
Node D has primary partitions 13,14,15,16.
In load_balanced mode, the secondary/backup partitions are balanced as follows:
Node A has backups of 9,10 (from Node C) and 13,14 (from Node D).
Node B has backups of 11,12 (from Node C) and 15,16 (from Node D).
Node C has backups of 1,2 (from Node A) and 5,6 (from Node B).
Node D has backups of 3,4 (from Node A) and 7,8 (from Node B).
If node A goes down, the cluster now requires both nodes C and D to continue serving its partitions, so removal of either of those nodes will take the cluster offline.
If you were in paired mode, you would instead have something like the following:
Node A has backups of all of Node C's partitions.
Node B has backups of all of Node D's partitions.
Node C has backups of all of Node A's partitions.
Node D has backups of all of Node B's partitions.
In this case, if node A goes down, then you can also lose node D (but not Node C) in the other availability group without causing an outage. In paired mode, you should also be able to lose just nodes A and B.
This is less of a concern on larger clusters, in which the backup partitions are still load balanced across a different AG but do not require every single node in it, unlike small clusters of just four leaf nodes.
This is the tradeoff between the load_balanced and paired modes: the former balances the backup partition load more evenly, but also makes the cluster more fragile in the event of an outage.
The following sections discuss how to enable high availability for each of these modes.
Enabling High Availability in paired Mode
In paired mode, a leaf's replica partitions are placed on its pair leaf in the other availability group. When a leaf fails, the replica partitions on its pair are promoted to master partitions.
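To see how leaves are grouped and paired, you can run SHOW LEAVES on the Master Aggregator; its output includes each leaf's availability group and, once high availability is enabled, its pair host and port. A minimal check:
SHOW LEAVES;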
Enabling HA on existing leaf nodes
The SingleStoreDB management tools support enabling high availability through the sdb-admin enable-high-availability
command.
sdb-admin enable-high-availability [--password <secure-password>]
During this process, half of the leaves in the cluster will move from availability group 1 into availability group 2.
If you need to add more nodes to the cluster, we recommend doing it before enabling High Availability.
You can verify that high availability has been enabled by viewing the details of your leaf nodes with sdb-admin list-nodes.
sdb-admin list-nodes
+------------+------------+-------+------+---------------+--------------+---------+----------------+--------------------+
| MemSQL ID | Role | Host | Port | Process State | Connectable? | Version | Recovery State | Availability Group |
+------------+------------+-------+------+---------------+--------------+---------+----------------+--------------------+
| D24FA96083 | Master | node1 | 3306 | Running | True | 6.7.7 | Online | |
| F3E5FDB741 | Aggregator | node2 | 3306 | Running | True | 6.7.7 | Online | |
| 5B536C630D | Leaf | node3 | 3306 | Running | True | 6.7.7 | Online | 1 |
| 9156C4C91F | Leaf | node4 | 3306 | Running | True | 6.7.7 | Online | 2 |
| 92CC911954 | Leaf | node4 | 3307 | Running | True | 6.7.7 | Online | 2 |
| DD82C2B35E | Leaf | node3 | 3307 | Running | True | 6.7.7 | Online | 1 |
+------------+------------+-------+------+---------------+--------------+---------+----------------+--------------------+
Enabling HA for a cluster with new leaf nodes
When you add new nodes to a cluster that does not have high availability already enabled, you must perform the following steps to change your cluster configuration and also configure the new leaf nodes appropriately.
Step 1. Update the redundancy_level value
On the Master Aggregator run:
SET @@GLOBAL.redundancy_level = 2;
This updates the current configuration and sets the cluster to run in redundancy-2 operation mode.
In addition, update the SingleStoreDB configuration file on the Master Aggregator to make sure the change is not lost whenever it is restarted.
Unlike in previous versions of SingleStoreDB, redundancy_level can only be set on the Master Aggregator.
sdb-admin list-nodes --role master -q | xargs -I % sdb-admin update-config --memsql-id % --key redundancy_level --value 2 --set-global -y
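To confirm that the setting took effect, you can query the variable directly on the Master Aggregator (a minimal check):
SELECT @@GLOBAL.redundancy_level;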
Step 2. Deploy new nodes
The cluster is now running in redundancy-2 operation mode; however, all existing leaves belong to availability group 1.
At this point, add an equal number of new leaf nodes to match the current number of leaf nodes.
The following shows how to add two leaf nodes using sdb-admin.
sdb-admin list-nodes
+------------+------------+-------+------+---------------+--------------+---------+----------------+--------------------+
| MemSQL ID | Role | Host | Port | Process State | Connectable? | Version | Recovery State | Availability Group |
+------------+------------+-------+------+---------------+--------------+---------+----------------+--------------------+
| 43F1B836D3 | Master | node1 | 3306 | Running | True | 6.7.7 | Online | |
| E4921A995C | Aggregator | node1 | 3307 | Running | True | 6.7.7 | Online | |
| 74BBE83C45 | Leaf | node1 | 3308 | Running | True | 6.7.7 | Online | 1 |
| A6D82670D8 | Leaf | node1 | 3309 | Running | True | 6.7.7 | Online | 1 |
+------------+------------+-------+------+---------------+--------------+---------+----------------+--------------------+
For demonstration purposes, the following shows how to add two new local leaf nodes.
sdb-admin create-node --port 3310 -y
sdb-admin create-node --port 3311 -y
Because new nodes default to the role of unknown, you can assign them as leaf nodes using bash piping.
sdb-admin list-nodes -q -r unknown | xargs -I % sdb-admin add-leaf --memsql-id % -y
Now verify that the new nodes are added, but added in availability group 2.
sdb-admin list-nodes
+------------+------------+-------+------+---------------+--------------+---------+----------------+--------------------+
| MemSQL ID | Role | Host | Port | Process State | Connectable? | Version | Recovery State | Availability Group |
+------------+------------+-------+------+---------------+--------------+---------+----------------+--------------------+
| 43F1B836D3 | Master | node1 | 3306 | Running | True | 6.7.7 | Online | |
| E4921A995C | Aggregator | node1 | 3307 | Running | True | 6.7.7 | Online | |
| 4DAE7D1F54 | Leaf | node1 | 3310 | Running | True | 6.7.7 | Online | 2 |
| 52CA34CD0C | Leaf | node1 | 3311 | Running | True | 6.7.7 | Online | 2 |
| 74BBE83C45 | Leaf | node1 | 3308 | Running | True | 6.7.7 | Online | 1 |
| A6D82670D8 | Leaf | node1 | 3309 | Running | True | 6.7.7 | Online | 1 |
+------------+------------+-------+------+---------------+--------------+---------+----------------+--------------------+
Step 3. Rebalance the cluster
Finally, rebalance the cluster by running REBALANCE ALL DATABASES.
Note that before running REBALANCE ALL DATABASES, each database partition has only its master copy and no replica, which means high availability is not yet provided.
SHOW PARTITIONS ON memsql_demo;
+---------+-------+------+--------+--------+
| Ordinal | Host | Port | Role | Locked |
+---------+-------+------+--------+--------+
| 0 | node1 | 3308 | Master | 0 |
| 1 | node1 | 3309 | Master | 0 |
| 2 | node1 | 3308 | Master | 0 |
| 3 | node1 | 3309 | Master | 0 |
| 0 | node1 | 3310 | Master | 0 |
| 1 | node1 | 3311 | Master | 0 |
| 2 | node1 | 3310 | Master | 0 |
| 3 | node1 | 3311 | Master | 0 |
+---------+-------+------+--------+--------+
8 rows in set (0.01 sec)
REBALANCE ALL DATABASES;
After running REBALANCE ALL DATABASES
, notice that both Master and replica partitions exist within the database cluster.
SHOW PARTITIONS ON memsql_demo;
+---------+-------+------+---------+--------+
| Ordinal | Host | Port | Role | Locked |
+---------+-------+------+---------+--------+
| 0 | node1 | 3308 | Replica | 0 |
| 1 | node1 | 3309 | Replica | 0 |
| 2 | node1 | 3308 | Master | 0 |
| 3 | node1 | 3309 | Master | 0 |
| 0 | node1 | 3310 | Master | 0 |
| 1 | node1 | 3311 | Master | 0 |
| 2 | node1 | 3310 | Replica | 0 |
| 3 | node1 | 3311 | Replica | 0 |
+---------+-------+------+---------+--------+
8 rows in set (0.01 sec)
Enabling High Availability in load_balanced Mode
load_balanced mode places a leaf’s replica partitions evenly across the healthy leaves in the cluster.
If a leaf L fails, L’s replica partitions on the healthy leaves in the opposite availability group are promoted to master partitions.
Note
If the leaf_failover_fanout variable is set to load_balanced mode, your cluster is not restricted to an even number of leaves.
Enabling HA on existing leaf nodes
You can enable high availability in a SingleStoreDB cluster using the sdb-admin enable-high-availability
command.
sdb-admin enable-high-availability [--password <secure-password>]
If you need to add more nodes to the cluster, we recommend doing it before enabling High Availability.
Creating a new database
After setting up the cluster, change the leaf_failover_fanout mode to load_balanced.
SET GLOBAL leaf_failover_fanout = 'load_balanced';
CREATE DATABASE db;
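As a quick check, you can list the partitions of the new database (db is the example database created above) and confirm that its replica partitions are spread across multiple leaves rather than concentrated on a single pair; the exact output depends on your cluster layout.
SHOW PARTITIONS ON db;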
Using an existing database in load_balanced mode
To use existing databases in load_balanced mode, you’ll need to rebalance the database after changing the leaf_failover_fanout mode.
SET GLOBAL leaf_failover_fanout = 'load_balanced';
REBALANCE PARTITIONS ON db;
You need to rebalance all the databases in the cluster.
For large databases, the rebalance operation can be time consuming as the partitions are moved between different hosts.
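Alternatively, as shown in the paired mode section above, you can rebalance every database in the cluster in a single step instead of running REBALANCE PARTITIONS per database:
REBALANCE ALL DATABASES;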
Enabling HA for a cluster with new leaf nodes
When you add new nodes to a cluster that does not have high availability already enabled, you must perform the same steps as listed in the paired mode section to change your cluster configuration and configure the new leaf nodes appropriately.