Changing Leaf Node Pairing

The following example illustrates how to change the pairing of leaf nodes in a high availability (HA) environment. It assumes you have a running cluster with HA enabled and several leaf nodes.

Step 1 - Plan which leaf node pairings you want to change

Execute the following query:

SELECT * FROM leaves ORDER BY AVAILABILITY_GROUP;
+------------+------+--------------------+------------+-----------+--------+--------------------+---------------------------+---------+
| HOST       | PORT | AVAILABILITY_GROUP | PAIR_HOST  | PAIR_PORT | STATE  | OPENED_CONNECTIONS | AVERAGE_ROUNDTRIP_LATENCY | NODE_ID |
+------------+------+--------------------+------------+-----------+--------+--------------------+---------------------------+---------+
| 10.0.0.86  | 3307 |                  1 | 10.0.0.12  |      3307 | online |                 20 |                     0.306 |       8 |
| 10.0.1.209 | 3307 |                  1 | 10.0.0.177 |      3307 | online |                  1 |                     0.308 |       9 |
| 10.0.0.12  | 3307 |                  2 | 10.0.0.86  |      3307 | online |                 19 |                     0.245 |      10 |
| 10.0.0.177 | 3307 |                  2 | 10.0.1.209 |      3307 | online |                  2 |                     0.246 |      11 |
+------------+------+--------------------+------------+-----------+--------+--------------------+---------------------------+---------+

You decide to change the pairing of the leaf nodes from:

  • 10.0.0.12 (availability group 2) paired with 10.0.0.86 (availability group 1)
  • 10.0.0.177 (availability group 2) paired with 10.0.1.209 (availability group 1)

to:

  • 10.0.0.12 (availability group 2) paired with 10.0.0.177 (availability group 1)
  • 10.0.0.86 (availability group 2) paired with 10.0.1.209 (availability group 1)

Step 2 - Remove and Re-add the Leaf Nodes

Caution

Changing the pairing of nodes temporarily doubles the data footprint, because orphan databases are left behind on the previously paired nodes. This data is not removed until the orphan databases are cleared, as explained in the final step.

Enable manual cluster control by setting the auto_attach global variable to OFF.
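
A likely form of this command, assuming auto_attach is set like other global variables via SET GLOBAL on the master aggregator, is:

SET GLOBAL auto_attach = OFF;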

Next, remove the leaf node on 10.0.0.86. This forces a failover to its paired host, 10.0.0.12.

REMOVE LEAF '10.0.0.86':3307 FORCE; 

Now remove the leaf on 10.0.0.177. This will force a failover to its paired host on 10.0.1.209.

REMOVE LEAF '10.0.0.177':3307 FORCE; 

Add the leaf on 10.0.0.177 into availability group 1. It will pair with an available node from availability group 2; the only available node is the one on host 10.0.0.12, so the pairing happens with that node, as required.

ADD LEAF root:'1xGQHGC6iJnzZvHS'@'10.0.0.177':3307 INTO GROUP 1;

The string after root: is the node's password (if one is set).

Add the final leaf, on 10.0.0.86, into availability group 2. It will pair with an available host in availability group 1, in this case 10.0.1.209, as shown in the output above.

ADD LEAF root:'1xGQHGC6iJnzZvHS'@'10.0.0.86':3307 INTO GROUP 2;

Step 3 - Verify the Changes

Lastly, verify that all the changes are correct using SHOW LEAVES.

SHOW LEAVES;
+------------+------+--------------------+------------+-----------+--------+--------------------+------------------------------+--------+
| Host       | Port | Availability_Group | Pair_Host  | Pair_Port | State  | Opened_Connections | Average_Roundtrip_Latency_ms | NodeId |
+------------+------+--------------------+------------+-----------+--------+--------------------+------------------------------+--------+
| 10.0.1.209 | 3307 |                  1 | 10.0.0.86  |      3307 | online |                  2 |                        0.307 |      9 |
| 10.0.0.12  | 3307 |                  2 | 10.0.0.177 |      3307 | online |                 17 |                        0.276 |     10 |
| 10.0.0.177 | 3307 |                  1 | 10.0.0.12  |      3307 | online |                  2 |                        0.227 |     12 |
| 10.0.0.86  | 3307 |                  2 | 10.0.1.209 |      3307 | online |                  2 |                        0.324 |     13 |
+------------+------+--------------------+------------+-----------+--------+--------------------+------------------------------+--------+

If everything is correct, set the auto_attach global variable back to ON to re-enable automatic cluster control.
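
As with disabling it earlier, a likely form of this command, assuming auto_attach is set via SET GLOBAL on the master aggregator, is:

SET GLOBAL auto_attach = ON;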

Step 4 - Clear Orphan Databases

Check whether there are any orphan databases (after the re-pairing, there should be). These are database partitions that the cluster does not recognize as either a master or a replica partition.

EXPLAIN CLEAR ORPHAN DATABASES;
+--------+----------------+-------------+-------------+
| Action | Database       | Target_Host | Target_Port |
+--------+----------------+-------------+-------------+
| DROP   | memsql_demo_0  | 10.0.0.86   |        3307 |
| DROP   | memsql_demo_1  | 10.0.0.86   |        3307 |
| DROP   | memsql_demo_10 | 10.0.0.86   |        3307 |
| DROP   | memsql_demo_11 | 10.0.0.86   |        3307 |
| DROP   | memsql_demo_12 | 10.0.0.86   |        3307 |
| DROP   | memsql_demo_13 | 10.0.0.86   |        3307 |
| DROP   | memsql_demo_14 | 10.0.0.86   |        3307 |
| DROP   | memsql_demo_15 | 10.0.0.86   |        3307 |
| DROP   | memsql_demo_2  | 10.0.0.86   |        3307 |
| DROP   | memsql_demo_3  | 10.0.0.86   |        3307 |
| DROP   | memsql_demo_4  | 10.0.0.86   |        3307 |
| DROP   | memsql_demo_5  | 10.0.0.86   |        3307 |
| DROP   | memsql_demo_6  | 10.0.0.86   |        3307 |
| DROP   | memsql_demo_7  | 10.0.0.86   |        3307 |
| DROP   | memsql_demo_8  | 10.0.0.86   |        3307 |
| DROP   | memsql_demo_9  | 10.0.0.86   |        3307 |
+--------+----------------+-------------+-------------+

Verify that each database partition listed has a master partition by viewing the cluster status, or by grepping for the specific partition IDs.

SHOW CLUSTER STATUS;

or

$ memsql -h <host> -P <port> -e "show cluster status" | grep <partition id>

Once you have verified all of the database partitions, run:

CLEAR ORPHAN DATABASES;

Step 5 - Rebalance Partitions

Finally, run the following command on the master aggregator to see which partition moves are needed to evenly distribute data and redundancy across all nodes:

EXPLAIN REBALANCE ALL DATABASES;

If the output shows pending rebalance operations, follow up by rebalancing the partitions with:

REBALANCE ALL DATABASES;