Calculating Memory Allocation per Host in a Cluster

In disaster recovery (DR) the secondary cluster should have slightly more memory than the primary. While fewer user queries may be run on the secondary cluster, it will still need to accommodate clustering operations like  REBALANCE PARTITIONS , as well as large deletes. Deleting rows from a table temporarily uses more memory since rows are first marked for deletion, then deleted by the background garbage collector.

Allocate enough memory per leaf and per aggregator in the secondary cluster to store the same amount of data as the primary. Rowstore tables will use more memory, and columnstore tables will use more disk. You should calculate all memory currently used for the nodes in the primary cluster, then divide that by the count of nodes in the secondary cluster.

The steps are as follows:

1. Calculate memory usage per node

avg_aggregator_usage = sum(memory used for aggregators in primary cluster) / count of aggregators in secondary cluster

avg_leaf_usage = sum(memory use for leaves in primary cluster) / count of leaves in secondary cluster

2. Total the memory needs per host

If there is one aggregator and one leaf, then allocate avg_aggregator_usage + avg_leaf_usage.

If there are two leaves then allocate 2 * avg_leaf_usage.

3. Calculate memory usage in the primary cluster

You can find the total memory usage from the SHOW STATUS EXTENDED command in the client, and grep for Total_server_memory which indicates the usage.

Run this command to see the current total memory usage for all nodes:

SELECT * FROM INFORMATION_SCHEMA.MV_GLOBAL_STATUS WHERE VARIABLE_NAME = 'Total_server_memory';

Assume it returns the result as:

Memory use of nodes: 10.0.3.202:3306
Total_server_memory 464.0 MB
Memory use of nodes: 10.0.3.101:3306
Total_server_memory 460.0 MB
Memory use of nodes: 10.0.3.303:3306
Total_server_memory 462.5 MB
Memory use of nodes: 10.0.3.202:3307
Total_server_memory 1459.0 MB
Memory use of nodes: 10.0.3.101:3307
Total_server_memory 1458.9 MB
Memory use of nodes: 10.0.3.303:3307
Total_server_memory 1475.0 MB
Memory use of nodes: 10.0.3.404:3306
Total_server_memory 1435.0 MB
Memory use of nodes: 10.0.3.404:3307
Total_server_memory 1448.9 MB

4. Sum by role

Next total the memory consumption by role. In the example above, the first three lines are for the master aggregator and two child aggregators. The rest are leaves.

Aggregator memory consumption: 464 + 460 + 463 = 1387

Leaf memory consumption: 1459 + 1459 + 1475 + 1435 + 1448 = 7277

5. Divide by new role count

Finally, divide this number by the count of each role in the secondary cluster.

If you were to reduce to only one master aggregator (MA), one child aggregator (CA), and four leaves then:

Aggregators: 1387 / 2 = 694

Leaf nodes: 7277 /4 = 1820

6. Calculate node usage per host

If each host is going to have two nodes then the minimum allocations to support current usage would be:

MA + leaf: 694 + 1820 = 2514

CA + leaf: 694 + 1820 = 2514

2 * leaf: 1820 + 1820 = 3640

In this example, these are the minimum memory allocations per leaf in the secondary cluster to support current usage. You are likely to need more room for data set growth, as well as to accommodate query overhead and unexpected spikes in usage.

Last modified: April 26, 2023

Was this article helpful?