Calculating Memory Allocation per Host in a Cluster
In disaster recovery (DR) the secondary cluster should have slightly more memory than the primary. While fewer user queries may be run on the secondary cluster, it will still need to accommodate clustering operations like REBALANCE PARTITIONS
, as well as large deletes. Deleting rows from a table temporarily uses more memory since rows are first marked for deletion, then deleted by the background garbage collector.
Allocate enough memory per leaf and per aggregator in the secondary cluster to store the same amount of data as the primary. Rowstore tables will use more memory, and columnstore tables will use more disk. You should calculate all memory currently used for the nodes in the primary cluster, then divide that by the count of nodes in the secondary cluster.
The steps are as follows:
1. Calculate memory usage per node
avg_aggregator_usage = sum(memory used for aggregators in primary cluster) / count of aggregators in secondary cluster
avg_leaf_usage = sum(memory use for leaves in primary cluster) / count of leaves in secondary cluster
2. Total the memory needs per host
If there is one aggregator and one leaf, then allocate avg_aggregator_usage + avg_leaf_usage.
If there are two leaves then allocate 2 * avg_leaf_usage.
3. Calculate memory usage in the primary cluster
You can find the total memory usage from the SHOW STATUS EXTENDED
command in the client, and grep
for Total_server_memory
which indicates the usage.
Run this command to see the current total memory usage for all nodes:
SELECT * FROM INFORMATION_SCHEMA.MV_GLOBAL_STATUS WHERE VARIABLE_NAME = 'Total_server_memory';
Assume it returns the result as:
Memory use of nodes: 10.0.3.202:3306 Total_server_memory 464.0 MB Memory use of nodes: 10.0.3.101:3306 Total_server_memory 460.0 MB Memory use of nodes: 10.0.3.303:3306 Total_server_memory 462.5 MB Memory use of nodes: 10.0.3.202:3307 Total_server_memory 1459.0 MB Memory use of nodes: 10.0.3.101:3307 Total_server_memory 1458.9 MB Memory use of nodes: 10.0.3.303:3307 Total_server_memory 1475.0 MB Memory use of nodes: 10.0.3.404:3306 Total_server_memory 1435.0 MB Memory use of nodes: 10.0.3.404:3307 Total_server_memory 1448.9 MB
4. Sum by role
Next total the memory consumption by role. In the example above, the first three lines are for the master aggregator and two child aggregators. The rest are leaves.
Aggregator memory consumption: 464 + 460 + 463 = 1387
Leaf memory consumption: 1459 + 1459 + 1475 + 1435 + 1448 = 7277
5. Divide by new role count
Finally, divide this number by the count of each role in the secondary cluster.
If you were to reduce to only one master aggregator (MA), one child aggregator (CA), and four leaves then:
Aggregators: 1387 / 2 = 694
Leaf nodes: 7277 /4 = 1820
6. Calculate node usage per host
If each host is going to have two nodes then the minimum allocations to support current usage would be:
MA + leaf: 694 + 1820 = 2514
CA + leaf: 694 + 1820 = 2514
2 * leaf: 1820 + 1820 = 3640
In this example, these are the minimum memory allocations per leaf in the secondary cluster to support current usage. You are likely to need more room for data set growth, as well as to accommodate query overhead and unexpected spikes in usage.