Leaf Node Recovery Failed Scenario
Scenario: One of the leaves cannot replicate data, its status is
RECOVERY_, and the replica partitions are in an unrecoverable state.
If the node cannot recover, the most common cause, though not the only one, is memory configuration.
If the node does not have enough memory to replay its data back into memory, increase
maximum_ to allow recovery to complete.
Try to restart the leaf that is still in the
To investigate other possible causes, you can also check the following:
SHOW CLUSTER STATUS
sysctl -a from the host that has the leaf node with the error.
vm. should be 100000000 on all nodes.
Ensure the open files ulimit is set to >= 1000000 on all nodes.
If using NUMA nodes, the total size of the nodes should be less than the physical memory available on the server.
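The open files check in the list above can be scripted. The following is a minimal sketch, not an official tool: limit_ok is a hypothetical helper name, and ulimit -n reports the limit of the shell that runs it, which may differ from the limit applied to an already running node process. Run it as the user that starts the node, on each host.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any toolbox): succeeds when the
# given limit satisfies the required minimum.
limit_ok() {
  local limit="$1" minimum="$2"
  # A limit of "unlimited" always satisfies the requirement.
  if [ "$limit" = "unlimited" ]; then
    return 0
  fi
  [ "$limit" -ge "$minimum" ]
}

# Soft open-files limit of the current shell.
current="$(ulimit -n)"
if limit_ok "$current" 1000000; then
  echo "open files limit OK: $current"
else
  echo "open files limit too low: $current (expected >= 1000000)"
fi
```

If the check reports a value that is too low, raise the limit through the host's usual mechanism before restarting the leaf.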
After making any adjustments to the memory settings or the variables, try to restart the leaf that is still in the
Run sdb-admin restart-node and then select the appropriate leaf.
Last modified: July 20, 2022