Managing High Availability

High Availability Architecture

SingleStore stores data redundantly in a set of leaves, called availability groups, to ensure high availability. The number of availability groups can be set via the redundancy_level variable. SingleStore exposes two modes of operation: redundancy-1 and redundancy-2 (refer to Administering a Cluster). In redundancy-1, there are no extra online copies of your data (all partitions are masters). In the event of a leaf node failure, your data is offline until you reintroduce the leaf back into the system. If your leaf node is irrecoverably lost, you can use the REBALANCE PARTITIONS … FORCE command to create empty replacement partitions instead of recovering your data. In redundancy-2, SingleStore handles node failures by promoting the appropriate replica partitions into masters so that your databases remain online. Of course, if all of your machines fail, then your data is unavailable until you recover enough machines or recreate the cluster from scratch. Each redundancy mode has an expected, balanced state.

When you recover a leaf in redundancy-1, by default it will be reintroduced to the cluster, with all the data that was on it restored.

Similarly, when you recover a leaf in redundancy-2, by default it will be reintroduced to the cluster. For each partition on that leaf one of the following things will happen:

  • If that is the only instance of that partition (in other words, if the pair of this leaf is also down), the partition will be reintroduced to the cluster.

  • If there is another instance of that partition on the pair of the leaf, the newly recovered instance will become a replica to the existing partition instance. If there is any data divergence between the two instances, the partition instance on the newly recovered leaf will be discarded, and a new replica partition instance will be introduced, with data replicated from the existing copy.

  • If there is another instance of that partition, but it is on a leaf that is not the pair of the recovered leaf, then the recovered partition instance will be marked as an orphan.

You can disable the master aggregator from automatically attaching leaves that become visible by setting a global variable auto_attach to Off. In this case you will need to manually run ATTACH LEAF to move that leaf into the online state once it is available. See Attaching - Examples below to see both automatic and manual approaches in action.

Note

Every high availability command in SingleStore is online. As the command runs, you can continue to read and write to your data.

High availability commands can only be run on the Master Aggregator.

In this section

Last modified: November 8, 2024

Was this article helpful?