Replication and Durability Concepts

Replication ensures redundancy in a cluster. There are two types of replication:

High Availability - replicating partitions between leaf nodes
Cluster Replication - replicating partitions between clusters

High availability pairs leaves and replicates every partition between the two leaves in a pair. Half of the partitions on each leaf are primary partitions, which ingest data and respond to queries. The other half are replica partitions, which replicate the primary partitions of the paired leaf. When the Master Aggregator detects that a leaf has failed, it promotes the replica partitions on the paired leaf, so that all partitions on the surviving leaf are in the primary state, ingesting data and responding to queries. Detecting the failure and promoting the partitions can take from a few seconds to a few minutes, but this is still much faster than detecting and fixing a problem with the node yourself. As a result, the cluster remains online even if a few leaves fail. However, if both leaves in a pair fail, there will be downtime. Plan the memory and disk allocation for each leaf carefully, because its partition count and data size are doubled when high availability is enabled.
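The pairing and promotion behavior described above can be sketched as a small simulation. This is a toy model only, not the product's implementation or API: the `Leaf` class and the `make_pair`, `stored_partitions`, and `fail_over` helpers are hypothetical names introduced for illustration, and partitions are represented as plain integers.

```python
from dataclasses import dataclass, field


@dataclass
class Leaf:
    """Hypothetical model of one leaf in a high availability pair."""
    name: str
    primaries: set = field(default_factory=set)  # partitions this leaf serves
    replicas: set = field(default_factory=set)   # copies of the paired leaf's primaries
    online: bool = True


def make_pair(name_a, name_b, partitions):
    """Split partitions across two leaves; each leaf replicates the other's primaries."""
    ordered = sorted(partitions)
    half = len(ordered) // 2
    leaf_a = Leaf(name_a, primaries=set(ordered[:half]), replicas=set(ordered[half:]))
    leaf_b = Leaf(name_b, primaries=set(ordered[half:]), replicas=set(ordered[:half]))
    return leaf_a, leaf_b


def stored_partitions(leaf):
    """Number of partitions the leaf must hold in memory and on disk."""
    return len(leaf.primaries) + len(leaf.replicas)


def fail_over(failed, survivor):
    """Mark one leaf as failed and promote the replicas on its paired leaf."""
    failed.online = False
    if not survivor.online:
        raise RuntimeError("both leaves in the pair are offline: downtime")
    survivor.primaries |= survivor.replicas
    survivor.replicas = set()


leaf1, leaf2 = make_pair("leaf1", "leaf2", range(8))

# Each leaf serves 4 primary partitions but stores 8, which is the doubling
# to account for when sizing memory and disk.
print(len(leaf1.primaries), stored_partitions(leaf1))

# After leaf1 fails, leaf2 serves every partition as primary.
fail_over(failed=leaf1, survivor=leaf2)
print(sorted(leaf2.primaries))
```

In this sketch, calling `fail_over` when the survivor is already offline raises an error, which corresponds to the downtime case where both leaves in a pair fail.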

Cluster replication replicates a database from one cluster to another. The primary database behaves normally. The secondary database replicates data from the primary database and is read-only, which makes it useful for running expensive analytical queries without affecting the workload on the primary database. If there is an issue with the primary cluster (for example, an AWS region fails), you can stop replication on the secondary cluster and direct your application to it instead. Stopping replication promotes the secondary database to primary, so you can read from and write to it like a regular database. After stopping replication, you cannot start it again without first dropping the database that was originally replicated. However, you could start replication in the opposite direction at that point. The clusters do not need to have the same number of nodes, but both store the same amount of data, so ensure that each cluster has sufficient host resources.
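The lifecycle of a secondary database can be sketched as a small state model. This is an illustration of the rules above under simplifying assumptions, not the product's API: `SecondaryDatabase`, `start_replication`, and the cluster and database names are all hypothetical, and a cluster is modeled as a plain dictionary of databases.

```python
class SecondaryDatabase:
    """Hypothetical model of a replicated database on the secondary cluster."""

    def __init__(self, name, source):
        self.name = name
        self.source = source      # cluster this database replicates from
        self.replicating = True   # the database is read-only while this is True
        self.rows = []

    def write(self, row):
        if self.replicating:
            raise PermissionError(f"{self.name} is read-only while replicating")
        self.rows.append(row)

    def stop_replicating(self):
        """Promote to a regular read/write database. This is one-way."""
        self.replicating = False
        self.source = None


def start_replication(cluster, name, source):
    """Start replicating `name` into `cluster`; an existing copy must be dropped first."""
    if name in cluster:
        raise RuntimeError(f"drop {name} before replicating it again")
    cluster[name] = SecondaryDatabase(name, source)
    return cluster[name]


secondary_cluster = {}
db = start_replication(secondary_cluster, "db", source="primary_cluster")

# The primary cluster fails: stop replication, then writes are accepted.
db.stop_replicating()
db.write({"id": 1})

# Replication cannot simply be restarted over the promoted database.
try:
    start_replication(secondary_cluster, "db", source="primary_cluster")
except RuntimeError as err:
    print(err)

# Dropping the database first makes replication possible again.
del secondary_cluster["db"]
start_replication(secondary_cluster, "db", source="primary_cluster")
```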

An example of Cluster Replication usage is to pair clusters the way leaves are paired in High Availability, as described below. This ensures the workload is split and you only have to stop replication on one database if one cluster fails.

Cluster A has database db_a
Cluster A replicates database db_b from Cluster B
Cluster B has database db_b
Cluster B replicates database db_a from Cluster A
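The paired layout above can be written down as data to show why only one database needs attention when a cluster fails. This is a minimal sketch: the `layout` dictionary and the `databases_to_promote` helper are hypothetical names for illustration, and the sketch only computes where replication would be stopped without performing it.

```python
# Each cluster owns one database and replicates the other cluster's database.
layout = {
    "cluster_a": {"owns": "db_a", "replicates": "db_b"},
    "cluster_b": {"owns": "db_b", "replicates": "db_a"},
}


def databases_to_promote(layout, failed_cluster):
    """Return (surviving cluster, database) pairs where replication must be stopped."""
    failed_db = layout[failed_cluster]["owns"]
    return [
        (cluster, failed_db)
        for cluster, role in layout.items()
        if cluster != failed_cluster and role["replicates"] == failed_db
    ]


# If Cluster A fails, only db_a on Cluster B needs to be promoted.
# db_b is already primary on Cluster B and keeps serving its workload.
print(databases_to_promote(layout, "cluster_a"))
```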
