Smart Disaster Recovery (DR): SmartDR
Note
This is a preview feature supported on AWS, GCP and Azure.
Smart DR creates and manages continuous asynchronous replication of data between a primary region and a geographically separate secondary region.
You can initiate Smart DR via the SingleStore Portal or API.
Smart DR replicates the exact topology from the primary to the secondary region and maintains all the users, permissions, and workspace configurations across the regions.
Benefits
The principal benefits of Smart DR are minimal ongoing costs and a low Recovery Point Objective (RPO) of up to 10 minutes.
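As a rough illustration of what an RPO of up to 10 minutes means, the sketch below computes how much recent data would be missing in the secondary region if an outage occurred at a given moment. The function names and timestamps are hypothetical and chosen only for this example; they are not part of any SingleStore API.

```python
from datetime import datetime, timedelta

# Upper bound on the RPO stated for Smart DR.
RPO_LIMIT = timedelta(minutes=10)


def data_loss_window(last_replicated_at: datetime, outage_at: datetime) -> timedelta:
    """Return the span of recent writes that had not yet reached the secondary region."""
    return max(outage_at - last_replicated_at, timedelta(0))


def within_rpo(last_replicated_at: datetime, outage_at: datetime,
               limit: timedelta = RPO_LIMIT) -> bool:
    """Return True if the potential data loss stays within the RPO limit."""
    return data_loss_window(last_replicated_at, outage_at) <= limit


# Hypothetical timestamps: last successful replication 7 minutes before the outage.
outage = datetime(2024, 1, 1, 12, 0, 0)
last_sync = datetime(2024, 1, 1, 11, 53, 0)
print(data_loss_window(last_sync, outage))  # 0:07:00
print(within_rpo(last_sync, outage))        # True
```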
Use Case
A primary use case for Smart DR is to guarantee business continuity in the face of a region outage.
Depending on your business requirements, it may be essential to have both Multi-AZ High Availability (HA) and Smart DR.
Used together with Smart DR, Point-in-Time Recovery (PITR) gives you the ability to go back in time and recover data in both the primary and secondary regions.
Setting up Smart DR
Configuring Replication
-
Go to Workspaces from the left nav and select the workspace.
-
Click the three vertical dots next to the selected workspace, and from the drop-down list select Configure SmartDR.
-
Click +Configure Replication, choose the relevant database options and submit.
The Configure Smart DR screen displays three top-level menu options:
-
Workspaces displays each workspace's name, region, and Failover Role.
-
Databases displays each replicated database's name and status.
Click the Manage button to view the available databases and select them for replication.
-
Settings displays the Primary Region, which is the region where your database(s) currently reside, and the Secondary Region, to which you want to replicate the database(s).
It also displays the Replication Type, which is Storage only by default. This means the data is copied asynchronously between clusters, and the secondary site does not require active compute nodes.
Auto-replication is disabled by default. If enabled, new databases are replicated automatically.
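The defaults described under Settings can be summarized in a small model. This is a minimal sketch for illustration only: the class name, field names, and region identifiers are assumptions and do not reflect the Portal's or the API's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class SmartDRSettings:
    """Hypothetical model of the Settings tab; names are illustrative only."""
    primary_region: str
    secondary_region: str
    replication_type: str = "Storage only"   # default described above
    auto_replication: bool = False           # disabled by default
    selected_databases: set = field(default_factory=set)

    def should_replicate(self, database: str, newly_created: bool) -> bool:
        """Explicitly selected databases replicate; new ones only if auto-replication is on."""
        if database in self.selected_databases:
            return True
        return self.auto_replication and newly_created


# Example region names are placeholders.
settings = SmartDRSettings("us-east-1", "us-west-2", selected_databases={"orders"})
print(settings.should_replicate("orders", newly_created=False))  # True
print(settings.should_replicate("events", newly_created=True))   # False
```

With `auto_replication` set to `True`, the second call would return `True`, matching the behavior described for newly created databases.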
Pre-provisioning
Pre-provisioning can be used to configure compute resources in a secondary region in advance.
This enables you to:
-
Configure the private endpoints in the secondary region, before failover is initiated.
-
Test DR by failing over to the secondary region without disrupting your production environment.
-
Fail over faster, because compute is already running.
To pre-provision:
-
In the Portal, navigate to the Replication tab.
-
Click Enable Pre-Provisioning.
This starts the background process to configure the secondary region with the same topology of workspaces as your primary region.
To validate your failover capabilities and ensure business continuity, you can switch to the secondary region, attach your database to a workspace, and start querying the data from the secondary region.
You can attach your application in the secondary region and verify that it can insert or update your database as expected, without impacting your production environment.
This seamless testing is possible because, when you attach your database to a workspace in the secondary region, SingleStore Helios automatically creates a branch of your database.
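The isolation that a branch provides can be pictured with a toy model. The sketch below is purely conceptual: the `Database` class and its `branch` method are hypothetical stand-ins and say nothing about how SingleStore Helios implements branching internally.

```python
import copy


class Database:
    """Toy in-memory database used only to illustrate branch isolation."""

    def __init__(self, name, rows=None):
        self.name = name
        self.rows = list(rows or [])

    def branch(self, branch_name):
        """Return an independent copy; later writes to it do not affect this database."""
        return Database(branch_name, copy.deepcopy(self.rows))


# The replicated database in the secondary region (hypothetical data).
replicated = Database("orders", [{"id": 1, "status": "paid"}])

# A DR test works against a branch, so inserts and updates stay local to it.
test_branch = replicated.branch("orders_dr_test")
test_branch.rows.append({"id": 2, "status": "test"})
test_branch.rows[0]["status"] = "refunded"

print(len(replicated.rows), replicated.rows[0]["status"])    # 1 paid
print(len(test_branch.rows), test_branch.rows[0]["status"])  # 2 refunded
```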
Failover
Once database replication is set up and in sync, you can fail over your application to the secondary region whenever there is a regional failure.
To start the process:
-
Click the three vertical dots next to the selected workspace, and from the drop-down list select Configure SmartDR.
-
Click the Failover button on the upper-left side of the Configure Smart DR screen.
Check the I confirm… checkbox and click Confirm in the popup window.
During the failover deployment, the system automatically performs the following tasks in the background:
-
Provisions the environment in the secondary region, maintaining the primary region's topology.
-
Provisions and configures all your workspaces.
-
Attaches the databases to the workspace and provides a connection string.
-
Preserves user permissions, pipelines, firewall settings and other metadata from your primary region.
As part of the failover task, the primary region workspace is automatically suspended, and it cannot be resumed or terminated.
System-Managed Databases
After initiating the failover process, you may notice, either through the UI or by running the SQL command SHOW DATABASES, that there are two databases: one attached to the workspace, and another with the same name plus a timestamp, in a detached state.
The detached database, referred to as a system-managed database, is a continuation of your primary region's database.
During failover, SingleStore attaches a branch of this system-managed database to your workspace.
You can access the data in the system-managed database at any time by attaching it as a branch and recovering the missing rows.
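To make the naming relationship concrete, the sketch below pairs each attached database with its timestamped counterpart, given a list of names such as those returned by SHOW DATABASES. The exact format of the timestamped name is not specified here, so the pattern `<name>_<digits>` is an assumption made only for this example, as are the function name and sample names.

```python
import re

# ASSUMPTION: the system-managed name looks like "<name>_<digits>".
# The real naming format may differ; adjust the pattern accordingly.
TIMESTAMP_SUFFIX = re.compile(r"^(?P<base>.+)_(?P<ts>\d{8,})$")


def find_system_managed(database_names):
    """Map each attached database name to its timestamped counterpart, if present."""
    names = set(database_names)
    pairs = {}
    for name in names:
        match = TIMESTAMP_SUFFIX.match(name)
        # Only treat it as system-managed if the un-suffixed name also exists.
        if match and match.group("base") in names:
            pairs[match.group("base")] = name
    return pairs


# Hypothetical database names.
print(find_system_managed(["orders", "orders_1718900000", "events"]))
# {'orders': 'orders_1718900000'}
```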
Failback
To initiate failback from the secondary region to the primary region:
-
Click the three vertical dots next to the selected workspace, and from the drop-down list select Configure SmartDR.
-
Click the Failback button on the upper-left side of the Configure Smart DR screen.
Check the I confirm… checkbox and click Confirm in the popup window.
The system automatically performs the following tasks during failback:
-
Configures the primary region environment.
-
Attaches replicated databases to the workspace and provides the connection string.
-
Updates user permissions and other metadata with changes from the secondary region.
Upon successful completion, the primary region becomes active, and the secondary region is no longer accessible.