Database Storage

SingleStore Helios stores data remotely in an object store (unlimited storage database) or locally in a workspace, depending on the cloud provider you have chosen for the workspace deployment. AWS, Azure, and GCP workspaces store data remotely in an object store. When you deploy a workspace using one of these cloud providers and create databases, the databases will automatically use unlimited storage for the workspace.

Unlimited Storage Databases

SingleStore automatically manages data across a three-tiered storage architecture consisting of memory, local cache, and storage. When you deploy SingleStore, all of the storage tiers are configured automatically. This is referred to as an “Unlimited Storage” database.

Unlimited Storage allows data loaded into SingleStore to be seamlessly moved between memory, persistent cache, and storage. The storage tier leverages the native object storage on AWS, Azure, or GCP and is transparent to the user and fully managed by SingleStore. Since unlimited storage databases are stored in object storage, their size is not limited by the size of the persistent cache, but only by available external object storage.

Although the size of a database is effectively unlimited on public cloud object stores, the following factors related to local storage space limit the number of databases and tables you can create.

  • To avoid performance problems, there should be sufficient local storage space to hold the working set of all your databases, combined.

  • Each database incurs a local storage overhead (even if the database is empty) and a memory overhead, and each table incurs a memory overhead for metadata. The amount of local storage space required per partition varies depending on the size of the blob cache, which can grow and shrink based on the workload.
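The first constraint above amounts to a simple capacity check. The following Python sketch is illustrative only: the function name, database names, and sizes are hypothetical, and the working-set estimates would have to come from your own workload measurements.

```python
def working_set_fits(working_sets_gb, local_storage_gb):
    """Return True if the combined working set of all databases fits in local storage.

    working_sets_gb: mapping of database name -> estimated working-set size in GB
    local_storage_gb: local storage available to the workspace in GB
    """
    total = sum(working_sets_gb.values())
    return total <= local_storage_gb


# Hypothetical estimates, for illustration only.
estimates = {"orders": 120.0, "analytics": 300.0, "logs": 50.0}
print(working_set_fits(estimates, local_storage_gb=512.0))  # True: 470 GB fits in 512 GB
print(working_set_fits(estimates, local_storage_gb=400.0))  # False: 470 GB exceeds 400 GB
```

This check deliberately ignores the per-database and per-table overheads described in the second point, so treat a result close to the limit as a signal to leave more headroom.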

In an unlimited storage database:

  • All columnstore data is stored externally in an object store. Recently accessed columnstore data objects are also cached locally in the compute workspace's persistent cache.

  • All rowstore data is stored locally in compute workspace memory and externally in object storage.

  • Data updates made on the workspace are flushed to the object store asynchronously. Typically, the object store will be no more than one minute behind the latest update on the workspace.

When you run the CREATE DATABASE command on an AWS, Azure, or GCP workspace, SingleStore automatically creates an unlimited storage database. The BACKUP DATABASE and RESTORE DATABASE commands can be used for database backup and restore, respectively. When RESTORE DATABASE is run, data is automatically restored into an unlimited storage database.

Unlimited Storage Tiers

Memory: SingleStore stores data in memory when using Rowstore, when caching data for Columnstore, and for operations that benefit from the high-performance characteristics of system memory.

Persistent Cache: This tier consists of high-performance block storage. It serves Columnstore data and persists Rowstore data. For optimal performance, a SingleStore deployment should be sized so that the working dataset (the data to be queried) fits within the Persistent Cache.

Storage: This is a durable and persistent layer stored within the cloud object storage. Data is regularly pushed to object storage, providing a cool tier of data and allowing for long-term retention beyond the lifetime of a deployment. It also enables features such as Point-in-Time Recovery. The amount of data persisted in the storage tier may exceed the size of the active databases because snapshots are retained for data recovery.
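To make the relationship between the tiers concrete, the following Python sketch models a read that falls through memory, persistent cache, and object storage in that order. It is a toy model under stated assumptions, not SingleStore's implementation: the class, its method names, and the caching policy are all hypothetical, and real tier management is automatic and internal to the product.

```python
class TieredStore:
    """Toy model of a three-tier read path: memory, persistent cache, object storage."""

    def __init__(self, object_storage):
        self.memory = {}                              # fastest tier
        self.persistent_cache = {}                    # local block storage
        self.object_storage = dict(object_storage)    # durable remote tier

    def read(self, key):
        """Return (value, tier), where tier names the tier that served the read."""
        if key in self.memory:
            return self.memory[key], "memory"
        if key in self.persistent_cache:
            value = self.persistent_cache[key]
            self.memory[key] = value
            return value, "persistent_cache"
        value = self.object_storage[key]  # raises KeyError if the object does not exist
        # Keep a local copy so later reads avoid the remote fetch.
        self.persistent_cache[key] = value
        self.memory[key] = value
        return value, "object_storage"

    def evict_from_memory(self, key):
        """Drop a key from memory only; the persistent cache copy is kept."""
        self.memory.pop(key, None)


store = TieredStore({"blob-1": b"columnstore segment"})
print(store.read("blob-1")[1])   # object_storage (first access is remote)
print(store.read("blob-1")[1])   # memory
store.evict_from_memory("blob-1")
print(store.read("blob-1")[1])   # persistent_cache
```

The first read is the slow one, which is why sizing the persistent cache to hold the working dataset matters: once data is cached locally, reads no longer depend on the object store.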

Point-in-Time Recovery (PITR)

Note

This feature is not available in all editions of SingleStore. For more information, see SingleStore Helios Editions.

Note

For the Standard edition, use the SYNC BOTTOMLESS DATABASE command to sync data to bottomless storage before performing a major online operation, such as resizing a cluster or upgrading an application. This guarantees that you can recover to at least the time the sync was performed.

For the Premium edition, you can use either the CREATE MILESTONE command with PITR or the SYNC BOTTOMLESS DATABASE command.

PITR is a user-initiated operation that restores an unlimited storage database to a point in time. It is purely a recovery operation and does not require backups. It relies on the unlimited storage feature and the blobs stored in the object store, and it uses all data for the restore point that has been automatically flushed to the object store.

The PITR window or timeline is the length of the retention period. All points in that timeline are accessible and you can do PITR back and forth within the timeline. For example, today at 9 AM, you can restore back to a point in time two days ago and then restore (roll) forward all the data as it was today at 9 AM, provided all these restore points are within the retention period timeline.
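The window rule above can be sketched as a small check. This Python example is a minimal illustration, assuming a hypothetical seven-day retention period; the function name and the dates are made up, and the actual retention period depends on your edition and configuration.

```python
from datetime import datetime, timedelta


def in_pitr_window(target, now, retention):
    """Return True if `target` lies within the retention window that ends at `now`."""
    return now - retention <= target <= now


now = datetime(2024, 2, 28, 9, 0)      # "today at 9 AM"
retention = timedelta(days=7)          # hypothetical retention period

two_days_ago = now - timedelta(days=2)
print(in_pitr_window(two_days_ago, now, retention))              # True
print(in_pitr_window(now, now, retention))                       # True: rolling forward to 9 AM
print(in_pitr_window(now - timedelta(days=10), now, retention))  # False: outside the window
```

Both the backward restore and the later roll forward succeed here because each target is inside the same window; a target older than the retention period is not reachable.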

When you invoke PITR, you can specify either a timestamp or a milestone. Milestones must be created manually beforehand, but they are not required to use PITR; you can use the timestamp option instead. Milestones are useful when you perform a risky operation, such as an upgrade or schema change, and want to bookmark the point in time right before the operation. If you then have to roll back, you do not have to guess what the timestamp needs to be.
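The choice between the two restore targets can be sketched as a small statement builder. This Python example only assembles SQL text; the helper names, database name, and milestone name are hypothetical, and the clause forms (`CREATE MILESTONE ... FOR`, `ATTACH DATABASE ... AT MILESTONE`, `ATTACH DATABASE ... AT TIME`) are written as an assumption about the command syntax and should be checked against the command reference before use.

```python
from datetime import datetime


def create_milestone_statement(database, milestone):
    """Build a statement that bookmarks the current point in time (assumed syntax)."""
    return f'CREATE MILESTONE "{milestone}" FOR {database};'


def pitr_attach_statement(database, milestone=None, timestamp=None):
    """Build an ATTACH DATABASE statement for exactly one restore target (assumed syntax).

    Names are interpolated without escaping, so pass trusted values only.
    """
    if (milestone is None) == (timestamp is None):
        raise ValueError("specify exactly one of milestone or timestamp")
    if milestone is not None:
        return f'ATTACH DATABASE {database} AT MILESTONE "{milestone}";'
    return f'ATTACH DATABASE {database} AT TIME "{timestamp:%Y-%m-%d %H:%M:%S}";'


# Before a risky operation: bookmark the point in time.
print(create_milestone_statement("db1", "before_upgrade"))
# Roll back to the bookmark, with no need to guess the timestamp.
print(pitr_attach_statement("db1", milestone="before_upgrade"))
# Alternatively, restore to an explicit timestamp.
print(pitr_attach_statement("db1", timestamp=datetime(2024, 2, 26, 9, 0)))
```

Requiring exactly one target mirrors the either/or choice described above. A database generally has to be detached before it can be attached at an earlier point, so a real workflow would include that step as well.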

Attaching (restoring) an unlimited storage database can be faster than restoring an equivalent local storage database. This is because an attach of an unlimited storage database does not copy all data to the workspace, as is the case with the restore of a local storage database. Note that after an unlimited storage database is attached, queries may be slower for some time until remote data is cached locally in the workspace.

To work with PITR, use the following commands:

For a basic walkthrough of creating milestones and restoring a database to a milestone, see Attach an Unlimited Storage Database Using Point-in-Time Recovery (PITR).

A database can be restored to any point in time via the Cloud Portal or the ATTACH DATABASE command. PITR functionality is available in the Premium and Dedicated editions. To access this functionality, select Workspaces and choose your workspace. Then, select Databases and choose your database. Finally, navigate to the Recovery tab.

See Disaster Recovery for data retention and other related information.

Last modified: February 28, 2024