Local and Unlimited Database Storage Concepts

You can store data locally in a SingleStore cluster or remotely in an object store. When you create a database, you specify where to store the data. Unlimited storage databases are available only when you deploy on one of the supported cloud platforms namely, AWS S3, Azure Storage and Google Cloud Storage.

Local Storage Databases

Local storage databases are stored in your SingleStore cluster. To create a local storage database, use the CREATE DATABASE command.

To backup or restore a local storage database, use the BACKUP DATABASE and RESTORE DATABASE commands, respectively.

Unlimited Storage Databases

Note

This feature is not available in all editions of SingleStore. For more information, see SingleStore Editions.

Unlimited storage databases (also known as "bottomless" databases) are kept on remote object storage. The use of remote storage separates where data is stored (in an object store external to the SingleStore cluster) from where the data is processed (in a SingleStore cluster). Because unlimited storage databases are stored remotely, their size is not limited by the size of local cluster storage, but rather only by available external object storage. On public cloud object stores, this is for all practical purposes unlimited.

The cache size is set by the maximum_blob_cache_size_mb variable.

The blobs for high availability (HA) are kept in the blob cache. The size configured for the maximum_blob_cache_size_mb includes the local storage used for HA. This is done so that on any failover the blob cache is at least partially warmed up and ready to run queries. The hottest 40% of blobs on master partitions are stored on replica partitions in a steady state.

For unlimited storage databases, SingleStore caches data from remote storage on local disks or SSDs. The cache item replacement policy is a modified LRU strategy called LRU2, wherein, one large scan does not flush the whole cache if the scanned data is bigger than the cache. LRU2 is a special case of the well-known LRU-K strategy and is commonly used in the DB industry. Information is retained to indicate if objects are frequently-accessed. This reduces the chance that a single large query will flush frequently accessed data from the cache.

The views mv_bottomless_status_extendedmv_cached_blobs and mv_blob_cache_events can be used to monitor the cache behavior.

In an unlimited storage database:

  • All columnstore data is stored externally in an object store. Recently accessed columnstore data objects are also cached locally on the cluster's local storage.

  • Columnstore data can be committed locally, so you do not have to wait for it to get uploaded, which helps keep the latency as low as possible.

  • All rowstore data is stored locally in cluster memory and externally in an object store.

  • Data updates made on the cluster are flushed to the object store asynchronously. Typically, the object store will be no more than one minute behind the latest update on the cluster. Data is imported into an on-disk rowstore, and then asynchronously exported into unlimited storage on the cloud. This is the key to the high performance as data is ingested into the rowstore, "cold" data is put into long term online storage, keeping the "hot" data on disk where it is faster to access.

  • Data is compressed (lz4) upon insertion into remote storage and decompressed upon retrieval. This seamless process can help reduce network bandwidth and storage costs. The compression ratio is highly data-dependent. For rowstore data, it can be as much as 20x. For columnstores, the data is already compressed, but lz4 can compress it further, often in the range of 20% to up to 2x.

  • Data is always stored in an object store but may only have a subset in local storage in your cluster. Querying data that is in object store but not in local storage is comparatively slower the first time as it has to bring the data into the local storage.  Hence it is useful to know how much data and what data is in the cache. 

  • When you resume a cluster the data does not automatically fault back into the local storage. You will have to run queries to "warm the cache" on a resume.

Note

You cannot use the REPLICATE DATABASE command to replicate an unlimited storage database. For an alternative, see Replicate an Unlimited Storage Database.

Disk Cache with Unlimited Storage

To ensure consistent query performance when querying through a hash index, irrespective of when the query was last accessed, the OPTIMIZE TABLE ... WARM BLOB CACHE  command can be used to explicitly warm the disk cache with the index blobs for specific tables. Warming the cache means that the blobs that make up the index will be fetched from remote object storage into the local disk cache, so future accesses can be faster.

Local Disk Space Recommendations for Unlimited Databases

There’s no one-size-fits-all recommendation because every application has different access and locality-of-reference patterns. For the best performance, provide enough local disk to hold the entire database. In addition to allowing you to have more data than will fit locally on the cluster, the remote object storage is there to provide additional benefits including continuous backup, history tracking for PITR, and so on. Providing enough disk to hold the whole database locally is especially important if all parts of the database have relatively similar access frequency, or if you do a lot of random seeking into the database without a lot of locality of reference.

If you have a very large database and you know that you have a high locality of reference, then having the local storage just, say, 20% larger than your working set size can be a good choice. For example, if you have a 10TB database (of actual compressed data) that has 3 years of data, but you almost never touch data older than one year, then having 3.33TB of local disk plus 20% more as buffer, or about 4TB of local storage, should be adequate.

To monitor the rate of blob cache fetch activity, you can query information about fetch events in the mv_blob_cache_events information schema table.

You can also query mv_bottomless_status_extended.blob_cache_download_mb_per_sec which tells you how much data is being downloaded from the blob cache per second. If these seem high given the size of your database and scale of your hardware, consider increasing the local cache size.

Use the following query to find the total number of GB for an entire database:

SELECT SUM(mv_columnstore_files.size)/(1024*1024*1024) AS dataSizeGB
FROM mv_columnstore_files
WHERE mv_columnstore_files.database_name = 'database_name';

Use the following query to find the amount of data in GB that is cached in a local disk in a cluster:

SELECT COUNT(*) AS numSegments,SUM(mv_cached_blobs.size)/(1024*1024*1024) AS blobdataSizeGB
FROM mv_cached_blobs
WHERE mv_cached_blobs.database_name = 's2_dataset_tpch'
AND type = 'primary';

You have to run the above two queries against the information_schema database.  These queries are per database but you can roll it up by cluster if you remove the database name filters.  Also, these files are after compression (not raw data).

Point-in-Time Recovery (PITR)

Note

This feature is not available in all editions of SingleStore. For more information, see SingleStore Editions.

PITR is a user-initiated operation that restores a unlimited storage database to a point in time. It is purely a recovery operation and does not require backups. It uses the unlimited storage feature and leverages the blobs stored in object store. It uses all data for the restore point that has been automatically flushed to the object store.

The PITR window or timeline is the length of the retention period. All points in that timeline are accessible and you can do PITR back and forth within the timeline. For example, today at 9 AM, you can restore back to a point in time two days ago and then restore (roll) forward all the data as it was today at 9 AM, provided all these restore points are within the retention period timeline.

For unlimited storage databases in SingleStore, garbage collection cannot occur at a given version until all partitions can recover from that version. In this case, garbage collection refers to data that is deleted from the remote object store beyond the retention period.

For SingleStore, you control the PITR timeline by setting the bottomless_gc_retention_period_minutes variable.

When you invoke PITR, you can specify either a timestamp or a milestone. Creating milestones is a manual operation that you must do beforehand but they are not required to use PITR.  You can use the timestamp option instead.  Milestones are useful for cases where you do a risky operation, like an upgrade or schema change, and you want to bookmark the point in time right before you do the operation in case you have to roll it back and you do not want to guess what the timestamp needs to be.

Attaching (restoring) an unlimited storage database can be faster than restoring an equivalent local storage database. This is because an attach of an unlimited storage database does not copy all data to the cluster, as is the case with the restore of a local storage database. Note that after an unlimited storage database is attached, queries may be slower for some time until remote data is cached locally in the cluster.

Prior to working with PITR, the enable_bottomless engine variable must be set to ON (the default value).

To work with PITR, use the following commands:

For creating milestones and restoring a database to a milestone, see Attach an Unlimited Storage Database Using Point-in-Time Recovery (PITR).

Last modified: April 24, 2024

Was this article helpful?