Benefits of Unlimited Storage
A major benefit of unlimited storage is that it provides elastic scalability for storage in SingleStore self-hosted and cloud clusters.
Even if you do not need the additional storage, this feature has several other benefits, including:
It enables point-in-time recovery (PITR) in editions that support PITR.
It can improve performance when rebalancing.
With regular storage, rebalancing requires moving the data itself synchronously. With unlimited storage, only the metadata (which describes the names and existence of files, not their contents) is moved synchronously. Expanding your cluster therefore becomes a size-of-metadata operation rather than a size-of-data operation. This is valuable if you have a large amount of data per leaf.
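As a rough illustration of the size-of-metadata versus size-of-data difference, the following back-of-the-envelope sketch compares the bytes that would be copied synchronously in each case. The function name, the average file size, and the metadata size per file are hypothetical values chosen for the example, not SingleStore figures. The sketch also ignores recently written data that may not yet be in object storage.

```python
# Back-of-the-envelope estimate only. All names and numbers below are
# hypothetical; they are not SingleStore internals or measured values.

TIB = 1024 ** 4
MIB = 1024 ** 2


def sync_bytes_to_move(data_bytes, avg_file_bytes, metadata_bytes_per_file,
                       unlimited_storage):
    """Estimate the bytes copied synchronously when data is rebalanced."""
    if not unlimited_storage:
        # Regular storage: the file contents themselves are moved.
        return data_bytes
    # Unlimited storage: only per-file metadata is moved synchronously.
    file_count = -(-data_bytes // avg_file_bytes)  # ceiling division
    return file_count * metadata_bytes_per_file


data_per_leaf = 2 * TIB  # assumed amount of data being rebalanced
regular = sync_bytes_to_move(data_per_leaf, 64 * MIB, 512, unlimited_storage=False)
unlimited = sync_bytes_to_move(data_per_leaf, 64 * MIB, 512, unlimited_storage=True)

print(f"regular storage:   {regular / TIB:.0f} TiB moved synchronously")
print(f"unlimited storage: {unlimited / MIB:.0f} MiB moved synchronously")
```

Under these assumed values, the synchronous transfer shrinks from the full 2 TiB of data to 16 MiB of metadata. The exact ratio depends on the real file and metadata sizes.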
It frees up usable local disk space, since you don't need to store the full high-availability (HA) copy locally.
You need HA row replicas for durability, but only for recently changed data. Unlimited storage therefore gives you more usable local disk space while keeping new writes durable. If a leaf fails, HA replica data is lazily populated from object storage onto other nodes.
It enables excellent performance when your working set fits in the local disk.
Working set size is entirely application-dependent. There is no precise definition of a working set, but a good working definition is the smallest subset of the data for which your workload runs no more than 10% slower than it would if all the data were in the cache. Depending on your application, your working set might be 100% of the data or only 30%. If your working set does not fit in the cache, performance suffers, as in many computing systems. For example, a virtual memory system thrashes when the working set does not fit in RAM. The recommended solution is to provision local storage devices large enough to hold your working set. Once the working set is on local storage, you can expect excellent performance.
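The working definition above can be turned into a simple estimate if you have benchmarked your workload at several cache sizes. The sketch below is one possible way to do that. The function name and the measurements are made up for illustration, and it assumes you have a run with all data cached to use as the baseline.

```python
# Illustrative sketch: the function and the sample measurements are
# hypothetical, not part of any SingleStore tooling.

def working_set_fraction(measurements, tolerance=0.10):
    """Return the smallest cached fraction whose runtime is within
    `tolerance` of the runtime measured with all data cached.

    `measurements` maps the cached fraction of the data (0 < f <= 1.0)
    to the workload runtime in seconds, and must include the key 1.0.
    """
    baseline = measurements[1.0]
    limit = baseline * (1 + tolerance)
    qualifying = [f for f, runtime in measurements.items() if runtime <= limit]
    return min(qualifying)


# Made-up benchmark results: fraction of data cached -> runtime in seconds.
runs = {0.1: 95.0, 0.2: 40.0, 0.3: 10.8, 0.5: 10.3, 1.0: 10.0}
fraction = working_set_fraction(runs)
print(f"estimated working set: {fraction:.0%} of the data")
```

With these sample numbers the estimate is 30% of the data, because 10.8 seconds is within 10% of the 10.0 second baseline while 40.0 seconds is not. The estimate is only as fine-grained as the cache sizes you actually measured.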
It gives additional fault tolerance (a form of continuous online backup): if you lose a cluster altogether, the database data still exists on object storage, current except for roughly the last two minutes of data.
This does not protect against all faults. For example, accidentally dropping a database will drop the data. Some backups may still be needed, but potentially fewer, which can also save backup storage space. Moreover, because PITR is available, you may no longer need incremental backups, saving time, effort, and space.
Last modified: February 9, 2024