Configuring Disk Spilling

In SingleStore, the local disk space on each leaf node is typically shared among the database logs, snapshots, plan cache, trace log, blob cache, disk spilling etc. Among all these, the blob cache and disk spilling are the two most significant in terms of space usage. While the blob cache in an unlimited storage deployment is configured to use most of the disk space, disk spilling from query execution is the most dynamic and the amount of spillage can vary widely depending on the queries, data volumes and data characteristics, memory configuration and space availability.

SingleStore's disk manager is built to balance the requirements of both disk spilling and the blob cache for optimal disk space usage and performance. Its default heuristic configuration allows disk spilling to temporarily steal a limited amount of space from the blob cache without any noticeable impact on the blob cache performance.  In case of some special requirements for a workload, it allows users to manually set how much space disk spilling can temporarily steal from the blob cache by setting the global variable spilling_maximum_disk_percent.

Benefits

  • Helps prevent query failure due to memory constraints: When the memory load on a node exceeds a certain threshold and in-memory data structures reach a significant size (100 MB in this context), disk spilling enables the continuation of query execution by moving parts of the process to disk.

  • Handles heavy operations: It is useful when processing operations such as GROUP BY, SORT/ORDER BY, window functions, and hash joins, which are prone to use a significant amount of memory.

  • Manages query operations that exceed the maximum memory: It is particularly relevant when operations are at risk of exceeding the maximum memory limits set for nodes. By spilling to disk, it alleviates the memory load, although at the cost of potentially slower query execution.

Disk Spilling Caveats

Some of the operations that spilling does not currently handle are:

  • Right and full outer joins.

  • Complex queries with dependent operations (may need knobs to enable spilling).

  • Window functions without the partition clause.

Users cannot configure where the data processing can be spilt in memory.

SPILLING_MAXIMUM_DISK_PERCENT

The global variable spilling_maximum_disk_percent controls the percentage of disk space that disk spilling can steal from the blob cache during query execution.

Its default value is -1 which uses a heuristic setting when a query spilling needs to steal space from the blob cache.  It allows spilling to steal from the blob cache as long as the blob cache size stays within 70% of the maximum_blob_cache_size_mb variable value. This default configuration is designed to balance the requirements of disk space for both the blob cache and disk spilling, ensuring optimal performance and space utilization under typical scenarios.

It can be set to any value within the range 0 and 1 (both inclusive).

Setting the spilling_maximum_disk_percent = 0 turns off query spilling support. A query that needs to spill will return an error message: "Spilling disk usage (%f MB) exceeds the limit (%f MB) set through global variable 'spilling_maximum_disk_percent'(%f). Spilling is aborted. New request size: %lu bytes".

When spilling_maximum_disk_percent is set in the range > 0 and <=1 queries that need to spill will keep spilling as long as the spilling size is within the limit of (total disk size - minimal_disk_space) * spilling_maximum_disk_percent. This setting may allow spilling to push the blob cache to a much lower size than the typical 70% of maximum_blob_cache_size_mb as long as the disk usage by spilling does not hit any other existing limits like spilling_minimal_disk_space or minimal_disk_space.

Setting spilling_maximum_disk_percent = 1 is not recommended because it allows spilling to use any free disk space until the usable free space hits spilling_minimal_disk_space.  After that, any new queries that need to spill or running queries that need to spill more will return an out-of-disk error.

It is recommended to leave the spilling_maximum_disk_percent at the default value.

memsql> SHOW GLOBAL VARIABLES LIKE 'spilling%disk%';
+-------------------------------------------------------+-----------+
| Variable_name                                         | Value     |
+-------------------------------------------------------+-----------+ 
| spilling_maximum_disk_percent                         | -1.000000 |
+-------------------------------------------------------+-----------+

Displaying Disk Spilling Related Metrics

Use the command, SHOW STATUS [EXTENDED] to display information about disk spilling metrics that exist at the time when this command is run. For example, it shows the current spilling size with 'Disk_spilling_usage_mb'  and how much blob cache space (in MB) is stolen by disk spilling with 'Disk_spilling_evict_blob_cache_mb_average'.

memsql> SHOW STATUS EXTENDED LIKE 'disk%';
+-------------------------------------------+--------+
| Variable_name                             | Value  |
+-------------------------------------------+--------+
| Disk_spilling_usage_limit_mb              | 86608  |
| Disk_spilling_usage_mb                    | 0      |
| Disk_spilling_evict_blob_cache_mb_average | 0      |
| Disk_spilling_usage_mb_high_water_mark    | 0      |
| Disk_total_size_mb                        | 189008 |
| Disk_available_size_mb                    | 49708  |
| Disk_uncategorized_usage_mb               | 139200 |
+-------------------------------------------+--------+
memsql> SHOW STATUS LIKE '%blob%';
+-----------------------------------------------------------+-------+
| Variable_name                                             | Value |
+-----------------------------------------------------------+-------+
| Blob_cache_max_size_mb_adjusted_for_low_disk              | 86608 |
| Blob_cache_max_size_mb_adjusted_for_low_disk_and_spilling | 86608 |
+-----------------------------------------------------------+-------+

As the name indicates, Blob_cache_max_size_mb_adjusted_for_low_disk is the blob cache maximum size in MB after the configured maximum_blob_cache_size_mb is adjusted on low disk condition.  Blob_cache_max_size_mb_adjusted_for_low_disk_and_spilling is the effective maximum blob cache size after taking both the low disk adjustment and spilling impact into account.

For an explanation of each metric refer to SHOW STATUS EXTENDED

Disk Spilling Errors

Some typical error conditions that may arise are:

  • The disk available size is below the value of spilling_minimal_disk_space.

  • The total spilling size exceeds the disk_spilling_usage_limit_mb value.

  • The columnstore blob cache is low during database replication.

To know about the error conditions and messages refer to Disk Spilling Errors

Last modified: February 23, 2024

Was this article helpful?