Managing Columnstore Segments

A columnstore table will have the best performance if the rows in the table are in global sorted order across all the row segments. In reality, maintaining such an order is not feasible in the presence of continuous writes.

Background Merger

SingleStore uses an advanced algorithm that allows it to maintain the order of columnstore tables as close to sorted as possible, while data is being ingested or updated. This process is called a background merger and runs constantly in the background.

The background merger runs in an optimistic manner. If at any point the background merger  tries to move data which is being changed by a concurrent UPDATE or DELETE query, it discards all the work it has done so far and starts over. The background merger  works on a small chunk of data at a time, so if a conflict with an UPDATE or DELETE occurs, a relatively small amount of work is discarded. However, in the presence of a very heavy update workload, the slowdown  can be significant compared to a pessimistic merger, which locks the row segments it is currently processing. A user can manually trigger a pessimistic merger by running the OPTIMIZE TABLE command. The following sections explain how to decide whether running an OPTIMIZE TABLE command is necessary, and how to run it.

Sorted Row Segment Group

SingleStore uses the concept of a sorted row segment group to describe a set of row segments that are sorted together on the columnstore key columns. Row segments form a sorted row segment group if and only if there is an order on the row segments such that for each row segment the smallest row in it is no smaller than the largest row in any row segment before it. Here and below when we say that one row is smaller than another row, we mean that the values of columns of the SORT KEY of that row are smaller than those of the other row.

Within a sorted row segment group there are no row segments with overlapping value ranges for the sort key column(s). Refer to How the Columnstore Works for more information.

A table may have multiple sorted row segment groups. The goal of the background merger is to reorganize the rows among row segments in such a way that the number of sorted row segment groups is as small as possible.

Note

KEY() USING CLUSTERED COLUMNSTORE is a legacy syntax that is equivalent to SORT KEY(). SingleStore recommends using SORT KEY().

To inspect the current state of the sorted row segment groups of a table, named sales, run the SHOW COLUMNAR MERGE STATUS FOR <table_name> command:

SHOW COLUMNAR MERGE STATUS FOR sales;
+----------------------------------------------------------------------------+
| Merger           | State | Plan                    | Progress | Partition  |
+----------------------------------------------------------------------------+
| (Current sales)  | NULL  | 741,16,1                | NULL     |   0        |
| (Current sales)  | NULL  | 782,20                  | NULL     |   1        |
| (Current sales)  | NULL  | 701,40,5                | NULL     |   2        |
| (Current sales)  | NULL  | 326,207,123,37,21,19,17 | NULL     |   3        |
+----------------------------------------------------------------------------+

Observe the first row of the result. According to that row, the slice of the table that is stored on partition 0 has three sorted row segment groups, one consists of 741 row segments, one consists of 16 row segments, and one consists of a single row segment, for a total of 758 row segments. Consider the impact of that number of sorted row segment groups on a simple query like:

SELECT * FROM sales WHERE store_id = 15;

This query filters on the SORT KEY which is store_id in this example. The very first sorted row segment group will have at most one row segment that contains rows with store_id equal to 15, unless rows with store_id = 15 are on the boundary of two row segments, or if there is a large data skew and several row segments consist only of rows with store_id = 15. Similarly, at most one row segment in the second sorted row segment group contains relevant rows, and the only segment of the third sorted row segment group might also contain relevant rows. This way, only three row segments out of the total of 758 will be opened and materialized. While the query in this example is very simple, similar reasoning works for significantly more complex queries.

Optimize Sorted Row Segment Groups

Now take a look at the sorted row segment groups on partition 3, which is significantly less optimized than the other partitions. A SELECT query similar to the one above will materialize eight row segments. If the background merger is enabled, and no workload is running concurrently, the background merger would optimize this partition quickly, likely within several seconds. However, in the presence of a heavy workload, the optimistic background merger might fall behind. In this case, it may be reasonable to manually trigger a pessimistic merger by running the following command:

OPTIMIZE TABLE sales

By running SHOW COLUMNAR MERGE STATUS while OPTIMIZE TABLE is running, you may see the manual merger in action:

SHOW COLUMNAR MERGE STATUS FOR sales;
+--------------------------------------------------------------------------------+
| Merger           | State    | Plan                    | Progress | Partition   |
+--------------------------------------------------------------------------------+
| (Current sales)  | NULL     | 741,16,1                | NULL     |         0   |
| (Current sales)  | NULL     | 782,20                  | NULL     |         1   |
| (Current sales)  | NULL     | 701,40,5                | NULL     |         2   |
| (Current sales)  | NULL     | 326,207,123,37,21,19,17 | NULL     |         3   |
| Manual Merger    | Working  | 326+207+123+37+21+19+17 | 53.12%   |         3   |
+--------------------------------------------------------------------------------+

The last row indicates that there is a manual merger running on partition 3 and that at this time 53.12% of the work is complete.

When the merger is done, the table has fewer sorted row segment groups in partition 3:

SHOW COLUMNAR MERGE STATUS FOR sales;
+------------------------------------------------------------+
| Merger           | State | Plan     | Progress | Partition |
+------------------------------------------------------------+
| (Current sales)  | NULL  | 741,16,1 | NULL     |         0 |
| (Current sales)  | NULL  | 782,20   | NULL     |         1 |
| (Current sales)  | NULL  | 701,40,5 | NULL     |         2 |
| (Current sales)  | NULL  | 730,20   | NULL     |         3 |
+------------------------------------------------------------+

During the optimization process, no partitions were merged into a single sorted row segment group as can be seen in the output shown above. Both optimistic and pessimistic mergers use an advanced algorithm which is optimized to do small chunks of work in the presence of concurrent writes and maintain data in a few sorted row segment groups, instead of attempting to merge all the data into a single sorted row segment group.

In cases when it is acceptable to sacrifice some time on data ingestion to achieve even higher SELECT performance, it is possible to run a manual command that merges data on each partition into a single sorted row segment group. The command OPTIMIZE TABLE <table_name> FULL will merge the data on each partition into a single sorted row segment group, as shown below.

OPTIMIZE TABLE sales FULL;
SHOW COLUMNAR MERGE STATUS FOR groups;
+---------------------------------------------------------+
| Merger           | State | Plan | Progress | Partition  |
+---------------------------------------------------------+
| (Current sales)  | NULL  | 758  | NULL     |         0  |
| (Current sales)  | NULL  | 802  | NULL     |         1  |
| (Current sales)  | NULL  | 746  | NULL     |         2  |
| (Current sales)  | NULL  | 750  | NULL     |         3  |
+---------------------------------------------------------+

When the table is in this state, any highly selective SELECT query will materialize one row segment per partition.

Important

Unlike OPTIMIZE TABLE <table_name>, which takes the amount of time proportional to the size of recently loaded data, OPTIMIZE TABLE <table_name> FULL takes an amount of time proportional to the order of magnitude of the size of the entire table, unless data in that table is already sorted.

Inserts, Deletes, and Garbage Collection

When inserting a small number of rows into the columnstore table, an in-memory rowstore-backed segment is used to store the rows. As this rowstore-backed segment fills, the background flusher periodically flushes these rows to disk. A rowstore-backed segment can be flushed to disk manually by running OPTIMIZE TABLE <table_name> FLUSH.

OPTIMIZE TABLE t FLUSH;

Deletes in columnstores are marked as deleted but otherwise left in place (refer to Writing Columnstore Data). However, when a significant number of rows are deleted from a segment, the background merger rewrites those segments if the number of deleted rows is at least 1/8th of the desired segment size, and the sorted run isn't going to be merged otherwise. This rewrite happens per sorted run so it does not break the order of the run.

MV_COLUMNSTORE_MERGE_STATUS shows the details about the merger processes.

Unneeded columnstore segments stored in memory are eliminated in a process called garbage collection. A buffer manager maintains unused memory blocks for future use until no query or operation is using the memory anymore.

The garbage collection process automatically runs in the background, so it's normally not necessary to execute the TRIGGER GC [FLUSH] command which manually activates the garbage collector. However, if memory-related errors or node failures are being generated, executing TRIGGER GC [FLUSH] might be appropriate. See TRIGGER GC for more information.

Last modified: January 27, 2025

Was this article helpful?