Configuring the Columnstore to Work Effectively
On this page
Warning
Configuration options described in this section may have significant performance impact on the workload.
Configuring Segment Size in Columnstore Tables
In SingleStore, data in a columnstore table are organized into multiple row segments.
-
In general, larger segments compress better than smaller ones.
This leads to less disk usage and faster scan across the table. -
On the other hand, smaller segments could benefit more from segment elimination.
As a result, queries with highly selective filters on the columnstore index run faster with smaller segments.
In SingleStore, the default size of the row segments is controlled by the global variable columnstore_
.columnstore_
has a value of 1024000
, meaning that each segment contains 1024000
rows by default.
Note
The variable columnar_
from previous SingleStore versions was deprecated in favor of columnstore_
.columnar_
exists as an alias to columnstore_
.
A cluster upgraded from a version before 6.102400
for columnstore_
.
Refer to Upgrade to SingleStore 7.
In addition to the global variable, it is possible to overwrite the global setting and set the segment size for a specific columnstore table.
CREATE TABLE t (id INT, SORT KEY (id) WITH (columnstore_segment_rows=100000));SHOW CREATE TABLE t;
+-------+-------------------------------------------------------------------------+
| Table | Create Table |
+-------+-------------------------------------------------------------------------+
| t | CREATE TABLE `t` ( |
| | `id` int(11) DEFAULT NULL, |
| | SORT KEY `id` (`id`) |
| | /*!90619 */ /*!90621 WITH(COLUMNSTORE_SEGMENT_ROWS=100000) */ |
| | ) |
+-------+-------------------------------------------------------------------------+
Now, modify the value of columnstore_
for this columnstore.MODIFY KEY keyName
clause identifies the index whose settings are to be changed (in this case, the columnstore index).
ALTER TABLE t MODIFY KEY id SET (columnstore_segment_rows=20000);SHOW CREATE TABLE t;
+-------+-----------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-----------------------------------------------------------------------------------+
| t | CREATE TABLE `t` ( |
| | `id` int(11) DEFAULT NULL, |
| | SORT KEY `id` (`id`) |
| | /*!90619 */ /*!90621 WITH(COLUMNSTORE_SEGMENT_ROWS=20000) */ |
| | ) |
+-------+-----------------------------------------------------------------------------------+
Configuring the Rowstore-backed Segment Size in Columnstore Tables
In addition to the on-disk segments, each SingleStore columnstore also has an in-memory rowstore-backed segment.
In some cases, inserts and updates will write to the rowstore-backed segment first.background flusher
process periodically compresses those recently inserted rows and creates on-disk segments.
The size of the rowstore-backed segment is controlled by the global variable columnstore_
.background flusher
process starts to create on-disk segments when the amount of data in the rowstore-backed segment exceeds columnstore_
, with a default of 32 MB
.columnstore_
, which is by default 16 MB
at a time.internal_
.
The minimum size of the disk-backed row segment created by insert and load operations is controlled by the engine variable columnstore_
.0.
.OPTIMIZE TABLE FLUSH
is manually run, the minimum segment size can be much smaller.
Considerations about tuning the rowstore-backed segment size include:
-
The rowstore-backed segment is stored in-memory.
Therefore, the table consumes less memory when the rowstore-backed segment is smaller. -
The
background flusher
process can write more rows to disk at once if the rowstore-backed segment is larger, reducing the number of disk write operations during data ingestion.
Similar to columnstore_
, the columnstore_
value can also be configured per-table with the following syntax:
CREATE TABLE t2 (id INT, SORT KEY (id) WITH (columnstore_flush_bytes=4194304, columnstore_segment_rows=100000));ALTER TABLE t2 MODIFY KEY id SET (columnstore_flush_bytes=8388608);
Last modified: May 28, 2024