Flexible Parallelism

As cores are added to a SingleStore system, it is possible to reach a point where there are more cores than database partitions on each leaf node. Without FP, there is a limit of one core accessing each database partition and the maximum ratio of cores to partitions is 1:1. With FP, if additional cores are added to a database, multiple cores can be used to process a single partition. Thus, queries that process large amounts of data execute more quickly.

Typically, analytic or hybrid transactional/analytic workloads gain benefits from FP, especially over time as you add more hardware and rebalance clusters. Strictly OLTP workloads typically do not benefit from FP.

Types of Flexible Parallelism

There are two types of FP in SingleStore: sub-partition parallelism and segment parallelism. Both types of FP enable multiple cores on a leaf to process different portions of a partition in parallel.

Sub-Partition Parallelism

Sub-partition parallelism is available on databases with sub-partitions. During query execution, the query threads on a leaf node are each assigned a set of sub-partitions, and the query threads scan their assigned sub-partitions in parallel. A common configuration is to specify one query thread per core, and in that configuration, each core scans a set of sub-partitions.

Sub-partitions are divisions of database partitions. The engine variable sub_to_physical_partition_ratio variable controls the number of sub-partitions that are created per physical partition. When sub_to_physical_partition_ratio is set to a valid value greater than 0 (a power of 2 up to 64) before a database is created, sub-partitions are created and sub-partition parallelism is available to queries on columnstore tables in that database.

Sub-partitions must be created at database creation time and cannot be created on existing databases. A database created without sub-partitions cannot use sub-partition parallelism.

To add sub-partitions to an existing database, create a database with sub-partitions and copy the tables from the original database to the new one. One way to do this is with CREATE TABLE LIKE and INSERT… SELECT.

The maximum sub-partition parallelism is sub_to_physical_partition_ratio * number of physical partitions.

Segment Parallelism

For certain query shapes, the engine processes segments of columnstore tables in parallel. This functionality is called segment parallelism. In segment parallelism, there is no pre-computed split of work across threads; instead, when a thread finishes its work, that thread may be assigned to another segment to scan.

Segment parallelism is controlled by the query_parallelism_per_leaf_core engine variable as described in the following sections.

Sub-Partition vs. Segment Parallelism

Segment parallelism differs from sub-partition parallelism in the granularity of the work done by the query threads and in the queries to which parallelism may be applied.

In sub-partition parallelism, each query thread scans a pre-computed set of sub-partitions. Whereas, in segment parallelism, when a query thread finishes its work, that thread may be assigned to another segment to scan. Thus, segment parallelism allows threads that finish their work earlier to help other threads (at a segment granularity), while sub-partition level parallelism does not allow that behavior.

Segment parallelism is preferred over sub-partition level parallelism, when possible, but segment parallelism is only applicable to simple query shapes that do not utilize the shard key.