Flexible Parallelism
SingleStoreDB supports Flexible Parallelism, which allows multiple cores on the same node to access the same database partition. To use Flexible Parallelism, it must be enabled prior to database creation.
With Flexible Parallelism, as database partitions are created they are divided into sub-partitions. As a query runs on a leaf node, multiple cores working on behalf of the query can process different sub-partitions of a partition in parallel.
As an example, if you are currently at one partition per core with Flexible Parallelism, doubling the size of your cluster and then rebalancing will result in two cores for each partition. As a result, a simple query that scans and aggregates all the data in a single partition will now execute more quickly than it did before.
Typically, analytic or hybrid transactional/analytic workloads will gain some benefits from Flexible Parallelism, especially over time as you add more hardware and rebalance clusters. Strictly OLTP workloads will tend to not benefit from enabling Flexible Parallelism.
As you add cores to your SingleStore system, you can reach a point where there are more cores than database partitions on each leaf node. Without Flexible Parallelism, only one core can access each database partition, so the maximum ratio of cores to partitions a query can use is 1:1. This can create a bottleneck: cores sit idle on a node while queries access a single partition.
Note
When Flexible Parallelism is enabled, all new databases created will use Flexible Parallelism. Existing databases use the maximum of one thread per partition. You cannot enable or disable Flexible Parallelism on existing databases.
Flexible Parallelism is disabled by default in SingleStoreDB version 7.8 and earlier.
Enabling and Configuring Flexible Parallelism
You cannot enable or disable Flexible Parallelism on existing databases. Making a database capable of supporting Flexible Parallelism can only be done at the time the database is created. If you want to make Flexible Parallelism available for an existing database, create a new database with Flexible Parallelism enabled and copy the tables from the original database to the new one. One way to do this is with CREATE TABLE LIKE and INSERT…SELECT.
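As a sketch, assuming an existing table t in a database olddb and a new database newdb created with Flexible Parallelism enabled (all three names are hypothetical), the copy might look like:

```sql
-- Recreate the table's schema in the new database,
-- then copy the rows across.
USE newdb;
CREATE TABLE t LIKE olddb.t;
INSERT INTO t SELECT * FROM olddb.t;
```

Repeat for each table you want to migrate, then point your application at the new database.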
Flexible Parallelism is enabled and configured via two engine variables. The sub_to_physical_partition_ratio engine variable enables Flexible Parallelism, and query_parallelism_per_leaf_core controls whether queries use Flexible Parallelism or not.
Set sub_to_physical_partition_ratio to a valid value greater than 0 (a power of 2 up to 64) to enable Flexible Parallelism. This engine variable sets the number of sub-partitions per physical partition that a new database will be created with.
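A minimal sketch of enabling it before creating a database (the variable name comes from this document; the ratio of 8 is just an example value from the allowed range):

```sql
-- Enable Flexible Parallelism for databases created from now on:
-- each new physical partition is divided into 8 sub-partitions.
SET GLOBAL sub_to_physical_partition_ratio = 8;
```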
When a database is created with sub_to_physical_partition_ratio set to a valid value greater than 0, it will support Flexible Parallelism for queries over columnstore tables in that database. The maximum query parallelism is sub_to_physical_partition_ratio * number of physical partitions.
Once Flexible Parallelism is enabled, you can configure CPU usage per query on each leaf node.
To control the fraction of the CPU power that will be dedicated to one query on each leaf node, set the query_parallelism_per_leaf_core engine variable on the aggregator.
For example, if you want to use half of the available cores on each leaf node to process one query, then set query_parallelism_per_leaf_core to 0.5.
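That example, as a sketch in SQL (run on the aggregator, per the text above):

```sql
-- Dedicate half of the cores on each leaf node to each query.
SET GLOBAL query_parallelism_per_leaf_core = 0.5;
```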
If you want to limit the parallelism to a specific number of cores per query, rather than a proportion of the cores, you need to know how many cores there are on each leaf. This can be determined by reading the value of the expected_leaf_core_count engine variable. This variable is set during installation and should not be changed during normal operation.
The degree of parallelism on each node is calculated as expected_leaf_core_count * query_parallelism_per_leaf_core, rounded to the nearest integer.
Suppose expected_leaf_core_count is 28 and you want to use 10 cores on each leaf node to process a query. Dividing 10 by 28 gives 0.35714. To make sure you get 10 cores processing each query, set query_parallelism_per_leaf_core to 0.35714.
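The calculation above can be checked and applied directly in SQL (the 10-core target is the example's assumption, not a recommendation):

```sql
-- Confirm the per-leaf core count recorded at install time.
SELECT @@expected_leaf_core_count;

-- Target 10 of 28 cores per query: 10 / 28 = 0.35714 (to 5 places).
-- ROUND(28 * 0.35714) = 10, so each query gets 10 threads per leaf.
SET GLOBAL query_parallelism_per_leaf_core = 0.35714;
```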
How to Tell When Flexible Parallelism is Enabled
At the database level, the sub-partition count is available in both the information_schema distributed_databases and mv_distributed_databases_status views.
singlestore> SELECT * FROM information_schema.distributed_databases;
+-------------+---------------+----------------+--------------------+------------+-------------+---------+--------------------+-----------+---------+
| DATABASE_ID | DATABASE_NAME | NUM_PARTITIONS | NUM_SUB_PARTITIONS | CLUSTER_ID | REMOTE_NAME | IS_SYNC | IS_SYNC_DURABILITY | IS_PAUSED | SITE_ID |
+-------------+---------------+----------------+--------------------+------------+-------------+---------+--------------------+-----------+---------+
|           4 | db            |              3 |                 24 |          0 |             |       1 |                  0 |         0 |       1 |
+-------------+---------------+----------------+--------------------+------------+-------------+---------+--------------------+-----------+---------+
In the NUM_SUB_PARTITIONS column, the databases with a non-zero value have Flexible Parallelism enabled.
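To narrow the output to just those databases, a small variation on the query above:

```sql
-- List only the databases that have Flexible Parallelism enabled.
SELECT DATABASE_NAME, NUM_PARTITIONS, NUM_SUB_PARTITIONS
FROM information_schema.distributed_databases
WHERE NUM_SUB_PARTITIONS > 0;
```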
At the query level, you can tell whether a query is using Flexible Parallelism via Query Plan Operators. When node_degree_of_parallelism is set, the parallelism level is shown in Explain or Profile output for operators like Gather, GatherMerge, Project, Repartition, Broadcast, and MaterializedCTE, where parallelism_level is either partition, sub_partition, or segment:
partitions:all parallelism_level:segment
partitions:all parallelism_level:partition
partitions:all parallelism_level:sub_partition
A value of partition indicates one thread per database partition (Flexible Parallelism is not enabled).
The values sub_partition and segment both mean Flexible Parallelism is used on the node. The difference between sub_partition and segment is the granularity of the work items handed to the query threads. In the sub_partition case, each thread scans a pre-computed range (a set of sub-partitions). In the segment case, there is no pre-computed split of work across threads; as each thread finishes its work, it tries to get one more segment to scan.
In other words, segment-level parallelism lets threads that finish early help the remaining threads (at segment granularity), while sub-partition-level parallelism does not (each thread scans the set of sub-partitions assigned to it at the start of the query).
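To see which level a particular query uses, a sketch (assuming a columnstore table named events in a Flexible Parallelism-enabled database; the table name is hypothetical):

```sql
-- Operator lines in the output include parallelism_level,
-- e.g. "partitions:all parallelism_level:segment".
EXPLAIN SELECT COUNT(*) FROM events;
```

Running PROFILE instead of EXPLAIN shows the same operators annotated with runtime statistics.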