Flexible Parallelism

SingleStore supports Flexible Parallelism, which allows multiple cores on the same node to access the same database partition.

With Flexible Parallelism, as database partitions are created they are divided into sub-partitions. As a query runs on a leaf node, multiple cores working on behalf of the query can process different sub-partitions of a partition in parallel.

As an example, if you are currently at one partition per core with Flexible Parallelism, doubling the size of your cluster and then rebalancing will result in two cores for each partition. As a result, a simple query that scans and aggregates all the data in a single partition will now execute more quickly than it did before.

Typically, analytic or hybrid transactional/analytic workloads will gain some benefits from Flexible Parallelism, especially over time as you add more hardware and rebalance clusters. Strictly OLTP workloads will tend to not benefit from enabling Flexible Parallelism.

As you add cores to your SingleStore system, you can reach a point where you have more cores than database partitions on each leaf node. Without Flexible Parallelism, there is a limit of one core accessing each database partition; therefore, the maximum ratio of cores to partitions is 1:1. This can lead to a kind of bottleneck as you can have idle cores on a node where you have queries accessing a single partition.

Notice

When Flexible Parallelism is enabled, all new databases created will use Flexible Parallelism. Existing databases use the maximum of one thread per partition.

Flexible Parallelism is disabled by default in SingleStore DB.

Enabling and Configuring Flexible Parallelism

Flexible Parallelism is enabled via the sub_to_physical_partition_ratio engine variable. Set it to a valid value greater than 0 (a power of 2 up to 64) to enable Flexible Parallelism. This engine variable sets the number of sub-partitions per physical partition that a new database will be created with.

When a database is created with sub_to_physical_partition_ratio set to a valid value greater than 0, it will support Flexible Parallelism for queries over columnstore tables in that database. The maximum query parallelism is sub_to_physical_partition_ratio * number of physical partitions.

Once Flexible Parallelism is enabled, you can configure CPU usage per query on each leaf node.

To control the fraction of the CPU power that will be dedicated to one query on each leaf node, set the query_parallelism_per_leaf_core engine variable.

For example, if you want to use half of the available cores on each leaf node to process one query, then set query_parallelism_per_leaf_core to 0.5.

If you want to limit the parallelism to use a specific number of cores to process each query, rather than just a proportion of the number of cores, then you will need to know how many cores there are on each leaf. This can be determined by reading value of the expected_leaf_core_count variable. This variable is set during installation and should not be changed during normal operation.

The degree of parallelism on each node is calculated as expected_leaf_core_count * query_parallelism_per_leaf_core, rounded to the nearest integer.

Suppose expected_leaf_core_count is 28 and you want to use 10 cores on each leaf node to process a query. Dividing 10 by 28 gives 0.35714. To make sure you get 10 cores processing each query, set query_parallelism_per_leaf_core to .35714.

How to Tell When Flexible Parallelism is Enabled

At the database level, the sub-partition count is available in both the Information Schema distributed_databases and view mv_distributed_databases_status tables.

singlestore> SELECT * FROM information_schema.distributed_databases;
+-------------+---------------+----------------+--------------------+------------+-------------+---------+--------------------+-----------+---------+
| DATABASE_ID | DATABASE_NAME | NUM_PARTITIONS | NUM_SUB_PARTITIONS | CLUSTER_ID | REMOTE_NAME | IS_SYNC | IS_SYNC_DURABILITY | IS_PAUSED | SITE_ID |
+-------------+---------------+----------------+--------------------+------------+-------------+---------+--------------------+-----------+---------+
|           4 | db            |              3 |                 24 |          0 |             |       1 |                  0 |         0 |       1 |
+-------------+---------------+----------------+--------------------+------------+-------------+---------+--------------------+-----------+---------+

In the NUM_SUB_PARTITIONS column, the databases with a non-zero value have Flexible Parallelism enabled.

At the query level, you can tell whether a query is using Flexible Partitioning via Query Plan Operators. When node_degree_of_parallelism is set, the parallelism level is shown in Explain or Profile operations like Gather, GatherMerge, Project, Repartition, Broadcast, and MaterializedCTE where parallelism_level is either partition, subpartition, or segment:

partitions:all parallelism_level:segment
partitions:all parallelism_level:partition
partitions:all parallelism_level:subpartition

A value of partition indicates one thread per database partition (Flexible Parallelism is not enabled).

The values sub_partition and segment both mean Flexible Parallelism is used for the node. The difference between sub_partition and segment is the granularity of work items in the query threads. In the sub_partition case, each thread scans a pre-computed range (a set of sub-partitions). For the segment case, there is not a pre-computed split of work across threads; therefore, as each thread finishes its work, it will try to get one more segment to scan.

Basically, the segment level parallelism allows threads that finish their work earlier to help with other threads (at a segment granularity), while sub-partition level parallelism does not allow that (each thread scans a set of sub-partitions assigned to it during the query).