Flexible Parallelism

SingleStore supports Flexible Parallelism, which allows multiple cores on the same node to access the same database partition. To use Flexible Parallelism, it must be enabled prior to database creation.

With Flexible Parallelism, as database partitions are created, they are divided into sub-partitions. As a query runs on a leaf node, multiple cores working on behalf of the query can process different sub-partitions of a partition in parallel.

As an example, if you are currently at one partition per core with Flexible Parallelism, doubling the size of your cluster and then rebalancing will result in two cores for each partition. As a result, a simple query that scans and aggregates all the data in a single partition will now execute more quickly than it did before.

Typically, analytic or hybrid transactional/analytic workloads benefit from Flexible Parallelism, especially over time as you add more hardware and rebalance clusters. Strictly OLTP workloads tend not to benefit from enabling Flexible Parallelism.

As you add cores to your SingleStore system, you can reach a point where you have more cores than database partitions on each leaf node. Without Flexible Parallelism, at most one core can access each database partition, so the maximum ratio of cores to partitions is 1:1. This can create a bottleneck: cores sit idle on a node while queries access a single partition.

Note

When Flexible Parallelism is enabled, all newly created databases will use Flexible Parallelism. Existing databases continue to use at most one thread per partition. You cannot enable or disable Flexible Parallelism on existing databases.

Note

SingleStore recommends lowering parallelism for high-throughput search workloads by setting query_parallelism_per_leaf_core = 0.01. This setting limits parallelism to one thread per query per leaf node.

The default setting, query_parallelism_per_leaf_core = 1, indicates a parallelism of one thread per core for each query. With this setting, all cores are used for each query.

Flexible Parallelism is disabled by default in SingleStore version 7.8 and earlier.

Enabling and Configuring Flexible Parallelism

You cannot enable or disable Flexible Parallelism on existing databases; it can only be enabled at the time a database is created. To make Flexible Parallelism available for an existing database, create a new database with Flexible Parallelism enabled and copy the tables from the original database into the new one. One way to do this is with CREATE TABLE LIKE and INSERT…SELECT.
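As a sketch, assuming an existing database olddb with a table t, and a new database newdb created after Flexible Parallelism was enabled (all names here are illustrative), the copy could look like:

```sql
-- Copy the table definition into the Flexible Parallelism-enabled database:
CREATE TABLE newdb.t LIKE olddb.t;

-- Copy the data:
INSERT INTO newdb.t SELECT * FROM olddb.t;
```

Repeat for each table in the original database.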

Flexible Parallelism is enabled and configured via two engine variables: sub_to_physical_partition_ratio enables Flexible Parallelism, and query_parallelism_per_leaf_core controls whether queries use Flexible Parallelism.

Set sub_to_physical_partition_ratio to a valid value greater than 0 (a power of 2 up to 64) to enable Flexible Parallelism. This engine variable sets the number of sub-partitions per physical partition that a new database will be created with.

When a database is created with sub_to_physical_partition_ratio set to a valid value greater than 0, it will support Flexible Parallelism for queries over columnstore tables in that database. The maximum query parallelism is sub_to_physical_partition_ratio * number of physical partitions.
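For example, to create a database whose partitions each have eight sub-partitions (the ratio 8, the database name, and the partition count below are illustrative):

```sql
-- Enable Flexible Parallelism for databases created from this point on:
SET GLOBAL sub_to_physical_partition_ratio = 8;

-- A database created now supports Flexible Parallelism for columnstore queries.
-- With 3 physical partitions, the maximum query parallelism is 8 * 3 = 24.
CREATE DATABASE db PARTITIONS 3;
```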

Once Flexible Parallelism is enabled, you can configure CPU usage per query on each leaf node.

To control the fraction of the CPU power that will be dedicated to one query on each leaf node, set the query_parallelism_per_leaf_core engine variable on the aggregator.

For example, if you want to use half of the available cores on each leaf node to process one query, then set query_parallelism_per_leaf_core to 0.5.

If you want to limit the parallelism to use a specific number of cores to process each query, rather than a proportion of the number of cores, you will need to know how many cores there are on each leaf. This can be determined by reading the value of the expected_leaf_core_count variable. This variable is set during installation and should not be changed during normal operation.

The degree of parallelism on each node is calculated as expected_leaf_core_count * query_parallelism_per_leaf_core, rounded to the nearest integer.

Suppose expected_leaf_core_count is 28 and you want to use 10 cores on each leaf node to process a query. Dividing 10 by 28 gives 0.35714. To make sure you get 10 cores processing each query, set query_parallelism_per_leaf_core to 0.35714.
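Continuing this example, the setting would be applied on the aggregator (the arithmetic in the comment assumes expected_leaf_core_count is 28):

```sql
-- Target: 10 of 28 cores per query on each leaf node.
-- Degree of parallelism = round(28 * 0.35714) = 10 threads per query.
SET GLOBAL query_parallelism_per_leaf_core = 0.35714;
```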

How to Tell When Flexible Parallelism is Enabled

At the database level, the sub-partition count is available in both the information_schema.distributed_databases table and the mv_distributed_databases_status view.

SELECT * FROM information_schema.distributed_databases;
+-------------+---------------+----------------+--------------------+------------+-------------+---------+--------------------+-----------+---------+
| DATABASE_ID | DATABASE_NAME | NUM_PARTITIONS | NUM_SUB_PARTITIONS | CLUSTER_ID | REMOTE_NAME | IS_SYNC | IS_SYNC_DURABILITY | IS_PAUSED | SITE_ID |
+-------------+---------------+----------------+--------------------+------------+-------------+---------+--------------------+-----------+---------+
|           4 | db            |              3 |                 24 |          0 |             |       1 |                  0 |         0 |       1 |
+-------------+---------------+----------------+--------------------+------------+-------------+---------+--------------------+-----------+---------+

In the NUM_SUB_PARTITIONS column, the databases with a non-zero value have Flexible Parallelism enabled.
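For example, to list only the databases that have Flexible Parallelism enabled, you can filter on that column:

```sql
SELECT database_name, num_partitions, num_sub_partitions
FROM information_schema.distributed_databases
WHERE num_sub_partitions > 0;
```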

At the query level, you can tell whether a query is using Flexible Parallelism via Query Plan Operators. When node_degree_of_parallelism is set, the parallelism level is shown in EXPLAIN or PROFILE output on operators like Gather, GatherMerge, Project, Repartition, Broadcast, and MaterializedCTE, where parallelism_level is either partition, subpartition, or segment:

partitions:all parallelism_level:segment
partitions:all parallelism_level:partition
partitions:all parallelism_level:subpartition

A value of partition indicates one thread per database partition (Flexible Parallelism is not enabled). When query_parallelism_per_leaf_core is not 0 and the query does not support Flexible Parallelism, the value of parallelism_level in the query profile will be partition. One example where this occurs is a query on rowstore tables, but there can be other cases as well.

The values subpartition and segment both mean Flexible Parallelism is used on the node. The difference between subpartition and segment is the granularity of work items across the query threads. In the subpartition case, each thread scans a pre-computed range (a set of sub-partitions). In the segment case, there is no pre-computed split of work across threads; as each thread finishes its work, it tries to get one more segment to scan.

In short, segment-level parallelism allows threads that finish their work early to take on work from other threads (at segment granularity), while sub-partition-level parallelism does not: each thread scans the set of sub-partitions assigned to it at the start of the query.

parallelism_level can be segment in a query profile even for a database created while sub_to_physical_partition_ratio was 0. This means the query is executed with segment-level Flexible Parallelism: the engine starts the number of threads specified by query_parallelism_per_leaf_core, and all the threads pull from a shared queue of segments to scan, one segment at a time. Segment-level Flexible Parallelism is only supported for a small subset of query shapes. To disable segment-level Flexible Parallelism, set query_parallelism_per_leaf_core to 0.

If parallelism_level is not shown in the explain or profile, it means that query_parallelism_per_leaf_core is set to 0. In those cases, the query is always executed with a parallelism_level of partition.
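One way to inspect this is to profile a query and look for parallelism_level in the plan output; a sketch, assuming a columnstore table t in a Flexible Parallelism-enabled database:

```sql
-- Run the query with profiling, then display the annotated plan:
PROFILE SELECT COUNT(*) FROM t;
SHOW PROFILE;  -- look for parallelism_level:subpartition (or segment)
```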

Sub-segment Elimination

Sub-segment elimination improves performance by eliminating data at the sub-segment level during query processing and thus extends the performance improvements provided by segment elimination.

Segments consist of blocks (sub-segments) of 4,096 rows each. With sub-segment elimination, statistics are collected for these blocks and used during query processing to eliminate them. Sub-segment elimination is restricted to sort key columns.

A segment with the default size (columnstore_segment_rows = 1,024,000) would have 1,024,000 / 4,096 = 250 sub-segments. Queries with highly selective predicates using flexible parallelism are expected to see performance improvements due to sub-segment elimination.

Consider a table and a query as follows:

CREATE TABLE sales(id INT, dt DATETIME, itemnum INT, qty FLOAT, price NUMERIC(18,4), SORT KEY(dt));
SELECT itemnum, SUM(qty*price)
FROM sales
WHERE dt BETWEEN '2024-09-20 17:00:00' AND '2024-09-20 17:01:00'
GROUP BY ALL;

This query has a very narrow time range (a highly selective predicate) on the dt column, and the dt column is the sort key. Sub-segment elimination will be applied to this query and is expected to improve the performance of this query.

Last modified: November 5, 2024
