Flexible Parallelism

SingleStore supports Flexible Parallelism, which allows multiple cores on the same node to access the same database partition. To use Flexible Parallelism, it must be enabled prior to database creation.

With Flexible Parallelism, as database partitions are created, they are divided into sub-partitions. As a query runs on a leaf node, multiple cores working on behalf of the query can process different sub-partitions of a partition in parallel.

As an example, if you are currently at one partition per core with Flexible Parallelism, doubling the size of your cluster and then rebalancing will result in two cores for each partition. As a result, a simple query that scans and aggregates all the data in a single partition will now execute more quickly than it did before.

Typically, analytic or hybrid transactional/analytic workloads will gain some benefits from Flexible Parallelism, especially over time as you add more hardware and rebalance clusters. Strictly OLTP workloads will tend to not benefit from enabling Flexible Parallelism.

As you add cores to your SingleStore system, you can reach a point where you have more cores than database partitions on each leaf node. Without Flexible Parallelism, there is a limit of one core accessing each database partition; therefore, the maximum ratio of cores to partitions is 1:1. This can lead to a kind of bottleneck as you can have idle cores on a node where you have queries accessing a single partition.

Note

When Flexible Parallelism is enabled, all new databases created will use Flexible Parallelism. Existing databases use the maximum of one thread per partition. You cannot enable or disable Flexible Parallelism on existing databases.

Note

SingleStore recommends lowering parallelism by setting query_parallelism_per_leaf_core = .01 for high-throughput search workloads. This setting will limit parallelism to one thread per query per leaf node.

The default setting query_parallelism_per_leaf_core = 1 indicates a parallelism of one thread per core for each query. With this setting all cores would be used for a query.

Flexible Parallelism is disabled by default in SingleStore in version 7.8 and previous.

Enabling and Configuring Flexible Parallelism

You cannot enable or disable Flexible Parallelism on existing databases. Making a database capable of supporting Flexible Parallelism can only be done at the time the database is created. If you want to make Flexible Parallelism available for an existing database, create a database with Flexible Parallelism enabled and copy the tables from the original database to the new one. One way to do this is with CREATE TABLE LIKE and INSERT…SELECT.

Flexible Parallelism is enabled and configured via two engine variables. The sub_to_physical_partition_ratio engine variable enables Flexible Parallelism, and query_parallelism_per_leaf_core controls whether queries use flexible parallelism or not.

Set sub_to_physical_partition_ratio to a valid value greater than 0 (a power of 2 up to 64) to enable Flexible Parallelism. This engine variable sets the number of sub-partitions per physical partition that a new database will be created with.

When a database is created with sub_to_physical_partition_ratio set to a valid value greater than 0, it will support Flexible Parallelism for queries over columnstore tables in that database. The maximum query parallelism is sub_to_physical_partition_ratio * number of physical partitions.

Once Flexible Parallelism is enabled, you can configure CPU usage per query on each leaf node.

To control the fraction of the CPU power that will be dedicated to one query on each leaf node, set the query_parallelism_per_leaf_core engine variable on the aggregator.

For example, if you want to use half of the available cores on each leaf node to process one query, then set query_parallelism_per_leaf_core to 0.5.

If you want to limit the parallelism to use a specific number of cores to process each query, rather than just a proportion of the number of cores, then you will need to know how many cores there are on each leaf. This can be determined by reading value of the expected_leaf_core_count variable. This variable is set during installation and should not be changed during normal operation.

The degree of parallelism on each node is calculated as expected_leaf_core_count * query_parallelism_per_leaf_core, rounded to the nearest integer.

Suppose expected_leaf_core_count is 28 and you want to use 10 cores on each leaf node to process a query. Dividing 10 by 28 gives 0.35714. To make sure you get 10 cores processing each query, set query_parallelism_per_leaf_core to .35714.

How to Tell When Flexible Parallelism is Being Used

You can determine if Flexible Parallelism is enabled for a particular database by looking at the sub-partition count in the Information Schema DISTRIBUTED_DATABASES or MV_DISTRIBUTED_DATABASES_STATUS views.

SELECT * FROM information_schema.DISTRIBUTED_DATABASES;
+-------------+---------------+----------------+--------------------+------------+-------------+---------+--------------------+-----------+---------+
| DATABASE_ID | DATABASE_NAME | NUM_PARTITIONS | NUM_SUB_PARTITIONS | CLUSTER_ID | REMOTE_NAME | IS_SYNC | IS_SYNC_DURABILITY | IS_PAUSED | SITE_ID |
+-------------+---------------+----------------+--------------------+------------+-------------+---------+--------------------+-----------+---------+
|           4 | db            |              3 |                 24 |          0 |             |       1 |                  0 |         0 |       1 |
+-------------+---------------+----------------+--------------------+------------+-------------+---------+--------------------+-----------+---------+

In these views, databases with a non-zero value in the NUM_SUB_PARTITIONS column have Flexible Parallelism enabled. Enabling Flexible Parallelism at the database level makes it possible for segment and sub-partition parallelism to be used for applicable queries.

You can determine if a specific query is using Flexible Parallelism by using EXPLAIN or PROFILE.  The output of these commands shows the parallelism_level for operations like Gather, GatherMerge, Project, Repartition, Broadcast, and MaterializedCTE.

Example output from EXPLAIN for a SELECT * query is below.

EXPLAIN SELECT * FROM example_table;
+--------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                          |
+--------------------------------------------------------------------------------------------------+
| Gather partitions:all alias:remote_0 parallelism_level:segment                                   |
| Project [example_table.a, example_table.b]                                                       |
| ColumnStoreScan example_db.example_table, SORT KEY __UNORDERED () table_type:sharded_columnstore |
+--------------------------------------------------------------------------------------------------+

In this example, Gather partitions:all alias:remote_0 parallelism_level:segment indicates that parallelism_level is segment for the Gather operator in this query.

In general, the parallelism_level can be partition, sub_partition, or segment.

A parallelism_level of partition indicates one thread is used per database partition and that the query is not using Flexible Parallelism.

The values sub_partition and segment both indicate that Flexible Parallelism is being used. The difference between sub_partition and segment is the granularity of the work items in the query threads. In the sub_partition case, each thread scans a pre-computed set of sub-partitions. In the segment case, there is not a pre-computed split of work across threads; rather, when a thread finishes its work, that thread may be assigned another segment to scan.

Segment level parallelism allows threads that finish their work earlier to help other threads (at a segment granularity), while sub-partition level parallelism does not allow that behavior. In sub-partition level parallelism, each thread scans a set of sub-partitions assigned to it during the query. Segment level parallelism is preferred over sub-partition level parallelism, when possible, but segment level parallelism is only applicable to simple query shapes that don’t utilize the shard key.

There are certain situations when Flexible Parallelism is enabled (query_parallelism_per_leaf_core is not 0), but queries do not use Flexible Parallelism. For example, queries on rowstore tables cannot use Flexible Parallelism. In these cases, the value of parallelism_level will be partition.

Last modified: March 31, 2025

Was this article helpful?

Verification instructions

Note: You must install cosign to verify the authenticity of the SingleStore file.

Use the following steps to verify the authenticity of singlestoredb-server, singlestoredb-toolbox, singlestoredb-studio, and singlestore-client SingleStore files that have been downloaded.

You may perform the following steps on any computer that can run cosign, such as the main deployment host of the cluster.

  1. (Optional) Run the following command to view the associated signature files.

    curl undefined
  2. Download the signature file from the SingleStore release server.

    • Option 1: Click the Download Signature button next to the SingleStore file.

    • Option 2: Copy and paste the following URL into the address bar of your browser and save the signature file.

    • Option 3: Run the following command to download the signature file.

      curl -O undefined
  3. After the signature file has been downloaded, run the following command to verify the authenticity of the SingleStore file.

    echo -n undefined |
    cosign verify-blob --certificate-oidc-issuer https://oidc.eks.us-east-1.amazonaws.com/id/CCDCDBA1379A5596AB5B2E46DCA385BC \
    --certificate-identity https://kubernetes.io/namespaces/freya-production/serviceaccounts/job-worker \
    --bundle undefined \
    --new-bundle-format -
    Verified OK