Flexible Parallelism
On this page
SingleStore supports Flexible Parallelism, which allows multiple cores on the same node to access the same database partition.
With Flexible Parallelism, as database partitions are created, they are divided into sub-partitions.
As an example, if you are currently at one partition per core with Flexible Parallelism, doubling the size of your cluster and then rebalancing will result in two cores for each partition.
Typically, analytic or hybrid transactional/analytic workloads will gain some benefits from Flexible Parallelism, especially over time as you add more hardware and rebalance clusters.
As you add cores to your SingleStore system, you can reach a point where you have more cores than database partitions on each leaf node.
Note
When Flexible Parallelism is enabled, all new databases created will use Flexible Parallelism.
Note
SingleStore recommends lowering parallelism by setting query_
for high-throughput search workloads.
The default setting query_
indicates a parallelism of one thread per core for each query.
Flexible Parallelism is disabled by default in SingleStore in version 7.
Enabling and Configuring Flexible Parallelism
You cannot enable or disable Flexible Parallelism on existing databases.CREATE TABLE LIKE
and INSERT…SELECT
.
Flexible Parallelism is enabled and configured via two engine variables.sub_
engine variable enables Flexible Parallelism, and query_
controls whether queries use flexible parallelism or not.
Set sub_
to a valid value greater than 0 (a power of 2 up to 64) to enable Flexible Parallelism.
When a database is created with sub_
set to a valid value greater than 0, it will support Flexible Parallelism for queries over columnstore tables in that database.sub_
.
Once Flexible Parallelism is enabled, you can configure CPU usage per query on each leaf node.
To control the fraction of the CPU power that will be dedicated to one query on each leaf node, set the query_
engine variable on the aggregator.
For example, if you want to use half of the available cores on each leaf node to process one query, then set query_
to 0.
If you want to limit the parallelism to use a specific number of cores to process each query, rather than just a proportion of the number of cores, then you will need to know how many cores there are on each leaf.expected_
variable.
The degree of parallelism on each node is calculated as expected_
, rounded to the nearest integer.
Suppose expected_
is 28 and you want to use 10 cores on each leaf node to process a query.query_
to .
How to Tell When Flexible Parallelism is Enabled
At the database level, the sub-partition count is available in both the Information Schema distributed_
and view mv_
tables.
SELECT * FROM information_schema.distributed_databases;
+-------------+---------------+----------------+--------------------+------------+-------------+---------+--------------------+-----------+---------+
| DATABASE_ID | DATABASE_NAME | NUM_PARTITIONS | NUM_SUB_PARTITIONS | CLUSTER_ID | REMOTE_NAME | IS_SYNC | IS_SYNC_DURABILITY | IS_PAUSED | SITE_ID |
+-------------+---------------+----------------+--------------------+------------+-------------+---------+--------------------+-----------+---------+
| 4 | db | 3 | 24 | 0 | | 1 | 0 | 0 | 1 |
+-------------+---------------+----------------+--------------------+------------+-------------+---------+--------------------+-----------+---------+
In the NUM_
column, the databases with a non-zero value have Flexible Parallelism enabled.
At the query level, you can tell whether a query is using Flexible Partitioning via Query Plan Operators.node_
is set, the parallelism level is shown in Explain or Profile operations like Gather, GatherMerge, Project, Repartition, Broadcast, and MaterializedCTE where parallelism_
is either partition
, subpartition
, or segment
:
partitions:all parallelism_level:segment
partitions:all parallelism_level:partition
partitions:all parallelism_level:subpartition
A value of partition
indicates one thread per database partition (Flexible Parallelism is not enabled).query_
is not 0 and the query does not support Flexible Parallelism, the value of parallelism_
in the query profile will be partition.
The values sub_
and segment
both mean Flexible Parallelism is used for the node.sub_
and segment
is the granularity of work items in the query threads.sub_
case, each thread scans a pre-computed range (a set of sub-partitions).
Basically, segment level parallelism allows threads that finish their work earlier to help with other threads (at a segment granularity), while sub-partition level parallelism does not allow that (each thread scans a set of sub-partitions assigned to it during the query).
parallelism_
can be segment
in a query profile when the engine variable sub_
is 0 and a database is created.query_
and all the threads will be pulled from a shared queue of segments to scan, one segment at a time.query_
to 0.
If parallelism_
is not shown in the explain or profile, it means that query_
is set to 0.parallelism_
of partition.
Sub-segment Elimination
Sub-segment elimination improves performance by eliminating data at the sub-segment level during query processing and thus extends the performance improvements provided by segment elimination.
Segments consist of blocks of 4096 rows each.
A segment with the default size (columnstore_
) would have 1,024,000 / 4,096 = 250 sub-segments.
Consider a table and a query as follows:
CREATE TABLE sales(id INT, dt DATETIME, itemnum INT, qty FLOAT, price numeric(18,4), SORT KEY(dt));SELECT itemnum, SUM(qty*price)FROM salesWHERE dt BETWEEN "2024-09-20 17:00:00" AND "2024-09-20 17:01:00"GROUP BY all;
This query has a very narrow time range (a highly selective predicate) on the dt
column, and the dt
column is the sort key.
Last modified: November 5, 2024