Pipelines Scheduling

SingleStore supports running multiple pipelines in parallel. Pipelines run in parallel until all SingleStore partitions are saturated. For example, consider a SingleStore database with 10 partitions. With this architecture, it is possible to run 5 pipelines in parallel using 2 partitions each, 2 pipelines in parallel using 5 partitions each, and so on.

If the combined partition requirements (as set via MAX_PARTITIONS_PER_BATCH) of two pipelines exceed the total number of SingleStore partitions, those pipelines are run serially in a round-robin fashion. For example, consider a SingleStore database with 10 partitions and 3 pipelines. Suppose the first batches of pipelines P1, P2, and P3 require 4, 8, and 4 partitions, respectively. Pipelines are scheduled concurrently with the aim of saturating the partitions in a SingleStore cluster. Hence, the scheduler runs P1 and P3 in parallel to process their first batches, using 8 of the 10 partitions. It then runs P2 on its own, because the number of partitions required by P2 plus those required by either P1 or P3 (8 + 4 = 12) is greater than the number of partitions in the cluster (10).

You can specify the maximum number of pipeline batch partitions that can run concurrently using the pipelines_max_concurrent_batch_partitions engine variable. Note that the number of partitions a pipeline uses depends on the pipeline source.
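The P1, P2, P3 example above can be reproduced with a small simulation. This is only an illustrative sketch: the function name schedule_rounds and its first-fit grouping rule are assumptions made for this example, not SingleStore's actual scheduler, which may order or pack batches differently.

```python
def schedule_rounds(requirements, total_partitions):
    """Group pipelines into rounds whose partition needs fit the cluster.

    Hypothetical first-fit model for illustration only; it is not
    SingleStore's real scheduling algorithm.
    """
    for name, needed in requirements.items():
        if needed < 1 or needed > total_partitions:
            raise ValueError(
                f"{name} needs {needed} partitions; expected 1..{total_partitions}"
            )

    pending = list(requirements)  # keeps the order the pipelines were given in
    rounds = []
    while pending:
        free = total_partitions
        current, deferred = [], []
        for name in pending:
            if requirements[name] <= free:
                # The pipeline fits alongside those already chosen for this round.
                current.append(name)
                free -= requirements[name]
            else:
                # Not enough free partitions; try again in a later round.
                deferred.append(name)
        rounds.append(current)
        pending = deferred
    return rounds


# 10 partitions; P1, P2, and P3 need 4, 8, and 4 partitions for their first batch.
print(schedule_rounds({"P1": 4, "P2": 8, "P3": 4}, 10))
# prints [['P1', 'P3'], ['P2']]
```

In this model, P1 and P3 share the first round because 4 + 4 fits within 10 partitions, while P2 waits for a round of its own. Five pipelines that need 2 partitions each would all land in a single round.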

Last modified: March 29, 2024
