Pipelines Scheduling

SingleStoreDB Cloud supports running multiple pipelines in parallel. Pipelines run in parallel until all partitions in the SingleStoreDB Cloud workspace are saturated. For example, consider a SingleStoreDB Cloud workspace with 10 partitions. With this architecture, it is possible to run 5 parallel pipelines using 2 partitions each, 2 parallel pipelines using 5 partitions each, and so on.
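The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration of the partition budget described above: the function name max_parallel_pipelines is hypothetical and is not part of any SingleStore API, and it assumes every pipeline needs the same number of partitions.

```python
def max_parallel_pipelines(total_partitions: int, partitions_per_pipeline: int) -> int:
    """Return how many equally sized pipelines fit into the partition budget.

    Hypothetical helper for illustration only; not a SingleStore API.
    """
    if total_partitions < 0 or partitions_per_pipeline <= 0:
        raise ValueError("partition counts must be positive")
    # Each pipeline occupies a fixed number of partitions, so the number of
    # pipelines that can run side by side is the integer quotient.
    return total_partitions // partitions_per_pipeline


if __name__ == "__main__":
    print(max_parallel_pipelines(10, 2))  # 5
    print(max_parallel_pipelines(10, 5))  # 2
```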

If the combined partition requirements of any two pipelines exceed the total number of partitions in the workspace, those pipelines are run serially, in a round-robin fashion, rather than in parallel. For example, consider a SingleStoreDB Cloud workspace with 10 partitions and 3 pipelines, P1, P2, and P3, whose first batches require 4, 8, and 4 partitions, respectively. The scheduler runs pipelines concurrently with the aim of saturating the partitions in the workspace. Hence, it runs P1 and P3 in parallel to process their first batches, which together use 8 partitions. It then runs P2 on its own, because the sum of the partitions required by P2 and either of the other pipelines (8 + 4 = 12) is greater than the 10 partitions in the workspace.

You can specify the maximum number of pipeline batch partitions that can run concurrently using the pipelines_max_concurrent_batch_partitions engine variable. Note that the number of partitions a pipeline uses depends on the pipeline source.
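The P1, P2, P3 example can be reproduced with a small simulation. The sketch below is not the actual SingleStore scheduler, whose internals are not described here. It assumes a simple greedy policy: in each round, take the pending pipelines in order and start every one that still fits in the remaining partitions. The function name schedule_rounds and its parameters are hypothetical.

```python
from typing import Dict, List


def schedule_rounds(requirements: Dict[str, int], total_partitions: int) -> List[List[str]]:
    """Group pipelines into rounds whose combined partition needs fit the workspace.

    Illustrative greedy model only; the real scheduler may behave differently.
    """
    for name, need in requirements.items():
        if need <= 0 or need > total_partitions:
            raise ValueError(f"pipeline {name} cannot fit in {total_partitions} partitions")

    pending = list(requirements)  # dicts preserve insertion order
    rounds: List[List[str]] = []
    while pending:
        used = 0
        current: List[str] = []
        for name in pending:
            need = requirements[name]
            # Start the pipeline only if it fits next to those already chosen.
            if used + need <= total_partitions:
                current.append(name)
                used += need
        # Pipelines that did not fit wait for a later round.
        pending = [name for name in pending if name not in current]
        rounds.append(current)
    return rounds


if __name__ == "__main__":
    print(schedule_rounds({"P1": 4, "P2": 8, "P3": 4}, 10))  # [['P1', 'P3'], ['P2']]
```

Under these assumptions, P1 and P3 share the first round (8 of 10 partitions) and P2 runs alone in the second, which matches the behavior described above.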

Last modified: June 22, 2022
