Pipelines Scheduling

If you create multiple pipelines in SingleStore, they run in parallel until all SingleStore partitions are saturated. You can use the variables in the following table to limit the number of partitions each batch uses and the number of pipelines that can run at the same time.

Variable | Description
--- | ---
max_partitions_per_batch | Specifies the maximum number of partitions for each batch. This variable can be modified for each pipeline by using the ALTER PIPELINE SET MAX_PARTITIONS_PER_BATCH command.
pipelines_max_concurrent_batch_partitions | Specifies the maximum number of pipeline batch partitions that can run concurrently. This is a global variable.
pipelines_max_concurrent | Sets the maximum number of pipelines that can run concurrently.

Note

The number of partitions that a pipeline uses depends on its source. For example, for Kafka pipelines, the number of batch partitions that can run concurrently cannot exceed the number of Kafka topic partitions.

For example, consider a SingleStore database with 10 partitions. Without any constraints, it is possible to run 5 parallel pipelines using 2 partitions each, 2 pipelines using 5 partitions each, and so on.
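The arithmetic behind this example can be sketched in Python. The helper name max_parallel_pipelines is hypothetical and exists only for illustration; it assumes every pipeline needs the same number of partitions and is not part of any SingleStore API.

```python
def max_parallel_pipelines(total_partitions: int, partitions_per_pipeline: int) -> int:
    """Return how many equally sized pipelines fit into the partitions at once.

    Hypothetical helper for illustration only; not part of SingleStore.
    """
    if total_partitions < 1 or partitions_per_pipeline < 1:
        raise ValueError("partition counts must be positive")
    # Integer division: leftover partitions cannot host another full pipeline.
    return total_partitions // partitions_per_pipeline


print(max_parallel_pipelines(10, 2))  # 5
print(max_parallel_pipelines(10, 5))  # 2
```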

If the combined partition requirements (as set via max_partitions_per_batch) of any two pipelines exceed the total number of partitions, those pipelines run serially in a round-robin fashion.

For example, consider a SingleStore database with 10 partitions and 3 pipelines. Suppose the first batches of pipelines P1, P2, and P3 require 4, 8, and 4 partitions, respectively. Pipelines are scheduled concurrently with the aim of saturating the partitions in the cluster. The scheduler therefore runs P1 and P3 in parallel to process their first batches (4 + 4 = 8 partitions). It then runs P2 on its own, because P2 combined with either P1 or P3 would need 12 partitions, which is more than the 10 partitions in the cluster.
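The grouping in this example can be reproduced with a small simulation. This is a minimal sketch that assumes a greedy, first-fit packing in the order the pipelines are listed; the function plan_rounds is hypothetical, and the actual SingleStore scheduler may choose groups differently.

```python
def plan_rounds(requirements: dict[str, int], total_partitions: int) -> list[list[str]]:
    """Group pipelines into rounds; pipelines in the same round run in parallel.

    Assumption: greedy first-fit packing, not the real SingleStore scheduler.
    """
    for name, need in requirements.items():
        if need < 1 or need > total_partitions:
            raise ValueError(f"{name} needs {need} partitions, cluster has {total_partitions}")

    pending = list(requirements.items())
    rounds = []
    while pending:
        used = 0
        current, deferred = [], []
        for name, need in pending:
            if used + need <= total_partitions:
                # The batch fits next to the pipelines already in this round.
                current.append(name)
                used += need
            else:
                # Not enough free partitions; try again in a later round.
                deferred.append((name, need))
        rounds.append(current)
        pending = deferred
    return rounds


print(plan_rounds({"P1": 4, "P2": 8, "P3": 4}, total_partitions=10))
# [['P1', 'P3'], ['P2']]
```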

In this same scenario, if the pipelines_max_concurrent_batch_partitions variable is set to 5 and the max_partitions_per_batch variable is not specified, then P1, P2, and P3 each run serially. Even the two smallest batches, P1 and P3, together need 8 partitions, which exceeds the limit of 5.

You can also use the pipelines_max_concurrent variable in this scenario. If the variable is set to 1, then P1, P2, and P3 each run serially, because no two pipelines can run concurrently.
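The effect of both limits can be checked with a short sketch that asks whether any two pipelines could share a round. The function any_pair_fits and its parameter names are hypothetical. The sketch also assumes that a batch asking for more partitions than the limit is clamped to the limit, which the text above does not state.

```python
from itertools import combinations


def any_pair_fits(needs, total_partitions,
                  max_concurrent_batch_partitions=None,
                  max_concurrent_pipelines=None):
    """Return True if at least two pipelines could run in parallel.

    Hypothetical model for illustration; not the real SingleStore scheduler.
    """
    # With a pipeline limit below 2, nothing can ever run side by side.
    if max_concurrent_pipelines is not None and max_concurrent_pipelines < 2:
        return False

    capacity = total_partitions
    if max_concurrent_batch_partitions is not None:
        capacity = min(capacity, max_concurrent_batch_partitions)

    # Assumption: a batch that wants more than the limit is clamped to it.
    clamped = [min(need, capacity) for need in needs.values()]
    return any(a + b <= capacity for a, b in combinations(clamped, 2))


needs = {"P1": 4, "P2": 8, "P3": 4}
print(any_pair_fits(needs, 10))                                     # True
print(any_pair_fits(needs, 10, max_concurrent_batch_partitions=5))  # False
print(any_pair_fits(needs, 10, max_concurrent_pipelines=1))         # False
```

A result of False corresponds to the fully serial case described above, where each pipeline takes its turn alone.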

Last modified: September 2, 2024

