The Lifecycle of a Pipeline

The following list describes the lifecycle of a pipeline, from CREATE PIPELINE and START PIPELINE, through batch processing and error handling, to the pipeline status information that is recorded along the way.

  1. The user creates a new pipeline using CREATE PIPELINE.

  2. The user starts the pipeline using START PIPELINE.

    Note

    Steps 3 to 8 refer to a batch, which is a subset of data that the pipeline extracts from its data source. Together, steps 3 to 8 comprise a single batch operation, which either succeeds or fails as a unit: if any step fails, the batch operation fails and rolls back.

  3. The pipeline extracts a batch from its data source. The pipeline's offsets are updated to reflect the current position in the data source.

  4. The pipeline optionally shapes (modifies) the batch, for example with a transform or a stored procedure.

  5. If the pipeline is able to successfully process the batch, the pipeline loads the batch into one or more SingleStore tables.

  6. If an error occurs while a batch is running, the batch fails and its transactions are rolled back.

    • Each batch is retried at most pipelines_max_retries_per_batch_partition times.

    • If all retries are unsuccessful and pipelines_stop_on_error is set to ON, the pipeline stops. 

    • If all retries are unsuccessful but pipelines_stop_on_error is set to OFF, the pipeline continues and a new batch is processed. This batch includes the same files and/or objects as the first batch, excluding any files or objects that may have caused the error.

    For more information, refer to Troubleshoot Pipelines.

  7. The pipeline updates the FILE_STATE column in the information_schema.PIPELINES_FILES table, as follows:

    • Files and objects in the batch that the pipeline processed successfully are marked as Loaded.

    • Files and objects in the batch that the pipeline did not process successfully are marked as Skipped.

    A file or object that is marked as Loaded or Skipped will not be processed again by the pipeline, unless ALTER PIPELINE ... DROP FILE ... is run.

    The pipeline does not delete files or objects from the data source.

  8. The pipeline checks if the data source contains new data. If it does, the pipeline processes another batch immediately by rerunning steps 3 to 7.

    If the data source does not contain more data, the pipeline waits for BATCH_INTERVAL milliseconds (which can be specified in the CREATE PIPELINE statement) before checking the data source for new data. If the pipeline finds new data at this point, the pipeline reruns steps 3 to 7.
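The creation and start steps above (steps 1 and 2) can be sketched as follows. This is a minimal illustration, not a statement from this article: the pipeline name, target table, S3 bucket, and credentials are hypothetical placeholders.

```sql
-- Hypothetical example: orders_pipeline, orders, and the S3 bucket are placeholders.
CREATE PIPELINE orders_pipeline AS
LOAD DATA S3 'my-bucket/orders/'
CONFIG '{"region": "us-east-1"}'
CREDENTIALS '{"aws_access_key_id": "<key_id>", "aws_secret_access_key": "<secret>"}'
BATCH_INTERVAL 2500               -- wait 2500 ms between checks when no new data is found (step 8)
INTO TABLE orders
FIELDS TERMINATED BY ',';

START PIPELINE orders_pipeline;
```

Once started, the pipeline runs the batch loop described in steps 3 to 8 until it is stopped or an unrecoverable error occurs.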

Note

  • The user can stop a running pipeline using STOP PIPELINE. If this command is issued while a batch operation is running, the batch operation completes before the pipeline stops.

  • If a pipeline is stopped with the DETACH PIPELINE command, data loading stops. Loading can be restarted with the START PIPELINE command, and it continues as before.
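The commands described in the note above can be sketched as follows (the pipeline name is a hypothetical placeholder):

```sql
STOP PIPELINE orders_pipeline;    -- any in-flight batch operation completes before the pipeline stops
DETACH PIPELINE orders_pipeline;  -- stops data loading
START PIPELINE orders_pipeline;   -- resumes loading, which continues as before
```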

During its lifecycle, the pipeline updates the pipelines-related tables in the information schema at different times. Refer to Augmenting Your Data Warehouse to Accelerate BI for additional pipeline-related information schema tables, including the information_schema.PIPELINES_FILES table mentioned in step 7.
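For example, a pipeline's state and per-file progress can be inspected with queries like the following; the pipeline name is a hypothetical placeholder:

```sql
SELECT PIPELINE_NAME, STATE
FROM information_schema.PIPELINES;

SELECT FILE_NAME, FILE_STATE      -- Loaded or Skipped, as described in step 7
FROM information_schema.PIPELINES_FILES
WHERE PIPELINE_NAME = 'orders_pipeline';
```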

Last modified: May 9, 2025
