SingleStore DB

The Lifecycle of a Pipeline
  1. The user creates a new pipeline using CREATE PIPELINE.

  2. The user starts the pipeline using START PIPELINE.


    Steps 3, 4, and 5 refer to a batch, which is a subset of the data that the pipeline extracts from its data source. Together, these steps make up one batch operation, which succeeds or fails as a whole. If any step fails, the batch operation rolls back.

  3. The pipeline extracts a batch from its data source. The pipeline's offsets are updated to reflect the current position in the data source.

  4. The pipeline optionally shapes (modifies) the batch, using one of three methods.

  5. The pipeline loads the batch into one or more SingleStore tables.

  6. If the batch operation succeeds, the pipeline checks whether the data source contains new data. If it does, the pipeline immediately processes another batch by running steps 3, 4, and 5 again. If it does not, the pipeline waits for BATCH_INTERVAL milliseconds (specified in the CREATE PIPELINE statement) and then checks the data source again. If the pipeline finds new data at that point, it runs steps 3, 4, and 5 again.
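
Steps 3 through 6 can be pictured with a small in-memory model. The sketch below is an illustration only, not SingleStore code: ToyPipeline, run_batch, has_new_data, batch_size, and max_idle_checks are hypothetical names, a Python list stands in for both the data source and the destination table, and stopping after a failed batch is a simplification chosen for this example.

```python
import time
from typing import Callable, List, Optional


class ToyPipeline:
    """Hypothetical in-memory model of a pipeline (not SingleStore code)."""

    def __init__(self, source: List[str], batch_size: int = 2,
                 shape: Optional[Callable[[str], str]] = None,
                 batch_interval_ms: int = 10) -> None:
        self.source = source            # stands in for the external data source
        self.table: List[str] = []      # stands in for the destination table
        self.offset = 0                 # current position in the data source
        self.batch_size = batch_size    # assumed fixed batch size for the toy
        self.shape = shape              # optional per-record shaping function
        self.batch_interval_ms = batch_interval_ms

    def has_new_data(self) -> bool:
        return self.offset < len(self.source)

    def run_batch(self) -> bool:
        """Run steps 3-5 as one all-or-nothing batch operation."""
        new_offset = min(self.offset + self.batch_size, len(self.source))
        try:
            batch = self.source[self.offset:new_offset]     # step 3: extract
            if self.shape is not None:
                batch = [self.shape(r) for r in batch]      # step 4: shape
            staged = self.table + batch                     # step 5: load (staged)
        except Exception:
            return False    # rollback: offset and table are left unchanged
        # Commit the loaded rows and the new offset together.
        self.table, self.offset = staged, new_offset
        return True

    def run(self, max_idle_checks: int = 1) -> None:
        """Step 6: loop while data is available, otherwise wait and re-check."""
        idle = 0
        while idle < max_idle_checks:
            if self.has_new_data():
                idle = 0
                if not self.run_batch():
                    break   # toy behavior: stop on the first failed batch
            else:
                idle += 1
                time.sleep(self.batch_interval_ms / 1000)


pipeline = ToyPipeline(["a", "b", "c"], batch_size=2, shape=str.upper,
                       batch_interval_ms=1)
pipeline.run()
print(pipeline.table, pipeline.offset)
```

The point of the model is that the offset and the loaded rows are committed together, so a failure in any of the three steps leaves both unchanged.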


The user can stop a running pipeline using STOP PIPELINE. If this command is issued while a batch operation is in progress, the batch operation completes before the pipeline stops.
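
One way to picture this stop behavior is a loop that checks for a stop request only between batch operations, never in the middle of one. The following sketch is a toy model under that assumption, not the actual implementation: ToyRunner, on_batch_start, and the batch names are hypothetical, and the callback exists only to simulate a stop request arriving while a batch is running.

```python
import threading
from typing import Callable, List, Optional


class ToyRunner:
    """Hypothetical model: a stop request takes effect only between batches."""

    def __init__(self, batches: List[str],
                 on_batch_start: Optional[Callable[[str], None]] = None) -> None:
        self.pending = list(batches)
        self.completed: List[str] = []
        self.on_batch_start = on_batch_start   # test hook, called mid-batch
        self._stop_requested = threading.Event()

    def stop(self) -> None:
        # Only records the request; it never interrupts a running batch.
        self._stop_requested.set()

    def run(self) -> None:
        # The stop flag is consulted before each batch, not during one.
        while self.pending and not self._stop_requested.is_set():
            batch = self.pending.pop(0)
            if self.on_batch_start is not None:
                self.on_batch_start(batch)
            self.completed.append(batch)       # the batch always finishes


runner = ToyRunner(["b1", "b2", "b3"])
# Simulate a stop request arriving while batch "b2" is in progress.
runner.on_batch_start = lambda b: runner.stop() if b == "b2" else None
runner.run()
print(runner.completed)  # ['b1', 'b2']
print(runner.pending)    # ['b3']
```

In this model the batch that was running when the stop request arrived still completes, and the remaining batch is left unprocessed.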