PIPELINES_BATCHES

The data in this view is not persistent across node restarts. After restarting a leaf node, the view will have batches with NULLs for the respective batch partitions. If the MA is restarted, all batch data is removed. To maintain this data across restarts, write the data to a separate table periodically.

The following pipeline engine variable is used to control an aspect of pipeline behavior. For more detailed information consult the Sync Variables Lists section of the Engine Variable List.

pipelines_batches_metadata_to_keep - controls the number of batch metadata entries to keep before they are overwritten by incoming batches.

information_schema.PIPELINES_BATCHES Schema

Name	Description
`DATABASE_NAME`	The name of the database associated with the batch.
`PIPELINE_NAME`	The name of the pipeline associated with the batch.
`BATCH_ID`	The internal unique identifier for the batch. While every row in the table contains unique information about a batch, multiple rows with the same `BATCH_ID` may be present.
`BATCH_STATE`	Specifies whether the batch transaction was successful. Possible values: `Succeeded`, `In Progress`, `Failed`, `Queued`, `Canceled`, or `No Data`. `Succeeded`: The batch was successfully loaded into the destination table. `In Progress`: The batch is currently being loaded into the destination table. `Failed`: The batch failed to be written. Batch failures are written as errors in the `information_schema.PIPELINES_ERRORS` table, and you can use the `BATCH_ID` to correlate a specific batch failure with more detailed error information. `Queued`: The batch is waiting for enough resources in the cluster to free up so it can run. `Canceled`: a pipeline batch's internal query has been killed. That can be triggered internally, by running `STOP PIPELINE` while a batch is running, or externally by CTRL-C-ing `START PIPELINE FOREGROUND`. `No Data`: The batch contained no data. Batches that don’t contain data will be deleted as new batches are written.
`BATCH_ROWS_WRITTEN`	The number of rows that were affected in the destination table as a result of the batch transaction. When batches are written to the destination table, they consist of insert, replace, upsert, or delete operations. The `BATCH_ROWS_WRITTEN` value is the sum total of all the operations that occurred when the batch was written. This value may be `NULL` if the `BATCH_STATE` is any value other than `Succeeded`.
`BATCH_TIME`	The elapsed time in seconds from the start of the batch transaction to its success or failure. This value reflects the complete transaction time from the perspective of the master aggregator node. Therefore, the elapsed time may be longer than a `BATCH_PARTITION_TIME` value for a given batch, as `BATCH_PARTITION_TIME` reflects the perspective of the leaf node.
`BATCH_START_UNIX_TIMESTAMP`	The time that the batch transaction started in Unix timestamp format.
`BATCH_PARTITION_STATE`	Specifies whether the batch transaction was successful for a given partition. There are five possible values: `Succeeded`, `In Progress`, `Failed`, `No Data`, or `NULL`. `Succeeded`: The batch was successfully loaded into the destination table. `In Progress`: The batch is currently being loaded into the destination table. `Failed`: The batch failed to be written. Batch failures are written as errors in the `information_schema.PIPELINES_ERRORS` table, and you can use the `BATCH_ID` and `PARTITION` values to correlate a specific batch failure with more detailed error information. `No Data`: The batch contained no data. Batches that don’t contain data will be deleted as new batches are written. `NULL`: The batch data either contains no data, or the batch transaction hasn’t yet been executed on the partition itself.
`BATCH_PARTITION_PARSED_ROWS`	The number of rows that were parsed during the batch transaction for a given partition. This value can be `NULL` if the batch transaction is still in progress or if the batch failed.
`BATCH_SOURCE_PARTITION_ID`	The identifier for a data source’s partition from which batch partition data is extracted. This value may be `NULL` if the batch’s `BATCH_PARTITION_STATE` is also `NULL`. For an S3 pipeline, this represents the name of an object in the source bucket.
`BATCH_EARLIEST_OFFSET`	Specifies the earliest offset for the batch partition. This value indicates the start of the offset range for a batch, while `BATCH_LATEST_OFFSET` indicates the end of the offset range. This value may be `NULL` if the batch’s `BATCH_PARTITION_STATE` is also `NULL`. For an S3 pipeline, this value will always be `0` when a bucket’s object has been successfully processed. For more information, see Load Data from Amazon Web Services (AWS) S3
`BATCH_LATEST_OFFSET`	Specifies the latest offset for the batch. This value indicates the end of the offset range for a batch, while `BATCH_EARLIEST_OFFSET` indicates the start of the offset range. This value may be `NULL` if the batch’s `BATCH_PARTITION_STATE` is also `NULL`. For an S3 pipeline, this value will always be `1` when a bucket’s object has been successfully processed. For more information, see Load Data from Amazon Web Services (AWS) S3
`BATCH_PARTITION_TIME`	The elapsed time in seconds from the start of the batch transaction to its success or failure. This value reflects the batch transaction time from the perspective of the leaf node. Therefore, the elapsed time may be shorter than a `BATCH_TIME` for a given batch, as `BATCH_TIME` reflects the perspective of the master aggregator node.
`BATCH_PARTITION_EXTRACTED_BYTES`	The number of bytes that was extracted from the data source for the batch. This value may be `NULL` if no data was extracted for the batch.
`BATCH_PARTITION_TRANSFORMED_BYTES`	The number of bytes that were transformed for the batch. This value may be `NULL` if a transform was not specified for the pipeline, or if no data was successfully transformed.
`BATCH_PARTITION_EXTRACTOR_WAIT_TIME`	The amount of time, in seconds, that the pipeline spent waiting on the extractor to complete its operation.
`BATCH_PARTITION_TRANSFORM_WAIT_TIME`	The amount of time, in seconds, that the pipeline spent waiting on the transform to complete its operation.
`HOST`	The hostname or host IP address for the leaf node that processed the batch. This value may be `NULL` if `BATCH_SOURCE_PARTITION_ID` or `PARTITION` is also `NULL`. The combination of a batch’s `HOST`, `PORT`, and `PARTITION` identify the specific leaf node partition that attempted to load batch data.
`PORT`	The port number for the leaf node that processed the batch. This value may be `NULL` if `BATCH_SOURCE_PARTITION_ID` or `PARTITION` is also `NULL`. The combination of a batch’s `HOST`, `PORT`, and `PARTITION` identify the specific leaf node partition that attempted to load batch data.
`PARTITION`	Specifies the partition ID on a leaf node that processed the batch. This value may be `NULL` if `BATCH_SOURCE_PARTITION_ID` is also `NULL`. The combination of a batch’s `HOST`, `PORT`, and `PARTITION` identify the specific leaf node partition that attempted to load batch data.

PIPELINES_BATCHES

On this page

information_schema.PIPELINES_BATCHES Schema

Was this article helpful?

On this page

Was this article helpful?