PIPELINES_ERRORS

This view contains detailed information about errors that occurred during extraction, transformation, or loading. Each row represents a single error event.

Note

This view contains exhaustive node-level error information, and an error may originate at either a leaf node or an aggregator node. Error event information may be propagated from a leaf to an aggregator or from a leaf to another leaf, so the same event can appear in more than one row. If a row appears to be a duplicate, first check whether its ERROR_MESSAGE value starts with LEAF ERROR (<host>:<port>). This prefix identifies the specific leaf node where the error originated.
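The prefix check described in the note can be sketched in Python as follows. This is a minimal illustration, not part of the product: the function name leaf_origin is hypothetical, and the exact spacing of the prefix and whatever text follows it are assumptions based on the LEAF ERROR (<host>:<port>) form given above.

```python
import re
from typing import Optional, Tuple

# Matches the documented prefix "LEAF ERROR (<host>:<port>)" at the start of
# a message. The exact spacing is an assumption taken from the note above.
_LEAF_PREFIX = re.compile(r"^LEAF ERROR \((?P<host>[^()]+?):(?P<port>\d+)\)")


def leaf_origin(error_message: Optional[str]) -> Optional[Tuple[str, int]]:
    """Return (host, port) of the originating leaf, or None if no prefix is present."""
    match = _LEAF_PREFIX.match(error_message or "")
    if match is None:
        return None
    return match.group("host"), int(match.group("port"))
```

Rows whose ERROR_MESSAGE yields the same (host, port) pair are candidates for being propagated copies of one event rather than independent errors.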

The following engine variables control how pipeline errors are retained. For more information, see the Sync Variables Lists section of the Engine Variable List.

  • pipelines_errors_retention_minutes - controls the amount of time in minutes that a pipeline error is stored on disk.

  • ingest_errors_max_disk_space_mb - controls the maximum amount of disk space (MB) that is used to log errors for pipelines.

information_schema.PIPELINES_ERRORS Schema

Column Name

Description

DATABASE_NAME

The name of the database associated with the error.

PIPELINE_NAME

The name of the pipeline associated with the error.

ERROR_UNIX_TIMESTAMP

The time of the error event in Unix timestamp format.

ERROR_TYPE

Specifies what type of error occurred. Possible values are Error or Warning.
  • Error - An error occurred, which may have stopped the pipeline if the pipelines_stop_on_error variable is set to ON.
  • Warning - A warning occurred, which does not stop a pipeline.

ERROR_CODE

The error code for the error.

ERROR_MESSAGE

The message associated with the error. This value contains contextual information about the error that can be used for debugging purposes, including a stack trace if the error was caused by a transform failure.

ERROR_KIND

Specifies whether the error event was caused by internal or external factors. Possible values are Internal, Extract, Transform, or Load.
  • Internal - An internal error occurred within the Pipelines feature itself.
  • Extract - An error occurred during extraction from the data source. Extraction errors are typically caused by network availability or problems with the data source partitions.
  • Transform - An error occurred during data transformation. The transform executable is typically the cause of transformation errors.
  • Load - An error occurred when attempting to load data into the destination table. Load errors are typically caused by malformed CSV data or attempting to write invalid schema types into a column, such as a NULL value into a non-nullable column.

STD_ERROR

The text, if any, that the transform executable wrote to standard error during data transformation. This value can be empty or NULL if the ERROR_KIND value is not Transform or if the transform produced no standard error output during the failure.

LOAD_DATA_LINE

The text of a LOAD DATA statement that caused a parsing error while attempting to load data into the destination table. This value contains the invalid line of CSV text that failed to parse. Load errors are typically caused by malformed CSV data or attempting to write invalid schema types into a column, such as a NULL value into a non-nullable column. This value can be empty or NULL if the error didn’t occur during the loading phase of pipeline execution.

LOAD_DATA_LINE_NUMBER

The line number of a LOAD DATA statement that caused a parsing error while attempting to load data into the destination table. A LOAD DATA statement may consist of many lines, and this value indicates the specific invalid line. This line number can be correlated with the value of LOAD_DATA_LINE, which contains the invalid line’s text. This value may be empty or NULL if a LOAD DATA statement wasn’t associated with the error event.

BATCH_ID

The internal unique identifier for the batch that experienced an error event.

ERROR_ID

The internal unique identifier for the error event.

BATCH_SOURCE_PARTITION_ID

The data source’s partition ID for the batch. This value may be NULL if the error occurred on the master aggregator.

BATCH_EARLIEST_OFFSET

Specifies the earliest offset for the batch. This value indicates the start of the offset range for a batch, while BATCH_LATEST_OFFSET indicates the end of the offset range. This value may be NULL if the error occurred on the master aggregator.

BATCH_LATEST_OFFSET

Specifies the latest offset for the batch. This value indicates the end of the offset range for a batch, while BATCH_EARLIEST_OFFSET indicates the start of the offset range. This value may be NULL if the error occurred on the master aggregator.

HOST

The hostname or host IP address for the leaf node that processed the batch. This value may be NULL if BATCH_SOURCE_PARTITION_ID or PARTITION is also NULL. The combination of a batch’s HOST, PORT, and PARTITION identifies the specific leaf node partition that attempted to load batch data.

PORT

The port number for the leaf node that processed the batch. This value may be NULL if BATCH_SOURCE_PARTITION_ID or PARTITION is also NULL. The combination of a batch’s HOST, PORT, and PARTITION identifies the specific leaf node partition that attempted to load batch data.

PARTITION

Specifies the partition ID on a leaf node that processed the batch. This value may be NULL if BATCH_SOURCE_PARTITION_ID is also NULL. The combination of a batch’s HOST, PORT, and PARTITION identifies the specific leaf node partition that attempted to load batch data.
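As a rough illustration of how the columns above fit together, the following Python sketch works on rows that have already been fetched from the view into dictionaries keyed by column name. How the rows are fetched is left out, and the helper names and the sample values are hypothetical; only the column names and the ERROR_KIND values come from the table.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical sample rows; real rows would come from a query against
# information_schema.PIPELINES_ERRORS and contain all columns.
sample_rows = [
    {"ERROR_UNIX_TIMESTAMP": 1709510400.0, "ERROR_KIND": "Load",
     "HOST": "10.0.0.5", "PORT": 3307, "PARTITION": 4},
    {"ERROR_UNIX_TIMESTAMP": 1709510460.5, "ERROR_KIND": "Load",
     "HOST": "10.0.0.6", "PORT": 3307, "PARTITION": 1},
    {"ERROR_UNIX_TIMESTAMP": 1709510520.0, "ERROR_KIND": "Extract",
     "HOST": None, "PORT": None, "PARTITION": None},
]


def error_time_utc(row):
    """Convert ERROR_UNIX_TIMESTAMP to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(float(row["ERROR_UNIX_TIMESTAMP"]), tz=timezone.utc)


def leaf_partition(row):
    """Return (HOST, PORT, PARTITION), or None if any of the three is NULL."""
    key = (row.get("HOST"), row.get("PORT"), row.get("PARTITION"))
    return None if any(part is None for part in key) else key


def count_by_kind(rows):
    """Count error events per ERROR_KIND (Internal, Extract, Transform, Load)."""
    return Counter(row["ERROR_KIND"] for row in rows)
```

Grouping by ERROR_KIND first is one reasonable way to triage, since the kind indicates which other columns are likely to be populated: STD_ERROR for Transform, and LOAD_DATA_LINE with LOAD_DATA_LINE_NUMBER for Load.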

Last modified: March 4, 2024
