Debugging Pipeline Errors
On this page
Viewing pipeline errors
The Data Ingest provide information about pipeline errors that have occurred.
Query the information_ schema. PIPELINES_ ERRORS
table
You can run the following query to show all errors that have occurred, per database, per pipeline, per batch, and per partition.
SELECT DATABASE_NAME, PIPELINE_NAME, BATCH_ID, PARTITION, BATCH_SOURCE_PARTITION_ID,ERROR_KIND, ERROR_CODE, ERROR_MESSAGE, LOAD_DATA_LINE_NUMBER, LOAD_DATA_LINEFROM information_schema.PIPELINES_ERRORS;
Query files that were skipped
The query in the previous section does not show files that were skipped because they had errors.
SELECT * FROM information_schema.PIPELINES_FILES WHERE FILE_STATE = 'Skipped';
If you need additional information, such as the database, the partition, the error that was generated and the line of the error file or object that caused the issue, run the following query.
SELECT pe.DATABASE_NAME, pe.PIPELINE_NAME, pe.BATCH_ID, pe.PARTITION,pe.BATCH_SOURCE_PARTITION_ID, pe.ERROR_TYPE, pe.ERROR_KIND, pe.ERROR_CODE, pe.ERROR_MESSAGE,pe.LOAD_DATA_LINE_NUMBER, pe.LOAD_DATA_LINEFROM information_schema.PIPELINES_FILES pf, information_schema.PIPELINES_ERRORS peWHERE pe.BATCH_SOURCE_PARTITION_ID = pf.FILE_NAME and pf.FILE_STATE = 'Skipped';
Use CREATE PIPELINE… FORCE to Collect More Logs
When the CREATE PIPELINE
statement fails, more information can be found with the SHOW WARNINGS
command.
If SHOW WARNINGS
does not reveal enough information, the pipeline can be created using the FORCE option and the errors can be fixed later.information_
table.
CREATE PIPELINE <pipeline_name> FORCE AS LOAD DATA <data_source>INTO TABLE <table_name>;
Use SELECT *
to view the pipeline errors.
SELECT * FROM information_schema.pipelines_errors;
| DATABASE_NAME | PIPELINE_NAME | ERROR_UNIX_TIMESTAMP | ERROR_TYPE | ERROR_CODE | ERROR_MESSAGE|
Logging to Show More Information with Failures
Some data sources have debug logging availability which provides additional information along with the errors.
Use the following engine variable and command in the order they appear:
-
SET GLOBAL pipelines_
- this engine variable will set debug logging for pipelines if available.extractor_ debug_ logging=ON -
FLUSH EXTRACTOR POOLS
- Flush extractor pools to remove any cached extractors.
Larger errors in the information_
table will show more context about the cause of the failure.
Debug Running Pipelines
There are six tables in the information_
database that relate to pipelines:
-
PIPELINES - stores high-level information about any pipelines.
-
PIPELINES_
BATCHES_ SUMMARY - contains high-level information about individual batches as they are loaded into the database. -
PIPELINES_
BATCHES - contains detailed, low-level information about individual batches as they are loaded into a database. -
PIPELINES_
ERRORS - contains detailed information about errors that occurred during extraction, transformation, or loading. Each row represents a single error event. -
PIPELINES_
CURSORS - contains information about a pipeline’s offset ranges. -
PIPELINES_
FILES - stores information about files that have been extracted from a file system-like data source, such as Amazon S3.
Last modified: October 7, 2024