View and Handle Pipeline Errors
Concepts
This topic requires an understanding of pipeline batches, which are explained in The Lifecycle of a Pipeline.
View pipeline errors
The pipelines tables in information_schema provide information about pipeline errors that have occurred. Some useful queries against these tables are provided in this section.
Query the information_schema.PIPELINES_ERRORS table
You can run the following query to show all errors that have occurred, per database, per pipeline, per batch, and per partition.
SELECT DATABASE_NAME, PIPELINE_NAME, BATCH_ID, PARTITION, BATCH_SOURCE_PARTITION_ID, ERROR_KIND, ERROR_CODE, ERROR_MESSAGE, LOAD_DATA_LINE_NUMBER, LOAD_DATA_LINE FROM information_schema.PIPELINES_ERRORS;
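When the query above returns many rows, it can help to see which pipelines and error codes dominate. The following is a minimal sketch, assuming the rows have already been fetched by a client of your choice into dictionaries keyed by column name. The helper name `summarize_errors` and the sample rows (including the error code values) are hypothetical and exist only for illustration.

```python
from collections import Counter


def summarize_errors(rows):
    """Count error rows per (database, pipeline, error code).

    `rows` is an iterable of dicts keyed by PIPELINES_ERRORS column names.
    Returns a list of ((database, pipeline, code), count) pairs,
    sorted by descending count.
    """
    counts = Counter(
        (row["DATABASE_NAME"], row["PIPELINE_NAME"], row["ERROR_CODE"])
        for row in rows
    )
    return counts.most_common()


# Hypothetical sample rows; real rows come from the query above.
sample_rows = [
    {"DATABASE_NAME": "db1", "PIPELINE_NAME": "p1", "ERROR_CODE": 1001},
    {"DATABASE_NAME": "db1", "PIPELINE_NAME": "p1", "ERROR_CODE": 1001},
    {"DATABASE_NAME": "db1", "PIPELINE_NAME": "p2", "ERROR_CODE": 1002},
]

summary = summarize_errors(sample_rows)
for (database, pipeline, code), count in summary:
    print(database, pipeline, code, count)
```

Grouping on the client keeps the original query unchanged; the same counts could also be produced with a GROUP BY in SQL.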
Query files that were skipped
The query in the previous section does not show files that were skipped because they had errors. To return the skipped files per database and per pipeline (but not per batch or per partition), run the following query.
SELECT * FROM information_schema.PIPELINES_FILES WHERE FILE_STATE = 'Skipped';
If you need additional information, such as the database, the partition, the error that was generated, and the line of the file or object that caused the issue, run the following query.
SELECT pe.DATABASE_NAME, pe.PIPELINE_NAME, pe.BATCH_ID, pe.PARTITION, pe.BATCH_SOURCE_PARTITION_ID, pe.ERROR_TYPE, pe.ERROR_KIND, pe.ERROR_CODE, pe.ERROR_MESSAGE, pe.LOAD_DATA_LINE_NUMBER, pe.LOAD_DATA_LINE FROM information_schema.PIPELINES_FILES pf, information_schema.PIPELINES_ERRORS pe WHERE pe.BATCH_SOURCE_PARTITION_ID = pf.FILE_NAME and pf.FILE_STATE = 'Skipped';
Address specific errors
The following table lists errors that can occur when running a pipeline statement, such as CREATE PIPELINE, and errors that can occur while a pipeline is extracting, shaping, and loading data.
Error | Resolution |
---|---|
You get a syntax error when running | Both |
You receive error | The master aggregator likely cannot connect to the pipeline's data source. Check the connection parameters, such as |
| The bucket name is case-sensitive. Verify that the case of the bucket name specified in your |
Error | This error can occur when a pipeline attempts to run a transform. Check the following: 1. Verify that the first line of your transform contains a shebang. This specifies the interpreter (such as Python) to use to execute the script. 2. Is the interpreter (such as Python) installed on all leaves? 3. If the transform was written on a Windows machine, do the newlines use |
| An incorrect path to the transform was likely specified. If the path to the transform is correct, then running |
Error: | This error may occur when the default value (8MB) for the engine variable |
A parsing error occurs in your transform. | To debug your transform, you can run |
Rename a table referenced by a pipeline
Attempting to rename a table that is referenced by a pipeline results in the following error:
ERROR 1945 ER_CANNOT_DROP_REFERENCED_BY_PIPELINE: Cannot rename table because it is referenced by pipeline <pipeline_name>
The following sequence demonstrates how to rename a table that is referenced by a pipeline:
1. Save your pipeline settings:
SHOW CREATE PIPELINE <pipeline_name> EXTENDED;
2. Stop the pipeline:
STOP PIPELINE <pipeline_name>;
3. Drop the pipeline:
DROP PIPELINE <pipeline_name>;
4. Change the name of the table:
ALTER TABLE <old_table_name> RENAME <new_table_name>;
5. Recreate the pipeline with the settings obtained in step 1, changing the table name to the new table name.
6. Start the pipeline:
START PIPELINE <pipeline_name>;
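The rename sequence above can be scripted so that the statements always run in the same order. The following is a minimal sketch that only builds the statement list; it does not connect to a database. The function name `rename_table_statements`, the example identifiers, and the placeholder for the recreated pipeline definition are all hypothetical. In practice, the first statement must be run and its output edited by hand before the remaining statements are issued.

```python
def rename_table_statements(pipeline, old_table, new_table, recreate_sql):
    """Return the statements for renaming a pipeline-referenced table, in order.

    `recreate_sql` is the CREATE PIPELINE statement saved from the first
    statement's output, already edited to reference `new_table`.
    """
    return [
        f"SHOW CREATE PIPELINE {pipeline} EXTENDED;",    # save the settings
        f"STOP PIPELINE {pipeline};",                    # stop the pipeline
        f"DROP PIPELINE {pipeline};",                    # drop the pipeline
        f"ALTER TABLE {old_table} RENAME {new_table};",  # rename the table
        recreate_sql,                                    # recreate the pipeline
        f"START PIPELINE {pipeline};",                   # start the pipeline
    ]


# Hypothetical identifiers; the recreate statement is left as a placeholder.
statements = rename_table_statements(
    pipeline="orders_pipeline",
    old_table="orders",
    new_table="orders_v2",
    recreate_sql="<edited CREATE PIPELINE statement from step 1>",
)
for statement in statements:
    print(statement)
```

The identifiers are interpolated without quoting or validation, so this sketch is suitable only for trusted, hand-entered names.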
Pipeline errors that are handled automatically
Typical error handling scenario
In most situations, an error that occurs while a pipeline is running is handled in this way:
If an error occurs while a batch b is running, then b will fail and b's transaction rolls back. Then b is retried at most pipelines_max_retries_per_batch_partition times. If all of the retries are unsuccessful and pipelines_stop_on_error is set to ON, the pipeline stops. Otherwise, the pipeline continues and processes a new batch nb, which processes the same files or objects that b attempted to process, excluding any files or objects that may have caused the error.
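The control flow of this scenario can be modeled in a few lines. The following is a simplified sketch, not the engine's implementation: the function `run_batch` and its parameters are hypothetical stand-ins for the pipelines_max_retries_per_batch_partition and pipelines_stop_on_error settings, the transaction rollback is not modeled, and it assumes that "retried at most N times" means one initial attempt followed by up to N retries.

```python
def run_batch(attempt_batch, max_retries, stop_on_error):
    """Model the typical error handling scenario for a single batch.

    `attempt_batch` is a callable returning True when the batch loads
    successfully and False when it fails.
    Returns (outcome, attempts).
    """
    attempts = 0
    # One initial attempt, then at most `max_retries` retries.
    for _ in range(1 + max_retries):
        attempts += 1
        if attempt_batch():
            return "loaded", attempts
        # On failure the batch's transaction would roll back (not modeled).
    if stop_on_error:
        return "pipeline stopped", attempts
    # The pipeline moves on to a new batch that excludes the failing files.
    return "continue with new batch", attempts


# A batch that fails twice and then succeeds.
outcomes = iter([False, False, True])
recovered = run_batch(lambda: next(outcomes), max_retries=4, stop_on_error=True)

# A batch that always fails, with and without stop-on-error behavior.
stopped = run_batch(lambda: False, max_retries=4, stop_on_error=True)
skipped = run_batch(lambda: False, max_retries=4, stop_on_error=False)
```

In this model, a batch that never succeeds is attempted `1 + max_retries` times before the stop-on-error setting decides whether the pipeline stops or continues.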
The following table lists events, which may or may not cause errors, and how the events are handled.
Event | How the Event is Handled |
---|---|
The pipeline cannot access a file or object. | The typical error handling scenario (mentioned earlier in this topic) applies. |
The pipeline cannot read a file or object because it is corrupted. | The typical error handling scenario (mentioned earlier in this topic) applies. After fixing the issue with the corrupted file/object, you can have the pipeline reprocess the file/object by running |
A file or object is removed from the filesystem after the batch has started processing the file/object. | The batch does not fail; the file or object is processed. |
A file is removed from the filesystem (or an object is removed from an object store) after the pipeline registers the file/object in | The typical error handling scenario (mentioned earlier in this topic) applies. |
The cluster restarts while the batch is being processed. | The typical error handling scenario (mentioned earlier in this topic) applies. Once the cluster is online, |
A leaf node is unavailable before the pipeline starts. | This does not cause the pipeline to fail. The pipeline will not ingest any data to the unavailable leaf node. |
A leaf node fails while the pipeline is running. | The batch fails. The batch is retried as described in the typical error handling scenario; that batch and all future batches no longer attempt to load data to the unavailable leaf node. |
An aggregator fails while the pipeline is running. | The batch fails. When the aggregator is available, the batch is retried as described in the typical error handling scenario. |
The pipeline reaches the allocated storage space for errors. | The pipeline pauses. How to address the issue: 1) Increase the value of the |