EXTRACT PIPELINE … INTO OUTFILE
This command takes a sample of the data streaming into your pipeline and copies it into a file on disk.
Syntax
EXTRACT PIPELINE pipe_line [FROM 'source_partition' [OFFSETS start_offset TO end_offset]] INTO OUTFILE 'file_name'
Remarks
- pipe_line is the configured pipeline.
- file_name is the output file containing your sample data.
- source_partition is a source partition ID.
- start_offset and end_offset can be used to extract the exact range of sample data.
- This command causes implicit commits. Refer to COMMIT for more information.
- Refer to the Permission Matrix for the required permission.
Note
You cannot run EXTRACT PIPELINE when the pipeline is in a Running or Error state.
Return Type
A file containing transform data that can be used during debugging operations, for example by piping it into a transform script:
cat sample_output | python transform.py
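The command above pipes the extracted sample into a transform script through standard input. As a rough illustration of what such a script might look like, here is a minimal sketch of a hypothetical transform.py. It assumes the sample is newline-delimited text with tab-separated fields, which may not match your data; the whitespace-trimming logic is a placeholder for your own transform, not the behavior of any real pipeline.

```python
import sys


def transform(lines):
    """Yield one cleaned output line per non-empty input line."""
    for line in lines:
        record = line.rstrip("\n")
        if not record:
            continue  # skip blank lines
        # Placeholder logic (assumption): trim whitespace around each
        # tab-separated field. Replace with your real transform.
        fields = [field.strip() for field in record.split("\t")]
        yield "\t".join(fields)


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read records from stdin and write transformed records to stdout."""
    for out in transform(stdin):
        stdout.write(out + "\n")


if __name__ == "__main__":
    if sys.stdin.isatty():
        # No piped input: print a hint instead of waiting on the terminal.
        sys.stderr.write("usage: cat sample_output | python transform.py\n")
    else:
        main()
```

Keeping the logic in a separate transform() function makes it possible to call it directly from a debugger or a unit test with a handful of lines copied from the extracted file.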
Examples
The following saves random sample data.
EXTRACT PIPELINE p INTO OUTFILE 'transform_output';
The following is useful if there is a specific partition or file with a known problem.
EXTRACT PIPELINE p FROM '6' INTO OUTFILE 'transform_output';
The following extracts an exact range of offsets from a single partition, which is useful if the problematic data is in a specific, known Kafka region.
EXTRACT PIPELINE p FROM '10' OFFSETS 0 TO 6 INTO OUTFILE 'transform_output';
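The three statements above differ only in their optional clauses: OFFSETS is nested inside FROM, so an offset range can only be given together with a source partition. The following sketch shows that nesting with a hypothetical helper, build_extract_statement, that assembles the statement text. The helper name and its parameters are assumptions made for illustration; it does not escape or validate identifiers, so it is not suitable for untrusted input.

```python
def build_extract_statement(pipeline, outfile, partition=None, offsets=None):
    """Return EXTRACT PIPELINE statement text (illustrative helper only)."""
    if offsets is not None and partition is None:
        # The syntax nests OFFSETS inside the FROM clause.
        raise ValueError("OFFSETS requires a FROM 'source_partition' clause")

    parts = [f"EXTRACT PIPELINE {pipeline}"]
    if partition is not None:
        parts.append(f"FROM '{partition}'")
        if offsets is not None:
            start, end = offsets
            parts.append(f"OFFSETS {int(start)} TO {int(end)}")
    parts.append(f"INTO OUTFILE '{outfile}'")
    return " ".join(parts) + ";"


if __name__ == "__main__":
    # Reproduces the three example statements from this page.
    print(build_extract_statement("p", "transform_output"))
    print(build_extract_statement("p", "transform_output", partition="6"))
    print(build_extract_statement("p", "transform_output",
                                  partition="10", offsets=(0, 6)))
```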
Last modified: April 6, 2023