Pipeline Built-in Functions

SingleStore Helios provides two built-in functions for use with pipelines to help load data. These functions can be used in the SET clause of a CREATE PIPELINE statement.

pipeline_source_file()

Pipelines can persist the name of the source file for each row by using the pipeline_source_file() function. Use this function in the SET clause to set a table column to the name of the pipeline data source file.

For example, given the table definition CREATE TABLE b(isbn NUMERIC(13), title VARCHAR(50));, use the following statement to set the title column to the source file name while ingesting data from AWS S3.

CREATE PIPELINE books AS
LOAD DATA S3 's3://<bucket_name>/Books/'
CONFIG '{"region":"us-west-2"}'
CREDENTIALS '{"aws_access_key_id": "<access_key_id>",
              "aws_secret_access_key": "<secret_access_key>"}'
SKIP DUPLICATE KEY ERRORS
INTO TABLE b
(isbn)
SET title = pipeline_source_file();

For more information on using the pipeline_source_file() function to load data from AWS S3, refer to Load Data from Amazon Web Services (AWS) S3.
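Once the pipeline has run, the persisted file names can be queried directly. The following is a minimal sketch, assuming the table b and pipeline books from the example above have ingested at least one file:

```sql
-- Count how many rows were loaded from each source file.
-- Assumes the table b and pipeline books defined above.
SELECT title AS source_file, COUNT(*) AS rows_loaded
FROM b
GROUP BY title
ORDER BY rows_loaded DESC;
```

Note that the column storing the file name must be wide enough to hold the full source file path; if the S3 object keys are long, widen the VARCHAR accordingly.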

pipeline_batch_id()

Pipelines persist the ID of the batch used to load data with the pipeline_batch_id() built-in function. Use this function in the SET clause to set a table column to the ID of the batch used to load the data.

For example, given the table definition CREATE TABLE t(b_id INT, column_2 TEXT);, use this statement to load the batch ID into the b_id column:

CREATE PIPELINE p AS LOAD DATA ... INTO TABLE t(@b_id,column_2) ...
SET b_id = pipeline_batch_id();
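After rows are loaded, the persisted batch IDs can be used to see how data was distributed across batches. The following is a minimal sketch, assuming the table t from the example above has been populated:

```sql
-- Count rows loaded per batch.
-- Assumes table t populated by pipeline p as defined above.
SELECT b_id AS batch_id, COUNT(*) AS rows_in_batch
FROM t
GROUP BY b_id
ORDER BY batch_id;
```

These batch IDs can also be matched against the batch metadata that SingleStore exposes in information_schema (for example, the pipeline batch summary views) to correlate loaded rows with batch-level details such as state and timing.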

Last modified: November 27, 2024
