# Pipeline Built-in Functions

SingleStore provides three built-in functions for pipelines that help load data. Use these functions in the `SET` clause of the [CREATE PIPELINE](https://docs.singlestore.com/db/v9.1/reference/sql-reference/pipelines-commands/create-pipeline.md) statement.

## pipeline\_source\_file()

Pipelines persist the name of a file by using the `pipeline_source_file()` function. Use this function in the `SET` clause to set a table column to the name of the pipeline data source file.

For example, given the table definition `CREATE TABLE b(isbn NUMERIC(13), title VARCHAR(50));`, use the following statement to set the `title` column to the name of the source file while ingesting data from AWS S3.

```sql
CREATE PIPELINE books AS
LOAD DATA S3 's3://<bucket_name>/Books/'
CONFIG '{"region":"us-west-2"}'
CREDENTIALS '{"aws_access_key_id": "<access_key_id>",
             "aws_secret_access_key": "<secret_access_key>"}'
SKIP DUPLICATE KEY ERRORS
INTO TABLE b
(isbn)
SET title = pipeline_source_file();
```
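
After the pipeline runs, the stored file name can be used to trace each row back to its source file. The following is a minimal sketch, assuming the `books` pipeline above has been started and has loaded at least one file into `b`; the aliases `source_file` and `rows_loaded` are illustrative names only.

```sql
-- Start the pipeline defined above. It runs in the background,
-- so the query below returns rows only after a batch has loaded.
START PIPELINE books;

-- Count the rows loaded from each source file.
-- title holds the value returned by pipeline_source_file().
SELECT title AS source_file, COUNT(*) AS rows_loaded
FROM b
GROUP BY title
ORDER BY title;
```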

For more information on using the `pipeline_source_file()` function to load data from AWS S3, refer to [Load Data from Amazon Web Services (AWS) S3](https://docs.singlestore.com/db/v9.1/load-data/data-sources/load-data-from-amazon-web-services-aws-s-3/#section-idm4587277093680033616331086317.md).

## pipeline\_batch\_id()

Pipelines persist the ID of the batch used to load data with the `pipeline_batch_id()` built-in function. Use this function in the `SET` clause to set a table column to the ID of the batch used to load the data.

For example, given the table definition `CREATE TABLE t(b_id INT, column_2 TEXT);`, use this statement to load the batch ID into the `b_id` column:

```sql
CREATE PIPELINE p AS LOAD DATA ... INTO TABLE t(@b_id,column_2) ... 
SET b_id = pipeline_batch_id();
```
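
Storing the batch ID makes it possible to group or filter rows by the batch that loaded them. The following is a minimal sketch, assuming pipeline `p` has already loaded data into `t`; the alias `rows_loaded` and the batch ID `42` are hypothetical values chosen for illustration.

```sql
-- Summarize how many rows each pipeline batch loaded into t.
SELECT b_id, COUNT(*) AS rows_loaded
FROM t
GROUP BY b_id
ORDER BY b_id;

-- Inspect the rows loaded by a single batch (42 is a hypothetical batch ID).
SELECT b_id, column_2
FROM t
WHERE b_id = 42;
```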

## pipeline\_source\_metadata()

Pipelines persist metadata about the source file from which each row is ingested using the `pipeline_source_metadata()` function. Use this function in the `SET` clause to populate table columns. For each ingested row, this function returns the value of the specified metadata property associated with the source file.

The following are the supported metadata properties for each supported source:

| **Source** | **Supported Metadata Properties** |
| ---------- | --------------------------------- |
| S3         | `size`, `last_modified_timestamp`, `entity_tag`, `owner`, `storage_class`, `file_name` |
| GCS        | `size`, `last_modified_timestamp`, `entity_tag`, `owner`, `storage_class`, `file_name` |
| FS         | `size`, `last_modified_timestamp`, `file_name`, `is_directory`, `file_mode` |
| Azure      | `size`, `last_modified_timestamp`, `entity_tag`, `file_type`, `file_encoding`, `file_language`, `content_disposition`, `cache_control_settings`, `file_name`, `lease_status`, `lease_state`, `blob_type` |

> **📝 Note**: To store metadata values, the corresponding table columns in the `CREATE TABLE` statement must exist and must be of type `TEXT`.
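
Each property name is passed to `pipeline_source_metadata()` as a string argument, so the same pattern should apply to any source listed in the table. The following is a hedged sketch for the FS source: the table `fs_rows`, the pipeline `fs_pl`, and the `<path>` placeholder are hypothetical, and the column list assumes input files with a single field per row.

```sql
-- Hypothetical target table; metadata columns are TEXT, as the note requires.
CREATE TABLE fs_rows(payload TEXT, file_name TEXT,
    file_mode TEXT, is_directory TEXT);

-- Hypothetical pipeline over local files; '<path>' is a placeholder.
CREATE PIPELINE fs_pl
AS LOAD DATA FS '<path>'
INTO TABLE fs_rows(payload)
SET
  file_name    = pipeline_source_metadata("file_name"),
  file_mode    = pipeline_source_metadata("file_mode"),
  is_directory = pipeline_source_metadata("is_directory");
```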

For example, given the following table definition:

```sql
CREATE TABLE t(a TEXT, b TEXT, c TEXT,
    file_name TEXT, last_modified_timestamp TEXT, 
    size TEXT, owner TEXT, entity_tag TEXT, 
    storage_class TEXT);
```

Use the following statement to load source file metadata into the corresponding columns:

```sql
CREATE PIPELINE pl
AS LOAD DATA S3 '<path>'
CONFIG '<config>'
CREDENTIALS '<credentials>'
INTO TABLE t(a, b, c)
SET
  last_modified_timestamp = pipeline_source_metadata("last_modified_timestamp"),
  file_name               = pipeline_source_metadata("file_name"),
  entity_tag              = pipeline_source_metadata("entity_tag"),
  size                    = pipeline_source_metadata("size"),
  owner                   = pipeline_source_metadata("owner"),
  storage_class           = pipeline_source_metadata("storage_class");
```

After the pipeline runs, the target table includes the ingested data along with the metadata of the source file for each row.
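
To check the result, the metadata columns can be queried like any other column. The following is a minimal sketch, assuming pipeline `pl` has loaded data into `t` and that `size` holds a numeric string (its unit is not stated here); the aliases `size_value` and `rows_loaded` are illustrative names only.

```sql
-- Summarize the rows loaded from each source file.
-- size is stored as TEXT, so it is cast to an integer for numeric use.
SELECT file_name,
       CAST(size AS SIGNED) AS size_value,
       last_modified_timestamp,
       COUNT(*) AS rows_loaded
FROM t
GROUP BY file_name, size, last_modified_timestamp
ORDER BY file_name;
```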

***

Modified at: February 18, 2026

Source: [/db/v9.1/load-data/about-singlestore-pipelines/pipeline-concepts/pipeline-built-in-functions/](https://docs.singlestore.com/db/v9.1/load-data/about-singlestore-pipelines/pipeline-concepts/pipeline-built-in-functions/)

