# INFER PIPELINE

Generates DDL for a pipeline and its target table by inspecting the input files. The command returns `CREATE TABLE` and `CREATE PIPELINE` statements that can be reviewed, edited, and then run to create the required table and pipeline. Use this command to view the inferred DDL before creating anything.

## Syntax

```sql
INFER PIPELINE AS LOAD DATA {input_configuration}
     [FORMAT [CSV | JSON | AVRO | PARQUET | ICEBERG]]
     [AS JSON]

```

## Remarks

* The `input_configuration` specifies configuration for loading files from Apache Kafka, Amazon S3, a local filesystem, Microsoft Azure, HDFS, and Google Cloud Storage. Refer to `CREATE PIPELINE` for more information on configuration specifications.
* All options supported by `CREATE PIPELINE` are supported by `INFER PIPELINE`.
* CSV, JSON, Avro, Parquet, and Iceberg formats are supported.
* The default format is CSV.
* `TEXT` and `ENUM` types use `utf8mb4` charset and `utf8mb4_bin` collation by default.
* The `AS JSON` clause returns the pipeline and table definitions in JSON format.

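As a rough illustration of how the optional clauses fit together, the sketch below requests JSON output for an Avro file. The bucket path and the region and credential placeholders are hypothetical values used only for illustration, and the clause order simply follows the syntax shown above.

```sql
-- Hypothetical bucket and placeholder credentials; replace with your own values.
INFER PIPELINE AS LOAD DATA S3
    's3://data-folder/books.avro'
CONFIG '{"region":"<region_name>"}'
CREDENTIALS '{
    "aws_access_key_id":"<your_access_key_id>",
    "aws_secret_access_key":"<your_secret_access_key>",
    "aws_session_token":"<your_session_token>"}'
FORMAT AVRO   -- FORMAT defaults to CSV when this clause is omitted
AS JSON;      -- return the definitions as JSON instead of DDL text
```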
> **📝 Note**: If the encoding of the source CSV file is not `utf8mb4`, multi-byte characters in the source file may be replaced with their corresponding single-byte counterparts in the inferred table. This results in incorrect header inference for the inferred table.
>
> To change the encoding of the source CSV file to `utf8mb4` on a Linux machine, run the following commands:
>
> 1) Determine the current encoding of the CSV file.
>    ```
>    file -i input.csv
>    ```
>
> 2) Convert the file data into `utf8mb4` encoded data.
>    ```
>    iconv -f <input-encoding> -t UTF-8 input.csv -o output.csv
>    ```
>
> Run the `INFER PIPELINE` query on the `output.csv` file to get the correct inference.

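Once the file has been converted, the inference can be run against it. The sketch below assumes that `output.csv` was written to a hypothetical directory `/data` that the SingleStore nodes can read, and that the `FS` (local filesystem) source is configured as described in `CREATE PIPELINE`.

```sql
-- '/data/output.csv' is a hypothetical path; point it at the converted file.
INFER PIPELINE AS LOAD DATA FS '/data/output.csv'
FORMAT CSV;  -- CSV is also the default when FORMAT is omitted
```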
## Example

The following example demonstrates how to use the `INFER PIPELINE` command to infer the schema of an Avro-formatted file in an AWS S3 bucket.

This example uses data that conforms to the following Avro schema for the `books` table.

```
{"namespace": "books.avro",
"type": "record",
"name": "Book",
"fields": [
{"name": "id", "type": "int"},
{"name": "name", "type": "string"},
{"name": "num_pages", "type": "int"},
{"name": "rating", "type": "double"},
{"name": "publish_timestamp", "type": "long",
"logicalType": "timestamp-micros"} ]}

```

Refer to [Generate an Avro File](https://docs.singlestore.com/db/v9.1/load-data/about-singlestore-pipelines/pipeline-concepts/schema-and-pipeline-inference/#section-idm4572725773489634329890353354.md) for an example of generating an Avro file that conforms to this schema.

The following example generates a table and pipeline definition by scanning the specified Avro file and inferring the schema from selected rows. The output is displayed in query definition format.

```sql
INFER PIPELINE AS LOAD DATA S3
        's3://data-folder/books.avro'
CONFIG '{"region":"<region_name>"}'
CREDENTIALS '{
    "aws_access_key_id":"<your_access_key_id>",
    "aws_secret_access_key":"<your_secret_access_key>",
    "aws_session_token":"<your_session_token>"}'
FORMAT AVRO;

```

```output

"CREATE TABLE `infer_example_table` (
    `id` int(11) NOT NULL,
    `name` longtext CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
    `num_pages` int(11) NOT NULL,
    `rating` double NULL,
    `publish_date` bigint(20) NOT NULL);
CREATE PIPELINE `infer_example_pipeline`
AS LOAD DATA S3 's3://data-folder/books.avro'
CONFIG '{\""region\"":\""us-west-2\""}'
CREDENTIALS '{\n    \""aws_access_key_id\"":\""your_access_key_id\"",
\n    \""aws_secret_access_key\"":\""your_secret_access_key\"",
\n    \""aws_session_token\"":\""your_session_token\""}'
BATCH_INTERVAL 2500
DISABLE OUT_OF_ORDER OPTIMIZATION
DISABLE OFFSETS METADATA GC
INTO TABLE `infer_example_table`
FORMAT AVRO(
    `id` <- `id`,
    `name` <- `name`,
    `num_pages` <- `num_pages`,
    `rating` <- `rating`,
    `publish_date` <- `publish_date`);"

```
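After reviewing the returned statements, a typical next step is to run them and then start the pipeline. The following is a minimal sketch, assuming the `CREATE TABLE` and `CREATE PIPELINE` statements above were executed unchanged (with the surrounding quotes and escape characters removed), so that `infer_example_table` and `infer_example_pipeline` exist. `TEST PIPELINE` and `START PIPELINE` are separate pipeline commands, not part of `INFER PIPELINE`.

```sql
-- Optionally preview what the pipeline would extract, without writing to the table.
TEST PIPELINE infer_example_pipeline LIMIT 1;

-- Begin loading data from the S3 source into the inferred table.
START PIPELINE infer_example_pipeline;

-- Spot-check the loaded rows.
SELECT * FROM infer_example_table LIMIT 5;
```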

Refer to [Schema and Pipeline Inference - Examples](https://docs.singlestore.com/db/v9.1/load-data/about-singlestore-pipelines/pipeline-concepts/schema-and-pipeline-inference/#section-idm4656158159921634329807205399.md) for more examples.

***

Modified at: November 14, 2025

Source: [/db/v9.1/reference/sql-reference/pipelines-commands/infer-pipeline/](https://docs.singlestore.com/db/v9.1/reference/sql-reference/pipelines-commands/infer-pipeline/)

