# CREATE INFERRED PIPELINE

Infers the schema from the input files, then creates a table and a pipeline based on the inferred DDL.

## Syntax

```sql
CREATE INFERRED PIPELINE <pipeline_name> AS LOAD DATA {input_configuration}
    [FORMAT [CSV | JSON | AVRO | PARQUET | ICEBERG]]
    [AS JSON];
```

## Remarks

* The `input_configuration` specifies the configuration for loading files from Apache Kafka, Amazon S3, a local filesystem, Microsoft Azure, HDFS, or Google Cloud Storage. Refer to `CREATE PIPELINE` for more information on configuration specifications.
* All options supported by `CREATE PIPELINE` are supported by `CREATE INFERRED PIPELINE`.
* CSV, JSON, Avro, Parquet, and Iceberg formats are supported.
* The default format is CSV.
* `TEXT` and `ENUM` types use `utf8mb4` charset and `utf8mb4_bin` collation by default.
* The `AS JSON` keyword produces the pipeline and table definitions in JSON format.
* Refer to the [Permissions Matrix](https://docs.singlestore.com/cloud/reference/sql-reference/security-management-commands/permissions-matrix.md) for the required permissions.

## Example

The following example demonstrates how to use the `CREATE INFERRED PIPELINE` command to infer the schema of an Avro-formatted file in an AWS S3 bucket.

This example uses data that conforms to the schema of the `books` table, shown below.

```
{"namespace": "books.avro",
 "type": "record",
 "name": "Book",
 "fields": [
     {"name": "id", "type": "int"},
     {"name": "name", "type": "string"},
     {"name": "num_pages", "type": "int"},
     {"name": "rating", "type": "double"},
     {"name": "publish_date", "type": "long", "logicalType": "timestamp-micros"}
 ]}
```

Refer to [Generate an Avro File](https://docs.singlestore.com/cloud/load-data/about-singlestore-pipelines/pipeline-concepts/schema-and-pipeline-inference/#section-idm4572725773489634329890353354.md) for an example of generating an Avro file that conforms to this schema.

The following example creates a pipeline named `books_pipe` by inferring the schema from the specified file. This command also creates a table with the same name as the pipeline. The pipeline is not started automatically, so the pipeline and table definitions can be reviewed and adjusted as required.

```sql
CREATE INFERRED PIPELINE books_pipe AS LOAD DATA S3
    's3://data-folder/books.avro'
    CONFIG '{"region":""}'
    CREDENTIALS '{
        "aws_access_key_id":"",
        "aws_secret_access_key":"",
        "aws_session_token":""}'
    FORMAT AVRO;
```

```output
Created 'books_pipe' table and 'books_pipe' pipeline
```

Run the `SHOW CREATE PIPELINE` command to view the `CREATE PIPELINE` statement for the pipeline created by the `CREATE INFERRED PIPELINE` command.

```sql
SHOW CREATE PIPELINE books_pipe;
```

```output
Pipeline,Create Pipeline
books_pipe,"CREATE PIPELINE `books_pipe`
AS LOAD DATA S3 's3://data-folder/books.avro'
CONFIG '{\""region\"":\""us-west-2\""}'
CREDENTIALS <CREDENTIALS REDACTED>
BATCH_INTERVAL 2500
DISABLE OUT_OF_ORDER OPTIMIZATION
DISABLE OFFSETS METADATA GC
INTO TABLE `books_pipe`
FORMAT AVRO(
    `books_pipe`.`id` <- `id`,
    `books_pipe`.`name` <- `name`,
    `books_pipe`.`num_pages` <- `num_pages`,
    `books_pipe`.`rating` <- `rating`,
    `books_pipe`.`publish_date` <- `publish_date`)"
```

Run the `SHOW CREATE TABLE` command to view the `CREATE TABLE` statement for the table created by the `CREATE INFERRED PIPELINE` command.

```sql
SHOW CREATE TABLE books_pipe;
```

```output
Table,Create Table
books_pipe,"CREATE TABLE `books_pipe` (
  `id` int(11) NOT NULL,
  `name` longtext CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
  `num_pages` int(11) NOT NULL,
  `rating` double NOT NULL,
  `publish_date` bigint(20) NOT NULL,
  SORT KEY `__UNORDERED` (),
  SHARD KEY ()
) AUTOSTATS_CARDINALITY_MODE=INCREMENTAL AUTOSTATS_HISTOGRAM_MODE=CREATE AUTOSTATS_SAMPLING=ON SQL_MODE='STRICT_ALL_TABLES,NO_AUTO_CREATE_USER'"
```

The pipeline and table definitions can be adjusted using the `CREATE OR REPLACE PIPELINE` command (refer to `CREATE PIPELINE`) and the `ALTER TABLE` command, respectively.

Once the pipeline and table definitions are configured, start the pipeline.

```sql
START PIPELINE books_pipe FOREGROUND;
```

This command starts the pipeline in the foreground and displays any errors in the client. To run a pipeline continuously, start it in the background by omitting the `FOREGROUND` keyword. Refer to `START PIPELINE` for more information.

Verify that the data is loaded.

```sql
SELECT * FROM books_pipe ORDER BY id;
```

```output
+----+--------------------+-----------+--------+------------------+
| id | name               | num_pages | rating | publish_date     |
+----+--------------------+-----------+--------+------------------+
|  1 | HappyPlace         |       400 |    4.9 | 1680721200000000 |
|  2 | Legends & Lattes   |       304 |    4.9 | 1669665600000000 |
|  3 | The Vanishing Half |       352 |    4.9 | 1591124400000000 |
+----+--------------------+-----------+--------+------------------+
```

Refer to [Schema and Pipeline Inference - Examples](https://docs.singlestore.com/cloud/load-data/about-singlestore-pipelines/pipeline-concepts/schema-and-pipeline-inference/#section-idm4656158159921634329807205399.md) for more examples.

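The `AS JSON` keyword listed in the syntax can be appended to the same kind of statement to get the pipeline and table definitions in JSON format. The following is a minimal sketch, assuming the same S3 source as the example above. The pipeline name `books_pipe_json` is hypothetical, chosen only to avoid clashing with `books_pipe`, and the returned JSON is not reproduced here because this page does not specify its shape.

```sql
-- Sketch only: same source file and format as the books_pipe example.
-- `books_pipe_json` is a hypothetical name, not part of the original example.
-- The region and credential values are left empty, as in the example above.
CREATE INFERRED PIPELINE books_pipe_json AS LOAD DATA S3
    's3://data-folder/books.avro'
    CONFIG '{"region":""}'
    CREDENTIALS '{
        "aws_access_key_id":"",
        "aws_secret_access_key":"",
        "aws_session_token":""}'
    FORMAT AVRO
    AS JSON;  -- request the definitions in JSON format
```

Per the syntax, `AS JSON` comes last, after the optional `FORMAT` clause.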
***

Modified at: February 5, 2026

Source: [/cloud/reference/sql-reference/pipelines-commands/create-inferred-pipeline/](https://docs.singlestore.com/cloud/reference/sql-reference/pipelines-commands/create-inferred-pipeline/)