# CREATE INFERRED PIPELINE

Infers the schema from the input files and creates a table and a pipeline based on the inferred DDL.

`CREATE INFERRED PIPELINE` now supports Kafka Connect data sources, which allow you to use existing Kafka Connect source connectors to stream data from external systems into SingleStore. When used with Kafka Connect, this command automatically creates a table with a predefined static schema containing three columns: `topic` (`TEXT`), `id` (`JSON`), and `record` (`JSON`). Refer to Kafka Connect Pipelines for more information.

## Syntax

```sql
CREATE INFERRED PIPELINE AS LOAD DATA
{input_configuration | kafkaconnect_configuration}
[FORMAT [CSV | JSON | AVRO | PARQUET | ICEBERG]]
[AS JSON];
```

## Remarks

* The `input_configuration` specifies the configuration for loading files from Apache Kafka, Amazon S3, a local filesystem, Microsoft Azure, HDFS, and Google Cloud Storage. Refer to `CREATE PIPELINE` for more information on configuration specifications.
* The `kafkaconnect_configuration` specifies the configuration for loading data using Kafka Connect source connectors:

  ```sql
  KAFKACONNECT CONFIG CREDENTIALS
  ```

* All options supported by `CREATE PIPELINE` are supported by `CREATE INFERRED PIPELINE`.
* CSV, JSON, Avro, Parquet, and Iceberg formats are supported. The default format is CSV. Kafka Connect Pipelines support only the Avro format.
* `TEXT` and `ENUM` types use the `utf8mb4` charset and the `utf8mb4_bin` collation by default.
* The `AS JSON` keyword produces the pipeline and table definitions in JSON format.
* Refer to the [Permissions Matrix](https://docs.singlestore.com/db/v9.1/reference/sql-reference/security-management-commands/permissions-matrix.md) for the required permissions.

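To picture the static schema created for Kafka Connect sources, the following is a minimal sketch and not the exact DDL that SingleStore generates. Only the three column names and types come from the description above. The table name `kc_events`, the field name `name` inside `record`, and the column nullability are assumptions made for illustration.

```sql
-- Hypothetical stand-in for the table created for a Kafka Connect source.
-- Only the column names and types are taken from this page; the table name
-- and all other details are assumptions.
CREATE TABLE kc_events (
    topic  TEXT,
    id     JSON,
    record JSON
);

-- Read one field from the JSON payload. The field name 'name' is
-- hypothetical and depends on what the connector emits.
SELECT topic,
       JSON_EXTRACT_STRING(record, 'name') AS name
FROM kc_events
ORDER BY topic;
```

Because `id` and `record` are JSON columns, individual fields are read with JSON functions at query time instead of being exposed as typed columns.
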
## Example

The following example demonstrates how to use the `CREATE INFERRED PIPELINE` command to infer the schema of an Avro-formatted file in an AWS S3 bucket.

This example uses data that conforms to the schema of the `books` table, shown below.

```
{"namespace": "books.avro",
 "type": "record",
 "name": "Book",
 "fields": [
     {"name": "id", "type": "int"},
     {"name": "name", "type": "string"},
     {"name": "num_pages", "type": "int"},
     {"name": "rating", "type": "double"},
     {"name": "publish_timestamp", "type": "long",
      "logicalType": "timestamp-micros"} ]}
```

Refer to [Generate an Avro File](https://docs.singlestore.com/db/v9.1/load-data/about-singlestore-pipelines/pipeline-concepts/schema-and-pipeline-inference/#section-idm4572725773489634329890353354.md) for an example of generating an Avro file that conforms to this schema.

The following example creates a pipeline named `books_pipe` by inferring the schema from the specified file. This command also creates a table with the same name as the pipeline. The pipeline is not started automatically, so that the pipeline and table definitions can be reviewed and adjusted as required.

```sql
CREATE INFERRED PIPELINE books_pipe AS LOAD DATA S3
's3://data-folder/books.avro'
CONFIG '{"region":""}'
CREDENTIALS '{
    "aws_access_key_id":"",
    "aws_secret_access_key":"",
    "aws_session_token":""}'
FORMAT AVRO;
```

```output
Created 'books_pipe' table and 'books_pipe' pipeline
```

Run the `SHOW CREATE PIPELINE` command to view the `CREATE PIPELINE` statement for the pipeline created by the `CREATE INFERRED PIPELINE` command.

```sql
SHOW CREATE PIPELINE books_pipe;
```

```output
Pipeline,Create Pipeline
books_pipe,"CREATE PIPELINE `books_pipe`
AS LOAD DATA S3 's3://data-folder/books.avro'
CONFIG '{\""region\"":\""us-west-2\""}'
CREDENTIALS
BATCH_INTERVAL 2500
DISABLE OUT_OF_ORDER OPTIMIZATION
DISABLE OFFSETS METADATA GC
INTO TABLE `books_pipe`
FORMAT AVRO(
    `books_pipe`.`id` <- `id`,
    `books_pipe`.`name` <- `name`,
    `books_pipe`.`num_pages` <- `num_pages`,
    `books_pipe`.`rating` <- `rating`,
    `books_pipe`.`publish_date` <- `publish_date`)"
```

Run the `SHOW CREATE TABLE` command to view the `CREATE TABLE` statement for the table created by the `CREATE INFERRED PIPELINE` command.

```sql
SHOW CREATE TABLE books_pipe;
```

```output
Table,Create Table
books_pipe,"CREATE TABLE `books_pipe` (
  `id` int(11) NOT NULL,
  `name` longtext CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
  `num_pages` int(11) NOT NULL,
  `rating` double DEFAULT NULL,
  `publish_date` bigint(20) NOT NULL,
  SORT KEY `__UNORDERED` (),
  SHARD KEY ()
) AUTOSTATS_CARDINALITY_MODE=INCREMENTAL AUTOSTATS_HISTOGRAM_MODE=CREATE AUTOSTATS_SAMPLING=ON SQL_MODE='STRICT_ALL_TABLES,NO_AUTO_CREATE_USER'"
```

The pipeline and table definitions can be adjusted using the `CREATE OR REPLACE PIPELINE` (refer to `CREATE PIPELINE`) and `ALTER TABLE` commands, respectively.

Once the pipeline and table definitions are configured, start the pipeline.

```sql
START PIPELINE books_pipe FOREGROUND;
```

This command starts the pipeline in the foreground and displays any errors in the client. For pipelines that run continuously, start them in the background by omitting the `FOREGROUND` keyword. Refer to `START PIPELINE` for more information.

Verify that the data is loaded.

```sql
SELECT * FROM books_pipe ORDER BY id;
```

```output
+----+--------------------+-----------+--------+------------------+
| id | name               | num_pages | rating | publish_date     |
+----+--------------------+-----------+--------+------------------+
|  1 | HappyPlace         |       400 |    4.9 | 1680721200000000 |
|  2 | Legends & Lattes   |       304 |    4.9 | 1669665600000000 |
|  3 | The Vanishing Half |       352 |    4.9 | 1591124400000000 |
+----+--------------------+-----------+--------+------------------+
```

Refer to [Schema and Pipeline Inference - Examples](https://docs.singlestore.com/db/v9.1/load-data/about-singlestore-pipelines/pipeline-concepts/schema-and-pipeline-inference/#section-idm4656158159921634329807205399.md) for more examples.

Refer to Example: Amazon Kinesis Pipeline for a Kafka Connect Pipelines example.

***

Modified at: February 5, 2026

Source: [/db/v9.1/reference/sql-reference/pipelines-commands/create-inferred-pipeline/](https://docs.singlestore.com/db/v9.1/reference/sql-reference/pipelines-commands/create-inferred-pipeline/)