INFER TABLE
Creates a DDL definition for a table based on input files and returns a CREATE TABLE statement that can be used to create a table to store the data from the files. The CREATE TABLE statement returned by INFER TABLE can be reviewed, edited, and then used to create the required table.
Syntax
INFER TABLE AS LOAD DATA {input_configuration}
    [FORMAT [CSV | JSON | AVRO | PARQUET | ICEBERG]]
    [AS JSON]
Remarks
- input_configuration specifies the configuration for loading files from Apache Kafka, Amazon S3, a local filesystem, Microsoft Azure, HDFS, and Google Cloud Storage. Refer to CREATE PIPELINE for more information on configuration specifications.
- All options supported by CREATE PIPELINE are supported by INFER TABLE.
- CSV, JSON, Avro, Parquet, and Iceberg formats are supported.
- The default format is CSV.
- TEXT and ENUM types use the utf8mb4 charset and the utf8mb4_bin collation by default.
- The AS JSON keyword is used to produce pipeline and table definitions in JSON format.
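The optional clauses can be illustrated with two variations on the statement used in the Example section. This is a minimal sketch, not output-verified: the bucket path, region, and credential placeholders are taken from that example, and the object name books.csv is a hypothetical placeholder.

```sql
-- Variation 1: append AS JSON (see Syntax) to request the inferred
-- definition in JSON format instead of a CREATE TABLE statement.
INFER TABLE AS LOAD DATA S3 's3://data_folder/books.avro'
CONFIG '{"region":"<region_name>"}'
CREDENTIALS '{"aws_access_key_id":"<your_access_key_id>","aws_secret_access_key":"<your_secret_access_key>","aws_session_token":"<your_session_token>"}'
FORMAT AVRO
AS JSON;

-- Variation 2: omit the FORMAT clause. Per the Remarks, the input is
-- then treated as CSV. books.csv is a hypothetical object name.
INFER TABLE AS LOAD DATA S3 's3://data_folder/books.csv'
CONFIG '{"region":"<region_name>"}'
CREDENTIALS '{"aws_access_key_id":"<your_access_key_id>","aws_secret_access_key":"<your_secret_access_key>","aws_session_token":"<your_session_token>"}';
```

Stating FORMAT explicitly, even for CSV, keeps the statement self-describing when it is reused later.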
Example
The following example demonstrates how to use the INFER TABLE command to infer the schema of an Avro-formatted file in an AWS S3 bucket.
This example uses data that conforms to the following schema of the books table.
{"namespace": "books.avro",
"type": "record",
"name": "Book",
"fields": [
{"name": "id", "type": "int"},
{"name": "name", "type": "string"},
{"name": "num_pages", "type": "int"},
{"name": "rating", "type": "double"},
{"name": "publish_timestamp", "type": "long",
"logicalType": "timestamp-micros"} ]}
Refer to Generate an Avro File for an example of generating an Avro file that conforms to this schema.
The following command reads the specified Avro file, infers the table definition, and returns the inferred schema as a CREATE TABLE statement:
INFER TABLE AS LOAD DATA S3 's3://data_folder/books.avro'
CONFIG '{"region":"<region_name>"}'
CREDENTIALS '{"aws_access_key_id":"<your_access_key_id>","aws_secret_access_key":"<your_secret_access_key>","aws_session_token":"<your_session_token>"}'
FORMAT AVRO;
"CREATE TABLE `infer_example_table` (
`id` int(11) NOT NULL,
`name` longtext CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`num_pages` int(11) NOT NULL,
`rating` double NULL,
`publish_date` bigint(20) NOT NULL)"
Refer to Schema and Pipeline Inference - Examples for more examples.
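As a sketch of the review-and-edit step described at the top of this page, the returned statement could be run after removing the surrounding quotation marks. The only edit assumed here is renaming the table to books, a hypothetical name; the column definitions are copied unchanged from the sample output above.

```sql
-- Statement copied from the sample INFER TABLE output.
-- Assumed edit: table renamed from infer_example_table to books (hypothetical).
CREATE TABLE `books` (
  `id` int(11) NOT NULL,
  `name` longtext CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
  `num_pages` int(11) NOT NULL,
  `rating` double NULL,
  `publish_date` bigint(20) NOT NULL
);
```

Column types, nullability, and keys can be adjusted in the same way before the statement is run.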
Last modified: November 14, 2025