Load Data from Parquet Files

Parquet-formatted files can be loaded from the local filesystem or by using a pipeline. The basic syntax for each method is provided below.

Load Parquet Files from a Local Filesystem

Parquet-formatted data stored on the local filesystem can be loaded using a LOAD DATA query. This streamlines the process of loading Parquet data into tables. Other LOAD DATA clauses (SET, WHERE, etc.) are supported but not shown in the following syntax example.

LOAD DATA INFILE '<path_to_file/file_name>'
INTO TABLE <table_name>
(val1 <- source1,
val2 <- source2
[ ... ]
) [COMPRESSION { AUTO | NONE | LZ4 | GZIP }]
[ ... ]
FORMAT PARQUET;
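
As an illustration, the syntax above might be filled in as follows. This is a minimal sketch: the table `orders`, its columns, the file path `/tmp/orders.parquet`, and the Parquet field names `id` and `name` are all hypothetical. They are chosen only to show how each table column on the left of `<-` is paired with a field from the file on the right.

```sql
-- Hypothetical target table; column types should match the Parquet field types.
CREATE TABLE orders (
  order_id BIGINT,
  customer_name VARCHAR(100)
);

-- Map each table column (left of <-) to a Parquet field (right of <-).
-- The path and field names are assumptions for illustration.
LOAD DATA INFILE '/tmp/orders.parquet'
INTO TABLE orders
(order_id <- id,
 customer_name <- name
)
FORMAT PARQUET;
```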

Load Parquet Files Using a Pipeline

Parquet-formatted data stored in an AWS S3 bucket can be loaded using a LOAD DATA query with a pipeline.

LOAD DATA S3 '<bucket name>'
CONFIG '{"region" : "<region_name>"}'
CREDENTIALS '{"aws_access_key_id" : "<key_id>",
              "aws_secret_access_key": "<access_key>"}'
INTO TABLE <table_name>
(`<col_a>` <- %,
 `<col_b>` <- % DEFAULT NULL
) FORMAT PARQUET;
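
A filled-in version of the S3 syntax could look like the sketch below. The bucket path, region, table, column, and field names are hypothetical, and the credential placeholders are left as-is to be replaced with real values. The sketch assumes the named-field mapping form (`column <- field`) shown for local files also applies here, and that `DEFAULT NULL` assigns NULL when the mapped field is missing from the source data.

```sql
-- Hypothetical bucket path, region, and table; replace with real values.
LOAD DATA S3 'my-bucket/orders/orders.parquet'
CONFIG '{"region" : "us-east-1"}'
CREDENTIALS '{"aws_access_key_id" : "<key_id>",
              "aws_secret_access_key": "<access_key>"}'
INTO TABLE orders
(`order_id` <- id,
 -- Assumed behavior: falls back to NULL if the field is absent.
 `customer_name` <- name DEFAULT NULL
) FORMAT PARQUET;
```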

Last modified: September 26, 2023
