Stage
Stage is a storage service that helps you organize and manage local files for ingestion into your SingleStore Helios database(s).
Note
The workspace group must be running SingleStore version 8.
Manage a Stage
You can manage files and folders in a Stage using any of the following:
- Management API
- Notebooks
- SingleStore Python Client
Using the Management API
Use the /v1/stage endpoint of the Management API to manage files and folders in a Stage.
For example, the following API call lists all the files and folders in the Stage attached to the workspace group with the specified ID:
curl -X 'GET' \
  'https://api.singlestore.com/v1/stage/68af2f46-0000-1000-9000-3f6f5365d878/fs/' \
  -H 'accept: application/json'
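The endpoint path follows the pattern /v1/stage/{workspaceGroupID}/fs/{path}. As a rough sketch (the helper function below is an illustration, not part of the API), a URL for a given path in a Stage can be built like this:

```python
from urllib.parse import quote

# Base URL of the Management API's Stage endpoint, as shown above.
API_BASE = "https://api.singlestore.com/v1/stage"

def stage_fs_url(workspace_group_id: str, path: str = "/") -> str:
    """Build the URL for a path in the Stage of a workspace group.

    `path` is the location inside the Stage; "/" lists the root.
    """
    # Percent-encode the path but keep "/" separators intact.
    return f"{API_BASE}/{workspace_group_id}/fs{quote(path)}"

print(stage_fs_url("68af2f46-0000-1000-9000-3f6f5365d878"))
```

Requests to the Management API are authenticated with an API key, so the actual call would also carry the appropriate authorization header.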
Using Notebooks or SingleStore Python Client
The SingleStore Python SDK supports the Stage object, which can be used to manage files and folders in a Stage.
For example, the following code snippet uploads a file named data.csv to the Stage attached to a workspace group:
from singlestoredb import manage_workspaces
mgr = manage_workspaces('access_key_token_for_the_Management_API')
wg = mgr.workspace_groups['examplewsg']
wg.stage.upload_file('/filepath/data.csv', '/data.csv')
Ingest a File using Stage
Files can be ingested into a database from a Stage using a pipeline.
Using the LOAD DATA command
Create a table with a structure that can store the data from the file. Use the following LOAD DATA syntax to load a file from a Stage:
LOAD DATA STAGE 'path_in_stage/filename.extension'
INTO TABLE <table_name>
[FORMAT {JSON | AVRO | CSV}];
Refer to LOAD DATA for the complete syntax and related information.
The following example loads data from a CSV file from a Stage:
LOAD DATA STAGE 'simple.csv'
INTO TABLE simple_data
FIELDS TERMINATED BY ','
IGNORE 1 LINES;
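The target table must exist before the load runs. A minimal definition for the example above might look like the following (the column names and types are assumptions for illustration, not taken from this page):

```sql
-- Hypothetical schema for simple.csv; adjust the columns
-- to match the fields actually present in the file.
CREATE TABLE simple_data (
    id INT,
    name VARCHAR(100),
    amount DECIMAL(10, 2)
);
```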
Note
The LOAD DATA STAGE command is not supported in the Shared Edition.
Using Pipelines
Create a table with a structure that can store the data from the file. Use the following CREATE PIPELINE syntax to load a file from a Stage:
CREATE PIPELINE <pipeline_name>
AS LOAD DATA STAGE <path_in_Stage/filename> { <pipeline_options> }
INTO TABLE <table_name>
{ <data_format_options> }
Once the table and pipeline are created, start the pipeline.
Here's a sample CREATE PIPELINE statement that loads data from a CSV file:
CREATE PIPELINE dbTest.plTest
AS LOAD DATA STAGE 'data.csv'
BATCH_INTERVAL 2500
SKIP DUPLICATE KEY ERRORS
INTO TABLE t1
FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\n' STARTING BY ''
FORMAT CSV;
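Creating a pipeline does not start it. Once the table and pipeline exist, start the pipeline and verify its state, for example:

```sql
-- Begin ingesting from the Stage file.
START PIPELINE dbTest.plTest;

-- Confirm the pipeline is running.
SELECT PIPELINE_NAME, STATE
FROM information_schema.PIPELINES
WHERE PIPELINE_NAME = 'plTest';
```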
Export SQL Results to a Stage
SQL results may be exported to a Stage as follows:
SELECT * FROM <table_name> GROUP BY 1 INTO STAGE '<table_results.csv>'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
Use the GROUP BY 1 clause to avoid generating a separate output file from each leaf node.
Supported Files
The Stage storage service supports the following file formats:
- CSV
- SQL
- JSON
- Parquet
- GZ
- Zstd
- Snappy
Storage Limits
Each Stage can have up to 10GB of storage for free.