Stage
Stage is a storage service that helps you organize and manage local files for ingestion into your SingleStore Helios database(s).
Note
The workspace group must be running SingleStore version 8.
Manage a Stage
You can manage files and folders in a Stage using any of the following:
- Cloud Portal UI
- Management API
- Notebooks
- SingleStore Python Client
Using the Cloud Portal
Upload a File
To upload a file in a Stage:
- Select Deployments -> <your_workspace_group> -> Stage -> Upload File(s).
- In the Upload File(s) dialog, either drag and drop a file to the dialog or select Browse Files.
- Once the file is loaded, select Upload File.
Create a Folder
- Select Deployments -> <your_workspace_group> -> Stage.
- Select the Create Folder button.
- In the Create Folder dialog, enter a name for the folder, and select Create Folder.
Using the Management API
Use the Stage path (/v1/stage endpoint) in the Management API to manage files and folders in a Stage.
For example, the following API call lists all the files and folders in the Stage attached to the workspace group with the specified ID:
curl -X 'GET' \
  'https://api.singlestore.com/v1/stage/68af2f46-0000-1000-9000-3f6f5365d878/fs/' \
  -H 'accept: application/json'
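The same listing request can be issued from Python. This is a minimal sketch using only the standard library. `build_list_request` and `list_stage` are hypothetical helper names, and passing the API key as an `Authorization: Bearer` header is an assumption, since the curl call above does not show authentication.

```python
import json
import urllib.request

# Base path taken from the curl example above.
STAGE_API = "https://api.singlestore.com/v1/stage"


def build_list_request(workspace_group_id, api_key=None):
    """Build (but do not send) a GET request that lists the Stage root."""
    url = f"{STAGE_API}/{workspace_group_id}/fs/"
    headers = {"accept": "application/json"}
    if api_key is not None:
        # Assumption: the Management API key is supplied as a bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(url, headers=headers, method="GET")


def list_stage(workspace_group_id, api_key):
    """Send the request and return the decoded JSON response."""
    request = build_list_request(workspace_group_id, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers without touching the network.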
Using Notebooks or SingleStore Python Client
The SingleStore Python SDK supports the Stage object, which can be used to manage files and folders in a Stage.
For example, the following code snippet uploads a local file named data.csv to the root of the Stage:
from singlestoredb import manage_workspaces

mgr = manage_workspaces('access_key_token_for_the_Management_API')
wg = mgr.workspace_groups['examplewsg']
wg.stage.upload_file('/filepath/data.csv', '/data.csv')
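To upload several files at once, the `upload_file` call shown above can be wrapped in a loop. The following is a sketch, not part of the SDK: `upload_directory` is a hypothetical helper that relies only on `stage.upload_file(local_path, stage_path)` and assumes any folder named in `stage_prefix` already exists in the Stage.

```python
from pathlib import Path


def upload_directory(stage, local_dir, stage_prefix="/"):
    """Upload every regular file directly inside local_dir to the Stage.

    `stage` is any object with an upload_file(local_path, stage_path)
    method, such as wg.stage from the snippet above.
    Returns the list of Stage paths that were written.
    """
    uploaded = []
    for local_path in sorted(Path(local_dir).iterdir()):
        if not local_path.is_file():
            continue  # subdirectories are skipped in this sketch
        stage_path = stage_prefix.rstrip("/") + "/" + local_path.name
        stage.upload_file(str(local_path), stage_path)
        uploaded.append(stage_path)
    return uploaded
```

With the objects from the snippet above, a call might look like `upload_directory(wg.stage, '/filepath')`.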
Ingest a File using Stage
Files can be ingested into a database from a Stage using the Cloud Portal or a pipeline.
Using the Cloud Portal
- Under Stage, select the ellipsis (three dots) in the Actions column of the file to load, and then select Load To Database.
- In the Load Data dialog, from the Choose Workspace list, select a workspace.
- From the Choose a database list, select a database.
- In the Table box, select an existing table or enter a new table name.
- Select the Generate Notebook button. A notebook is created that breaks down the queries used to load the file. You can edit the queries in the notebook, for example to change column names or column types.
- Select Run > Run All Cells.
- Run the Check that the data has loaded cell to verify the loaded data.
Using the LOAD DATA command
Create a table with a structure that can store data from the file. Then use the following LOAD DATA syntax to load a file from a Stage:
LOAD DATA STAGE 'path_in_stage/filename.extension'
INTO TABLE <table_name>
[FORMAT {JSON | AVRO | CSV}];
Refer to LOAD DATA for the complete syntax and related information.
The following example loads data from a CSV file from a Stage:
LOAD DATA STAGE 'simple.csv'
INTO TABLE simple_data
FIELDS TERMINATED BY ','
IGNORE 1 LINES;
Note
The LOAD DATA STAGE command is not supported in the Shared Edition.
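When the load is driven from a script, the CSV statement shown above can be assembled programmatically. The sketch below is illustrative only: `load_data_stage_sql` is a hypothetical helper, it covers just the CSV clauses used in the example, and it rejects quotes and backslashes rather than attempting to escape them.

```python
import re

# Conservative pattern for unquoted table names (an assumption of this sketch).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def load_data_stage_sql(stage_path, table, delimiter=",", ignore_lines=0):
    """Return a LOAD DATA STAGE statement for a CSV file in a Stage."""
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    for value in (stage_path, delimiter):
        if "'" in value or "\\" in value:
            raise ValueError("quotes and backslashes are not handled here")
    lines = [
        f"LOAD DATA STAGE '{stage_path}'",
        f"INTO TABLE {table}",
        f"FIELDS TERMINATED BY '{delimiter}'",
    ]
    if ignore_lines:
        # Skip header rows, as IGNORE 1 LINES does in the example above.
        lines.append(f"IGNORE {int(ignore_lines)} LINES")
    return "\n".join(lines) + ";"
```

The returned string can then be passed to the `execute` method of a database cursor, for example `cursor.execute(load_data_stage_sql('simple.csv', 'simple_data', ignore_lines=1))`.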
Using Pipelines
Create a table with a structure that can store the data from the file. Then use the following CREATE PIPELINE syntax to load a file from a Stage:
CREATE PIPELINE <pipeline_name>
AS LOAD DATA STAGE <path_in_Stage/filename> { <pipeline_options> }
INTO TABLE <table_name>
{ <data_format_options> }
Once the table and pipeline are created, start the pipeline.
The following sample CREATE PIPELINE statement loads data from a CSV file:
CREATE PIPELINE dbTest.plTest
AS LOAD DATA STAGE 'data.csv'
BATCH_INTERVAL 2500
SKIP DUPLICATE KEY ERRORS
INTO TABLE t1
FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\n' STARTING BY ''
FORMAT CSV;
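The create-then-start order described above can be scripted. This is a rough sketch under two assumptions: `cursor` is a DB-API style cursor with an `execute` method, connected to the target workspace, and the target table already exists. `create_and_start_pipeline` is a hypothetical helper name.

```python
import re

# Accepts "name" or "database.name", as in dbTest.plTest above.
_PIPELINE_NAME = re.compile(
    r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$"
)


def create_and_start_pipeline(cursor, create_pipeline_sql, pipeline_name):
    """Run a CREATE PIPELINE statement, then start that pipeline."""
    if not _PIPELINE_NAME.match(pipeline_name):
        raise ValueError(f"unexpected pipeline name: {pipeline_name!r}")
    # Step 1: create the pipeline from the full statement text.
    cursor.execute(create_pipeline_sql)
    # Step 2: the pipeline does not ingest anything until it is started.
    cursor.execute(f"START PIPELINE {pipeline_name}")
```

For the sample statement, `pipeline_name` would be `dbTest.plTest` and `create_pipeline_sql` would hold the statement text shown above.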
Export SQL Results to a Stage
SQL results may be exported to a Stage as follows:
SELECT * FROM <table_name> GROUP BY 1 INTO STAGE '<table_results.csv>';
Use the GROUP BY 1 clause to avoid getting multiple files from each leaf node.
Supported Files
The Stage storage service supports the following file formats:
- CSV
- SQL
- JSON
- Parquet
- GZ
- Zstd
- Snappy
Storage Limits
Each Stage can have up to 10 GB of storage for free.
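Before a large upload, it can help to check the local files against this limit. The sketch below is only an approximation: `fits_free_tier` is a hypothetical helper, it assumes "10 GB" means decimal gigabytes, and it only knows about current Stage usage if the caller supplies it.

```python
from pathlib import Path

# Assumption: the free allowance is 10 decimal gigabytes.
FREE_LIMIT_BYTES = 10 * 10**9


def fits_free_tier(paths, already_used_bytes=0, limit_bytes=FREE_LIMIT_BYTES):
    """Return True if the local files plus existing usage stay within the limit."""
    total = already_used_bytes + sum(Path(p).stat().st_size for p in paths)
    return total <= limit_bytes
```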
Last modified: November 12, 2024