Stage

Stage is a storage service that helps you organize and manage local files for ingestion into your SingleStore Helios database(s). Each workspace group has a Stage where you can create folders and upload files. You can manage the files and folders in a Stage using the Cloud Portal or the Management API. You can also export query results to files in a Stage.

Note

To use Stage, the workspace group must be running SingleStore version 8.1 or later.

Upload a File

To upload a file to a Stage:

  1. Select <your_workspace_group> -> Stages -> Upload New File.

  2. In the Upload File dialog, either drag and drop a file onto the dialog or select Browse File.

  3. Once the file is loaded, select Upload File.

Create a Folder

To create a folder in a Stage:

  1. Select <your_workspace_group> -> Stages.

  2. Select the dropdown next to the Upload New File button, and select Create New Folder from the list.

  3. In the Create New Folder dialog, enter a name for the folder, and select Create New Folder.

Ingest a File Using Stage

Files can be ingested into a database from a Stage using the Cloud Portal or a pipeline.

Using the Cloud Portal

  1. Under Stages, select the three dots in the Actions column for the file to load, and then select Load To Database.

  2. In the Load Data dialog, from the Choose Workspace list, select a workspace.

  3. From the Choose a database list, select a database.

  4. In the Table box, select an existing table or enter a new table name.

  5. Select the Generate Notebook button. A notebook is created that shows a breakdown of all the queries it runs to load the data.

    You may edit the queries in the notebook to use different column names, column types, and so on. An illustrative sketch of the kind of queries a generated notebook contains follows these steps.

  6. Select Run > Run All Cells.

  7. Run the Check that the data has loaded cell to verify the loaded data.
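
The queries the notebook generates depend on the file and the options you selected. As a rough illustration only, with placeholder table, column, and pipeline names rather than anything the Cloud Portal necessarily produces, a notebook that loads a CSV file runs statements along these lines:

-- Placeholder table; the actual notebook derives the schema from your file.
CREATE TABLE IF NOT EXISTS books (
    id INT,
    title VARCHAR(255),
    published_year INT
);

-- Create a pipeline over the Stage file and run it to completion.
CREATE PIPELINE load_books
AS LOAD DATA STAGE 'books.csv'
INTO TABLE books
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
FORMAT CSV;

START PIPELINE load_books FOREGROUND;

-- Check that the data has loaded.
SELECT COUNT(*) FROM books;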

Using Pipelines

Create a table with a structure that can store the data from the file. Use the following CREATE PIPELINE syntax to load a file from a Stage:

CREATE PIPELINE <pipeline_name>
AS LOAD DATA STAGE '<path_in_Stage/filename>' { <pipeline_options> }
INTO TABLE <table_name>
{ <data_format_options> };

Once the table and pipeline are created, start the pipeline with a START PIPELINE statement. Refer to CREATE PIPELINE for the complete syntax and related information.

Here's a sample CREATE PIPELINE statement that loads data from a CSV file:

CREATE PIPELINE dbTest.plTest
AS LOAD DATA STAGE 'data.csv'
BATCH_INTERVAL 2500
SKIP DUPLICATE KEY ERRORS
INTO TABLE t1
FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\n' STARTING BY ''
FORMAT CSV;
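
This pipeline assumes the target table dbTest.t1 already exists and that the pipeline is started after it is created. A minimal companion sketch, assuming data.csv holds three comma-separated columns (the column names and types here are illustrative only):

-- Illustrative target table; adjust the columns to match data.csv.
-- The PRIMARY KEY gives SKIP DUPLICATE KEY ERRORS something to act on.
CREATE TABLE dbTest.t1 (
    id INT PRIMARY KEY,
    name VARCHAR(255),
    amount DECIMAL(10,2)
);

-- Begin ingesting data.csv from the Stage into dbTest.t1.
START PIPELINE dbTest.plTest;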

Export SQL Results to a Stage

SQL results may be exported to a Stage as follows:

SELECT * FROM <table_name> GROUP BY 1 INTO STAGE '<table_results.csv>';

Use the GROUP BY 1 clause to avoid getting multiple output files, one from each leaf node.
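
For example, assuming a table named orders (a placeholder name), the following writes the query result to a file named order_results.csv in the Stage:

-- Placeholder table name; produces a single CSV file in the Stage.
SELECT * FROM orders GROUP BY 1 INTO STAGE 'order_results.csv';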

Supported Files

The Stage storage service supports the following file formats:

CSV

SQL

JSON

Parquet

GZ

Zstd

Snappy
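
Because compressed formats such as GZ are supported, a compressed source file can be referenced in a pipeline by its path in the Stage. A minimal sketch, assuming a gzip-compressed CSV named data.csv.gz in the Stage and a target table events (both placeholder names):

-- Placeholder names; the compressed file is referenced like any other Stage file.
CREATE PIPELINE load_events
AS LOAD DATA STAGE 'data.csv.gz'
INTO TABLE events
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
FORMAT CSV;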

Storage Limits

Each Stage includes up to 10 GB of free storage. Individual files must not exceed 5 GB in size.

Last modified: February 1, 2024