Load Data with Pipelines
This part of the guide shows how to load the TPC-H data from a public S3 bucket into your SingleStore database using Pipelines.
- Create the pipelines by copying the following block. Again, make sure you select all the queries in SQL Editor before clicking Run.

    use tpch;

    CREATE OR REPLACE PIPELINE tpch_100_lineitem
    AS LOAD DATA S3 'memsql-tpch-dataset/sf_100/lineitem/'
    config '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE lineitem
    FIELDS TERMINATED BY '|'
    LINES TERMINATED BY '|\n';

    CREATE OR REPLACE PIPELINE tpch_100_customer
    AS LOAD DATA S3 'memsql-tpch-dataset/sf_100/customer/'
    config '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE customer
    FIELDS TERMINATED BY '|'
    LINES TERMINATED BY '|\n';

    CREATE OR REPLACE PIPELINE tpch_100_nation
    AS LOAD DATA S3 'memsql-tpch-dataset/sf_100/nation/'
    config '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE nation
    FIELDS TERMINATED BY '|'
    LINES TERMINATED BY '|\n';

    CREATE OR REPLACE PIPELINE tpch_100_orders
    AS LOAD DATA S3 'memsql-tpch-dataset/sf_100/orders/'
    config '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE orders
    FIELDS TERMINATED BY '|'
    LINES TERMINATED BY '|\n';

    CREATE OR REPLACE PIPELINE tpch_100_part
    AS LOAD DATA S3 'memsql-tpch-dataset/sf_100/part/'
    config '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE part
    FIELDS TERMINATED BY '|'
    LINES TERMINATED BY '|\n';

    CREATE OR REPLACE PIPELINE tpch_100_partsupp
    AS LOAD DATA S3 'memsql-tpch-dataset/sf_100/partsupp/'
    config '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE partsupp
    FIELDS TERMINATED BY '|'
    LINES TERMINATED BY '|\n';

    CREATE OR REPLACE PIPELINE tpch_100_region
    AS LOAD DATA S3 'memsql-tpch-dataset/sf_100/region/'
    config '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE region
    FIELDS TERMINATED BY '|'
    LINES TERMINATED BY '|\n';

    CREATE OR REPLACE PIPELINE tpch_100_supplier
    AS LOAD DATA S3 'memsql-tpch-dataset/sf_100/supplier/'
    config '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE supplier
    FIELDS TERMINATED BY '|'
    LINES TERMINATED BY '|\n';

- Start the pipelines by running the following queries.

    use tpch;
    START ALL PIPELINES;

Once you see Success messages for all the Pipelines created, SingleStore will begin pulling data from the S3 datasource.
Note
The SQL Editor only runs the queries you have selected, so make sure you have them all selected before clicking Run.
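
Once the pipelines have started, you may want to confirm that they are running and see how far the load has progressed. The queries below are a minimal sketch of one way to do that. They assume the pipelines were created in the tpch database as shown above, and they rely on the SHOW PIPELINES command and the information_schema.PIPELINES_FILES view. The column aliases are illustrative, and the exact state values you see may vary with your SingleStore version.

```sql
USE tpch;

-- List the pipelines in the current database along with their state.
SHOW PIPELINES;

-- Count the source files for each pipeline, grouped by load state,
-- to get a rough picture of how much of each table has been ingested.
SELECT pipeline_name, file_state, COUNT(*) AS files
FROM information_schema.PIPELINES_FILES
WHERE database_name = 'tpch'
GROUP BY pipeline_name, file_state
ORDER BY pipeline_name, file_state;
```

As with the earlier steps, select all of these queries in the SQL Editor before clicking Run. Re-running the second query from time to time should show files moving out of the unloaded state as the load proceeds.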
Last modified: September 27, 2023