Load Data with Pipelines

This part of the tutorial shows how to ingest MarTech data from a public AWS S3 bucket into the SingleStore database using pipelines.

Note

The SQL Editor runs only the queries that you select, so select all of them before selecting Run.

  1. Run the following SQL commands to create the pipelines:

    USE martech;
    CREATE OR REPLACE PIPELINE cities
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/cities.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE cities;
    CREATE OR REPLACE PIPELINE locations
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/locations.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE locations;
    CREATE OR REPLACE PIPELINE notifications
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/notifications.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE notifications;
    CREATE OR REPLACE PIPELINE offers
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/offers.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE offers;
    CREATE OR REPLACE PIPELINE purchases
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/purchases.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE purchases;
    CREATE OR REPLACE PIPELINE requests
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/requests.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE requests;
    CREATE OR REPLACE PIPELINE segments
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/segments.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE segments;
    CREATE OR REPLACE PIPELINE sessions
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/sessions.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE sessions;
    CREATE OR REPLACE PIPELINE subscriber_segments
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/subscriber_segments.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE subscriber_segments;
    CREATE OR REPLACE PIPELINE subscribers
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/subscribers.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE subscribers;
    CREATE OR REPLACE PIPELINE subscribers_last_notification
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/subscribers_last_notification.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE subscribers_last_notification;
    CREATE OR REPLACE PIPELINE worldcities
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/worldcities.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE worldcities;

  2. Run the following SQL commands to start the pipelines:

    USE martech;
    START ALL PIPELINES;

Once the Success message is returned for all the created pipelines, SingleStore starts ingesting the data from the S3 bucket.
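
To confirm that the pipelines are running and that the files are being loaded, you can query the pipeline metadata. The following is a minimal sketch, not part of the original tutorial steps: it assumes the `information_schema.PIPELINES_FILES` and `information_schema.PIPELINES_ERRORS` views and the column names shown here are available in your SingleStore version, and it uses the `cities` table only as an example for a row-count check.

```sql
USE martech;

-- List the pipelines in the current database together with their state.
SHOW PIPELINES;

-- Show the load status of each source file, per pipeline.
-- Assumption: FILE_STATE reports values such as Loaded, Unloaded, or Skipped.
SELECT PIPELINE_NAME, FILE_NAME, FILE_STATE
FROM information_schema.PIPELINES_FILES
WHERE DATABASE_NAME = 'martech'
ORDER BY PIPELINE_NAME;

-- Surface any errors the pipelines have recorded.
SELECT PIPELINE_NAME, ERROR_MESSAGE
FROM information_schema.PIPELINES_ERRORS
WHERE DATABASE_NAME = 'martech';

-- Spot-check one target table; repeat for the other tables as needed.
SELECT COUNT(*) FROM cities;
```

If a file stays unloaded or the errors view returns rows, review the error message for that pipeline before rerunning the step.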

Last modified: October 10, 2024
