Load Data with Pipelines

This part of the tutorial shows how to ingest MarTech data from a public AWS S3 bucket into the SingleStore database using pipelines.

Note

The SQL Editor runs only the queries that you select, so make sure that all of them are selected before you select Run.

  1. Run the following SQL commands to create the pipelines:

    USE martech;
    CREATE OR REPLACE PIPELINE cities
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/cities.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE cities;
    CREATE OR REPLACE PIPELINE locations
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/locations.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE locations;
    CREATE OR REPLACE PIPELINE notifications
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/notifications.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE notifications;
    CREATE OR REPLACE PIPELINE offers
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/offers.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE offers;
    CREATE OR REPLACE PIPELINE purchases
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/purchases.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE purchases;
    CREATE OR REPLACE PIPELINE requests
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/requests.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE requests;
    CREATE OR REPLACE PIPELINE segments
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/segments.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE segments;
    CREATE OR REPLACE PIPELINE sessions
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/sessions.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE sessions;
    CREATE OR REPLACE PIPELINE subscriber_segments
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/subscriber_segments.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE subscriber_segments;
    CREATE OR REPLACE PIPELINE subscribers
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/subscribers.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE subscribers;
    CREATE OR REPLACE PIPELINE subscribers_last_notification
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/subscribers_last_notification.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE subscribers_last_notification;
    CREATE OR REPLACE PIPELINE worldcities
    AS LOAD DATA S3 's3://singlestore-docs-example-datasets/martech/worldcities.csv'
    CONFIG '{"region":"us-east-1"}'
    SKIP DUPLICATE KEY ERRORS
    INTO TABLE worldcities;

  2. Run the following SQL commands to start the pipelines:

    USE martech;
    START ALL PIPELINES;

Once a Success message is returned for every pipeline, SingleStore starts ingesting the data from the S3 bucket.
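
To confirm that ingestion is progressing, you can query the pipeline metadata. The following is a minimal sketch rather than part of the original tutorial; it assumes the `information_schema.PIPELINES_FILES` and `information_schema.PIPELINES_ERRORS` views and the column names shown are available in your SingleStore version, and it uses `cities` only as a sample table for the spot check.

```sql
USE martech;

-- List the pipelines in the current database along with their state.
SHOW PIPELINES;

-- Count the source files for each pipeline, grouped by load state.
-- Assumption: the view exposes database_name, pipeline_name, and file_state.
SELECT pipeline_name, file_state, COUNT(*) AS file_count
FROM information_schema.PIPELINES_FILES
WHERE database_name = 'martech'
GROUP BY pipeline_name, file_state
ORDER BY pipeline_name;

-- Show any errors that the pipelines have recorded.
-- Assumption: the view exposes database_name, pipeline_name, and error_message.
SELECT pipeline_name, error_message
FROM information_schema.PIPELINES_ERRORS
WHERE database_name = 'martech';

-- Spot-check one target table; the count should grow while the pipeline loads data.
SELECT COUNT(*) AS row_count FROM cities;
```

If the error query returns no rows and the row counts stop changing, the pipelines have most likely finished loading the files that were available in the bucket.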

Last modified: October 10, 2024
