Connecting StreamSets to SingleStore via Fast Loader

StreamSets can be connected to SingleStore via Fast Loader by creating different types of pipelines. This document provides the connection details for the following pipelines:

  • Hadoop

  • Kafka

  • Filesystem

Perform the following steps first, then follow the specific instructions for your respective pipeline below.

  1. Open the following URL: http://<IP address of the server running the StreamSets service>:18630/

  2. Enter the username and password to log in. The default credentials are admin/admin.

  3. On the Get Started page, click + Create New Pipeline to create a new Pipeline.

  4. Add a title and description for the new pipeline and click Save.

Pipeline from Hadoop Source to SingleStore Using the Fast Loader

  1. Under Select Origin on the right-hand side panel, select Hadoop FS Standalone.

  2. Provide the Origin Name and select Send to Error for the On Record Error field.

  3. On the Connection tab, enter the Hadoop file system URI in the following format: hdfs://<IP address of the server running Hadoop>:9000/

  4. Click on the Fields tab and provide the Hadoop file system details.

  5. In the Data Source tab, provide the data source details.

  6. From the right pane, select SingleStore Fast Loader as the destination type and configure it as follows:

    General

    • Name: Name of destination

    • On Record Error: Send to Error

    JDBC

    • JDBC Connection string

    • Schema Name and Table Name

    NOTE: Map incoming fields to columns for all columns present in the target table.

  7. Connect the Hadoop FS Standalone origin to the SingleStore Fast Loader destination.

  8. Start the pipeline to begin processing data.

You are now ready to move the data.
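For reference, the JDBC connection string for the SingleStore Fast Loader destination follows the MySQL-wire format, since SingleStore speaks the MySQL protocol. The sketch below builds one from placeholder values; the host and database names are illustrative only, and depending on the driver in use the scheme may be jdbc:mysql:// or jdbc:singlestore://.

```python
# Build a JDBC connection string for the SingleStore Fast Loader
# destination. Host, port, and database are placeholders -- substitute
# your own values. SingleStore's default client port is 3306.
def jdbc_connection_string(host: str, database: str, port: int = 3306) -> str:
    return f"jdbc:mysql://{host}:{port}/{database}"

print(jdbc_connection_string("svchost", "mydb"))
# jdbc:mysql://svchost:3306/mydb
```

Paste the resulting string into the JDBC Connection String field of the destination, together with the Schema Name and Table Name.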

Pipeline from Kafka Source to SingleStore Using the Fast Loader

  1. Under Select Origin on the right-hand side panel, select Kafka Consumer as the origin type and configure it as follows:

    General

    • Name: Name of Origin

    • On Record Error: Send to Error

    Kafka

    • Broker URI: <IP address of the machine running Kafka>:9092

    • Zookeeper URI: <IP address of the machine running ZooKeeper>:2181

    • Consumer Group: <Name of the consumer group>

    • Topic: <Name of the Kafka topic>

    Data Format

    • Data format: Delimited

    • Delimiter Format Type: Default CSV (ignore empty lines)

    • Header line: No Header Line

  2. From the right pane, select SingleStore Fast Loader as the destination type and configure it as follows:

    General

    • Name: Name of destination

    • On Record Error: Send to Error

    JDBC

    • JDBC Connection string

    • Schema Name and Table Name

    • Field to Column Mapping: <Map each field to the corresponding column of the target table>

    • Default Operation: Insert

    Credentials

    • Username: <DB user>

    • Password: <DB password>

    NOTE: Map incoming fields to columns for all columns present in the target table.

  3. Connect the Kafka Consumer origin to the SingleStore Fast Loader destination.

  4. Start the pipeline by clicking the Start option above the left pane.

You are now ready to move the data.
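With Data Format set to Delimited, Default CSV, and No Header Line, each Kafka message is expected to be a single CSV record whose positional fields are later mapped to table columns in the Fast Loader destination. A minimal sketch of that payload shape, with illustrative field values only:

```python
import csv
import io

# Serialize one record as a header-less CSV line, the message shape
# the Kafka Consumer origin parses with Data Format = Delimited,
# Delimiter Format Type = Default CSV, Header Line = No Header Line.
def to_csv_message(fields) -> str:
    buf = io.StringIO()
    csv.writer(buf).writerow(fields)
    # Drop the trailing line terminator added by csv.writer.
    return buf.getvalue().rstrip("\r\n")

print(to_csv_message([1, "alice", "2022-06-22"]))
# 1,alice,2022-06-22
```

Because there is no header, the order of fields in each message must match the Field to Column Mapping configured on the destination.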

Pipeline from Filesystem Source to SingleStore Using the Fast Loader

  1. Under Select Origin on the right-hand side panel, select Directory as the origin type and configure it as follows:

    General

    • Name: Name of Origin

    • On Record Error: Send to Error

    Files

    • File Directory: <Directory path containing the files>

    • File Name Pattern: *.csv

    Data Format

    • Data format: Delimited

    • Delimiter Format Type: Default CSV (ignore empty lines)

    • Header line: With Header Line

  2. From the right pane, select SingleStore Fast Loader as the destination type and configure it as follows:

    General

    • Name: Name of destination

    • On Record Error: Send to Error

    JDBC

    • JDBC Connection string

    • Schema Name and Table Name

    • Field to Column Mapping: <Map each field to the corresponding column of the target table>

    • Default Operation: Insert

    Credentials

    • Username: <DB user>

    • Password: <DB password>

    NOTE: Map incoming fields to columns for all columns present in the target table.

  3. Connect the Directory origin to the SingleStore Fast Loader destination.

  4. Start the pipeline by clicking the Start option above the left pane.

You are now ready to move the data.
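With Header Line set to With Header Line, the Directory origin reads field names from the first row of each file matching *.csv, and those names feed the Field to Column Mapping on the destination. The sketch below writes and reads back such a file; the path, column names, and rows are illustrative only.

```python
import csv
import os
import tempfile

# Write a sample CSV with a header row, the file shape the Directory
# origin picks up with File Name Pattern = *.csv and
# Header Line = With Header Line. Values are placeholders.
rows = [
    {"id": "1", "name": "alice"},
    {"id": "2", "name": "bob"},
]
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "name"])
    writer.writeheader()  # header row supplies the field names
    writer.writerows(rows)

# Reading it back shows the records keyed by header-derived field names.
with open(path, newline="") as f:
    print(list(csv.DictReader(f)))
```

Each header name becomes a field that can be mapped to a target-table column of the same or a different name in the Fast Loader destination.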

Last modified: June 22, 2022
