SingleStore XL Ingest

Overview

SingleStore XL Ingest ("XL Ingest"), a component of SingleStore Flow, is companion software to SingleStore Ingest ("Ingest"). When Ingest includes large tables (greater than 10 GB), a full extract may not be feasible because of long processing times. The longer an extract runs, the more likely it is to encounter issues, and if a problem occurs, the entire initial extract must be rerun. XL Ingest is therefore essential for working with large tables.

XL Ingest handles the initial transfer of large tables by dividing them into smaller logical partitions and then transferring multiple partitions from the source to the target in parallel. This keeps the transfer time reasonable. Use XL Ingest to load the logical partitions of large tables into SingleStore for storage and processing.
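The parallel transfer described above can be pictured with a minimal Python sketch. This is an illustration of the general idea only, not how XL Ingest is implemented: `transfer_slice` and `transfer_in_parallel` are hypothetical names, and the "transfer" simply counts rows instead of copying them to a target.

```python
from concurrent.futures import ThreadPoolExecutor


def transfer_slice(slice_id, rows):
    # Hypothetical stand-in for copying one logical partition to the target.
    # A real transfer would write the rows to the destination database.
    return slice_id, len(rows)


def transfer_in_parallel(slices, max_workers=4):
    """Submit every slice to a worker pool and collect rows moved per slice."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(transfer_slice, slice_id, rows)
            for slice_id, rows in slices.items()
        ]
        # result() blocks until the corresponding slice has finished.
        return dict(future.result() for future in futures)


# Illustrative data: three logical partitions of one large table.
slices = {"A": [1, 2, 3], "B": [4, 5], "C": [6]}
result = transfer_in_parallel(slices)
print(result)
```

Each partition is an independent unit of work here, which is what allows several of them to be in flight at the same time.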

To transfer large tables from the source database to SingleStore, perform the steps outlined in Select Tables. Complete these steps before triggering an XL Ingest job to prevent data loss during the transition.

Note: Pausing updates on the source database is not required during this process. Both Ingest and XL Ingest can operate independently and concurrently without disrupting the source data.

Installation

For details on how to install XL Ingest and other Flow components, refer to Install SingleStore Flow.

Select Tables

To ingest large tables from the source database to SingleStore, perform the following steps in Ingest before triggering a job in XL Ingest.

  1. Navigate to Dashboard > Tables and select the gear icon.

  2. Define a primary key (Pkey) and any necessary partitions for the table.

  3. Enable Skip Initial Extract to bypass the initial extract and proceed directly to the delta load.

  4. Select Apply to save the changes.

  5. Navigate to Dashboard > Operations and disable the Ingest scheduler to ensure that all tables are moved to the destination at the same time.

  6. Initiate Full Extract to trigger the initial bulk load for all the selected tables, except for those tables marked as Skip Initial Extract.

  7. Initiate Sync New Tables to trigger the initial bulk load for tables marked as Redo Initial Extract and for tables newly added to an ongoing replication. This captures the watermark for CDC and creates the table in the destination database (SingleStore).

  8. Enable the Ingest scheduler after transferring tables using XL Ingest.

Note: After marking tables with Skip Initial Extract, the next scheduled delta run automatically captures CDC for all tables, including those loaded with XL Ingest. XL Ingest prevents duplication during the CDC load by using a watermark to track changes.
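One way to picture watermark-based deduplication is the following Python sketch. It assumes, purely for illustration, that every change event carries a monotonically increasing `position` and that the bulk load already contains everything up to the watermark; the function name and event fields are hypothetical and do not describe XL Ingest internals.

```python
def changes_after_watermark(changes, watermark):
    """Keep only change events recorded after the watermark.

    Events at or before the watermark are assumed to be reflected in the
    bulk-loaded data already, so applying them again would duplicate rows.
    """
    return [change for change in changes if change["position"] > watermark]


# Illustrative change log; positions and operations are made up.
changes = [
    {"position": 98, "op": "insert", "id": 1},
    {"position": 101, "op": "update", "id": 1},
    {"position": 105, "op": "insert", "id": 2},
]

applied = changes_after_watermark(changes, watermark=100)
print([change["position"] for change in applied])
```

Under these assumptions, the event at position 98 is skipped because the bulk load already covers it, while the two later events are applied by the delta run.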

Split Table into Slices

Large tables must be divided into notional slices based on the value of a single slice column, for example, the primary key. For automatic slice determination, XL Ingest uses parameters such as the number of slices needed and the number of leading characters of the slice column value to use.

For Large Tables

XL Ingest automatically determines the slices based on the specified parameters. For example, a slice column that contains names can be divided by the first 3 characters of each value. Alternatively, you can manually enter the slice values instead of using auto-slice.
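The prefix-based example above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the auto-slice algorithm itself: `slice_by_prefix` is a hypothetical helper that groups slice column values by their leading characters and counts the rows that would fall into each slice.

```python
from collections import Counter


def slice_by_prefix(values, prefix_len):
    """Count rows per slice, where a slice is the first prefix_len characters."""
    return Counter(str(value)[:prefix_len] for value in values)


# Illustrative slice column values.
names = ["Alice", "Alicia", "Bob", "Bobby", "Carol"]
slices = slice_by_prefix(names, 3)

for prefix, row_count in sorted(slices.items()):
    print(prefix, row_count)
```

A shorter prefix yields fewer, larger slices and a longer prefix yields more, smaller ones, so the prefix length is one lever for balancing the number of slices against their size.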

For Partitioned Tables

Counting records is not necessary as XL Ingest automatically determines the list of partitions to create, with each partition treated as a slice.

For Smaller Tables

Slicing may not be necessary. The entire table can be processed as a single slice.

Last modified: January 24, 2025
