Configure Table Slicing

Slicing divides the source table into multiple independent chunks of data that can be transferred to the destination in parallel. Instead of copying the entire table in a single thread, XL Ingest splits the work across multiple threads, with each thread responsible for a portion of the data. When configuring slicing, select a column to slice on, typically the primary key or an indexed column with evenly-distributed values. XL Ingest divides the data into slices based on value ranges in the slice column. Each range becomes a slice that transfers independently.

XL Ingest provides two approaches to slicing: manual and automatic.

Manual Slicing

Manual slicing provides full control over how the data is divided by allowing the specification of exact values for slice boundaries as a space-separated list. Choose slice boundaries such that each slice is under 10 GB to prevent long-running slices and ensure consistent forward progress.

Understanding Slice Boundaries

In manual slicing, N boundary values create N-1 slices. For example, boundaries 100 500 1000 create two slices. The first slice includes rows with slice column values  >= 100 and < 500 and the second slice includes rows with values >= 500 and < 1000. To capture all records, set the first boundary less than or equal to the minimum value of the slice column, and the last boundary to a value greater than any slice column value that may occur during extraction.

Required inputs for manual slicing:

  • Slice Column: The column to slice on.  

  • Slice Boundaries: Space-separated values that define how to split the data into slices.

To configure and run manual slicing:

  1. Navigate to the Table Setup tab in XL Ingest and select the table from the list on the left.

  2. In the Table Settings panel, select a column to use for slicing. 

  3. In the Slices field, enter space-separated boundary values to define slice ranges.

  4. Optionally, enter a Where clause to filter rows during extraction. 

  5. Click Save to apply the configuration.

  6. Navigate to the Table Status tab, select the table, and click Sync to begin extraction.

Manual Slicing Example

Consider a table named customers with 10,000 records, where customer_id ranges from 1 to 10,000. Select customer_id as the slice column. In the Slices field, enter:

1 2500 5000 7500 99999

This configuration creates 4 slices: >= 1 and < 2500, >= 2500 and < 5000, >= 5000 and < 7500, >= 7500 and < 99999.

The final boundary is set to a high value to ensure that all records are covered.

Automatic Slicing

Automatic slicing analyzes data distribution and calculates optimal boundaries. 

The following inputs are required:

  • Slice Column: The column to slice on (e.g.customer_idorder_datelast_name).

  • Number of Slices: The desired number of slices (e.g., 10).

  • Prefix(chars): Number of leading digits or characters used to calculate slice boundaries.

To configure and run automatic slicing:

  1. Navigate to the Table Setup tab in XL Ingest and select the table from the list on the left.

  2. In the Table Settings panel, select a column to use for slicing. 

  3. Click Auto-Slice in the Define Slices section to open the Auto-Slice dialog.

  4. Click Schedule Count to retrieve table statistics from the source (count, max, min, size, time).

  5. Enter the desired No. of Slices (total number of partitions to create).

    Note: The actual number of slices created in automatic slicing may vary based on your data distribution.

  6. Enter Prefix (chars) to control slice boundary granularity. This value determines how many leading characters or digits are used to group values into slices.

  7. Review the calculated values including records per slice, slice size, and time per slice.

  8. Click Schedule Slice to generate the slices.

  9. Navigate to the Table Status tab, select the table, and click Sync to begin extraction.

Automatic Slicing Example

Consider a table named orders with 10,000 records. Select order_id as the slice column. Click Schedule Count to see statistics. Enter 4 for No. of Slices and 2 for Prefix(chars), then click Schedule Slice.

The system automatically calculates slices based on data distribution and prefix size.

Using SQL on the Source to Compute Slice Boundaries

When using manual slicing, you can calculate boundaries with assistance from the source SQL database. You can create boundaries for slices of roughly equal size (an approximation of an equi-depth histogram, a type of histogram where the bars are of roughly equal length, and the boundaries of each bar vary in width) using SQL with the NTILE window function as follows:

Set number_of_buckets to the (size of your data / 10GB ) + 1  to create slices of size just under 10GB for large tables.

WITH RankedData AS (
SELECT
your_column,
NTILE(number_of_buckets) OVER (ORDER BY your_column)
AS bucket_number
FROM your_table
),
BucketBoundaries AS (
SELECT
bucket_number,
MIN(your_column) AS lower_bound,
MAX(your_column) AS upper_bound,
COUNT(*) AS row_count
FROM
RankedData
GROUP BY
bucket_number
)
SELECT
bb.bucket_number,
bb.lower_bound,
bb.upper_bound,
bb.row_count
FROM
BucketBoundaries bb
ORDER BY
bb.bucket_number;

Note: If your slice column uses UUID or other data types that do not support MIN/MAX functions directly, cast the column to a string type.

Consider the following table:

CREATE TABLE orderdetails (
orderNumber int(11) NOT NULL DEFAULT '0',
productCode varchar(15) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL DEFAULT '',
quantityOrdered int(11) DEFAULT NULL,
priceEach decimal(10,2) DEFAULT NULL,
orderLineNumber smallint(6) DEFAULT NULL,
PRIMARY KEY (orderNumber, productCode)
);

To calculate 10 slices on orderNumber column, you can run:

WITH RankedData AS (
SELECT
orderNumber,
NTILE(10) OVER (ORDER BY orderNumber) AS bucket_number
FROM
orderdetails
),
BucketBoundaries AS (
SELECT
bucket_number,
MIN(orderNumber) AS lower_bound,
MAX(orderNumber) AS upper_bound
FROM
RankedData
GROUP BY
bucket_number
)
SELECT
CONCAT(
GROUP_CONCAT(lower_bound ORDER BY bucket_number SEPARATOR ' '),
' ',
MAX(upper_bound) + 1
) AS boundaries
FROM
BucketBoundaries;

Note: This example uses MySQL syntax. If using PostgreSQL, replace GROUP_CONCAT with STRING_AGG. For other databases, refer to your database documentation for the equivalent aggregation function.

This query returns a space-separated list of boundaries:

10100 10135 10167 10194 10226 10263 10293 10322 10355 10389 10426

The orderNumber column in this example has a minimum value of 10100 and a maximum value of 10425. If new records may be inserted during extraction, configure the last boundary to a higher value to ensure all records are included.

Sampling to Speed Up Boundary Calculation

For extremely large tables, you can use sampling on the source database table in SQL to speed up the scan. For example, sampling at one-hundredth of a percent on a 50 million row table can produce results with slice boundaries nearly identical to a full scan while reducing scan time to a small fraction of the full scan time.

Do not use sampling to calculate the minimum and maximum values of the slice column, as this can miss the actual minimum and maximum. Instead, calculate these values directly with a separate query or use domain knowledge to determine appropriate bounds for your data range.

Consult your source database documentation for details on sampling. For example, PostgreSQL supports the TABLESAMPLE clause for this purpose.

Last modified: February 12, 2026

Was this article helpful?

Verification instructions

Note: You must install cosign to verify the authenticity of the SingleStore file.

Use the following steps to verify the authenticity of singlestoredb-server, singlestoredb-toolbox, singlestoredb-studio, and singlestore-client SingleStore files that have been downloaded.

You may perform the following steps on any computer that can run cosign, such as the main deployment host of the cluster.

  1. (Optional) Run the following command to view the associated signature files.

    curl undefined
  2. Download the signature file from the SingleStore release server.

    • Option 1: Click the Download Signature button next to the SingleStore file.

    • Option 2: Copy and paste the following URL into the address bar of your browser and save the signature file.

    • Option 3: Run the following command to download the signature file.

      curl -O undefined
  3. After the signature file has been downloaded, run the following command to verify the authenticity of the SingleStore file.

    echo -n undefined |
    cosign verify-blob --certificate-oidc-issuer https://oidc.eks.us-east-1.amazonaws.com/id/CCDCDBA1379A5596AB5B2E46DCA385BC \
    --certificate-identity https://kubernetes.io/namespaces/freya-production/serviceaccounts/job-worker \
    --bundle undefined \
    --new-bundle-format -
    Verified OK

Try Out This Notebook to See What’s Possible in SingleStore

Get access to other groundbreaking datasets and engage with our community for expert advice.