# Configure Table Slicing

Slicing divides the source table into multiple independent chunks of data that can be transferred to the destination in parallel. Instead of copying the entire table in a single thread, XL Ingest splits the work across multiple threads, with each thread responsible for a portion of the data. When configuring slicing, select a column to slice on, typically the primary key or an indexed column with evenly-distributed values. XL Ingest divides the data into slices based on value ranges in the slice column. Each range becomes a slice that transfers independently.

**Note**: Date and time columns (for example `DATE`, `DATETIME`, `TIMESTAMP`) are not supported as slice columns in XL Ingest.

XL Ingest provides two approaches to slicing: **manual** and **automatic**.

## Manual Slicing

Manual slicing provides full control over how the data is divided by allowing the specification of exact values for slice boundaries as a space-separated list. Choose slice boundaries such that each slice is under 10 GB to prevent long-running slices and ensure consistent forward progress.

## Understanding Slice Boundaries

In manual slicing, `N` boundary values create `N-1` slices. For example, boundaries `100 500 1000` create two slices. The first slice includes rows with slice column values  `>= 100` and `< 500` and the second slice includes rows with values `>= 500` and `< 1000`. To capture all records, set the first boundary less than or equal to the minimum value of the slice column, and the last boundary to a value greater than any slice column value that may occur during extraction.

Required inputs for manual slicing:

* **Slice Column**: The column to slice on.  
* **Slice Boundaries**: Space-separated values that define how to split the data into slices.

To configure and run manual slicing:

1. Navigate to the **Table Setup** tab in XL Ingest and select the table from the list on the left.

2. In the **Table Settings** panel, select a column to use for slicing. 

3. In the **Slices** field, enter space-separated boundary values to define slice ranges.

4. Optionally, enter a **Where** clause to filter rows during extraction. In the **Where** field, enter only the condition, without the `WHERE` keyword.

   For example, enter:
   ```sql
   id > 1000
   ```

5. Click **Save** to apply the configuration.

6. Navigate to the **Table Status** tab, select the table, and click **Sync** to begin extraction.

## Manual Slicing Example

Consider a table named customers with `10,000` records, where `customer_id` ranges from `1` to `10,000`. Select `customer_id` as the slice column. In the **Slices** field, enter:

```
1 2500 5000 7500 99999
```

This configuration creates 4 slices: `>= 1` and `< 2500`, `>= 2500` and `< 5000`, `>= 5000` and `< 7500`, `>= 7500` and `< 99999`.

The final boundary is set to a high value to ensure that all records are covered.

## Automatic Slicing

Automatic slicing analyzes data distribution and calculates optimal boundaries. 

The following inputs are required:

* **Slice Column**: The column to slice on (e.g., `customer_id`, `order_date`, `last_name`).
* **Number of Slices**: The desired number of slices (e.g., 10).
* **Prefix(chars)**: Number of leading digits or characters used to calculate slice boundaries.

To configure and run automatic slicing:

1. Navigate to the **Table Setup** tab in XL Ingest and select the table from the list on the left.

2. In the **Table Settings** panel, select a column to use for slicing. 

3. Click **Auto-Slice** in the **Define Slices** section to open the Auto-Slice dialog.

4. Click **Schedule Count** to retrieve table statistics from the source (count, max, min, size, time).

5. Enter the desired **No. of Slices** (total number of partitions to create).

   **Note**: The actual number of slices created in automatic slicing may vary based on your data distribution.

6. Enter **Prefix (chars)** to control slice boundary granularity. This value determines how many leading characters or digits are used to group values into slices.

7. Optionally, enter a **Where** clause to filter rows during extraction. In the **Where** field, enter only the condition, without the `WHERE` keyword.

   For example, enter:
   ```sql
   id > 1000
   ```

8. Review the calculated values including records per slice, slice size, and time per slice.

9. Click **Schedule Slice&#x20;**&#x74;o generate the slices.

10. Navigate to the **Table Status** tab, select the table, and click **Sync** to begin extraction.

## Automatic Slicing Example

Consider a table named orders with `10,000` records. Select `order_id` as the slice column. Click **Schedule Count** to see statistics. Enter `4` for **No. of Slices** and `2` for **Prefix(chars)**, then click **Schedule Slice**.

The system automatically calculates slices based on data distribution and prefix size.

## Using SQL on the Source to Compute Slice Boundaries

When using manual slicing, you can calculate boundaries with assistance from the source SQL database. You can create boundaries for slices of roughly equal size (an approximation of an equi-depth [histogram](https://en.wikipedia.org/wiki/Histogram), a type of histogram where the bars are of roughly equal length, and the boundaries of each bar vary in width) using SQL with the `NTILE` window function as follows:

Set `number_of_buckets` to the `(size of your data / 10GB ) + 1`  to create slices of size just under `10GB` for large tables.

```sql
WITH RankedData AS (
	SELECT
	    your_column,
	    NTILE(number_of_buckets) OVER (ORDER BY your_column) 
	    AS bucket_number
	FROM your_table
),
BucketBoundaries AS (
	SELECT 
            bucket_number,
            MIN(your_column) AS lower_bound,
            MAX(your_column) AS upper_bound,
            COUNT(*) AS row_count
	FROM
	    RankedData
	GROUP BY
	    bucket_number
)
SELECT
	bb.bucket_number,
	bb.lower_bound,
	bb.upper_bound,
	bb.row_count
FROM
	BucketBoundaries bb
ORDER BY
	bb.bucket_number;

```

**Note**: If your slice column uses `UUID` or other data types that do not support `MIN/MAX` functions directly, cast the column to a string type.

Consider the following table:

```sql
CREATE TABLE orderdetails (
    orderNumber      int(11)        NOT NULL DEFAULT '0',
    productCode      varchar(15)    CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL DEFAULT '',
    quantityOrdered  int(11)        DEFAULT NULL,
    priceEach        decimal(10,2)  DEFAULT NULL,
    orderLineNumber  smallint(6)    DEFAULT NULL,
    PRIMARY KEY (orderNumber, productCode)
);

```

To calculate 10 slices on `orderNumber` column, you can run:

```sql
WITH RankedData AS (
   SELECT
       orderNumber,
       NTILE(10) OVER (ORDER BY orderNumber) AS bucket_number
   FROM
       orderdetails
),
BucketBoundaries AS (
   SELECT
       bucket_number,
       MIN(orderNumber) AS lower_bound,
       MAX(orderNumber) AS upper_bound
   FROM
       RankedData
   GROUP BY
       bucket_number
)
SELECT
   CONCAT(
       GROUP_CONCAT(lower_bound ORDER BY bucket_number SEPARATOR ' '),
       ' ',
       MAX(upper_bound) + 1
   ) AS boundaries
FROM
   BucketBoundaries;
```

**Note**: This example uses MySQL syntax. If using PostgreSQL, replace `GROUP_CONCAT` with `STRING_AGG`. For other databases, refer to your database documentation for the equivalent aggregation function.

This query returns a space-separated list of boundaries:

```
10100 10135 10167 10194 10226 10263 10293 10322 10355 10389 10426
```

The `orderNumber` column in this example has a minimum value of `10100` and a maximum value of `10425`. If new records may be inserted during extraction, configure the last boundary to a higher value to ensure all records are included.

## Sampling to Speed Up Boundary Calculation

For extremely large tables, you can use sampling on the source database table in SQL to speed up the scan. For example, sampling at one-hundredth of a percent on a 50 million row table can produce results with slice boundaries nearly identical to a full scan while reducing scan time to a small fraction of the full scan time.

Do not use sampling to calculate the minimum and maximum values of the slice column, as this can miss the actual minimum and maximum. Instead, calculate these values directly with a separate query or use domain knowledge to determine appropriate bounds for your data range.

Consult your source database documentation for details on sampling. For example, PostgreSQL supports the `TABLESAMPLE` clause for this purpose.

***

Modified at: April 15, 2026

Source: [/db/v9.1/load-data/load-data-with-singlestore-flow-on-helios/singlestore-xl-ingest/configure-table-slicing/](https://docs.singlestore.com/db/v9.1/load-data/load-data-with-singlestore-flow-on-helios/singlestore-xl-ingest/configure-table-slicing/)

(An index of the documentation is available at /llms.txt)
