CREATE TABLE with an AUTO_INCREMENT Column

AUTO_INCREMENT can be used to automatically generate a unique value for new rows. When you insert a new row, and the AUTO_INCREMENT field is DEFAULT, NULL, or 0, SingleStore automatically assigns a value. This value cannot be set to 0. It’s important to understand that AUTO_INCREMENT only guarantees that automatically-generated values are unique. In general, it does not guarantee that they:

are consecutive or sequential
are monotonically increasing
start from any particular value
are distinct from explicitly-set values

If you explicitly set a value in an INSERT or UPDATE statement, it may collide with past or future automatically-generated values. For example, in the following example, a value is added explicitly to the AUTO_INCREMENT column. This will break the table unless AGGREGATOR SYNC AUTO_INCREMENT is run by the end user (which will reset the auto-increment counter to a value higher than 10). In this case, a duplicate key will not be generated when the AUTO_INCREMENT value reaches 10 eventually on its own as additional rows are added to the table.

SQL

CREATE TABLE ct_auto (c1 INT AUTO_INCREMENT PRIMARY KEY);
INSERT INTO ct_auto (c1) VALUES (10);

A table can have only one AUTO_INCREMENT column. The AUTO_INCREMENT column must be included in an index (not necessarily a PRIMARY or UNIQUE key, a regular key is also allowed).

If syntax such as CREATE TABLE table_1 SELECT * FROM table_2 is used to create table_1 where table_2 has an AUTO_INCREMENT column, it will be created as a non-auto-increment column in table_1.

See LAST_INSERT_ID for more information on AUTO_INCREMENT behavior.

If the AUTO_INCREMENT behavior described here does not satisfy your requirements you can create your own sequence generator using LAST_INSERT_ID. See the sequence generator stored procedure example.

Warning

Restarting an aggregator, such as during upgrades or host machine maintenance, will introduce a large gap between any AUTO_INCREMENT values inserted before the restart and any values inserted after. In the case of reference tables, this same behavior might also occur when a child aggregator is promoted to master aggregator. Depending on how often you restart your aggregators, you could see many jumps in values from a specific aggregator.

These jumps are because each aggregator defines and manages its own range of values to start incrementing from to prevent collisions in a table. With each restart, a new batch of values is used. For sharded tables, the range of AUTO_INCREMENT values increases to the next 1,000,000 after each restart (e.g. 2,430,403 before restart -> 3,000,000 after). For reference tables, the batch size jumps to the next 1,000.

AUTO_INCREMENT in Sharded Tables

On a sharded (distributed) table, AUTO_INCREMENT can only be used on a BIGINT column (as they usually use the entire 64 bits). Each aggregator computes and tracks its own AUTO_INCREMENT values and uses those values when new rows are added to a table. The values on each aggregator are usually, but not always, sequential; therefore, inserts on an individual aggregator generate values which are unique and usually sequential. And because each aggregator manages its own AUTO_INCREMENT values, the automatically-generated values from inserts across multiple aggregators are only unique, never sequential.

Note

If an AUTO_INCREMENT column in a sharded table is declared with a smaller integer type, it is implicitly converted to BIGINT.

Here is an example to illustrate how AUTO_INCREMENT values are generated across aggregators in a cluster as new rows are inserted into table ct_tb:

SQL

SELECT * FROM ct_tb ORDER BY b;

+-------------------+------+------------+
| a                 | b    | c          |
+-------------------+------+------------+
|                 1 |    1 | from MA    |
|                 2 |    2 | from MA    |
|                 3 |    3 | from MA    |
|                 4 |    4 | from MA    |
|                 5 |    5 | from MA    |
| 13510798882111489 |    6 | from CA 96 |
| 13510798882111490 |    7 | from CA 96 |
| 13510798882111491 |    8 | from CA 96 |
| 13510798882111492 |    9 | from CA 96 |
| 13510798882111493 |   10 | from CA 96 |
| 14636698788954113 |   11 | from CA 20 |
| 14636698788954114 |   12 | from CA 20 |
| 14636698788954115 |   13 | from CA 20 |
| 14636698788954116 |   14 | from CA 20 |
| 14636698788954117 |   15 | from CA 20 |
|                 6 |   16 | from MA    |
| 15762598695796737 |   17 | from CA 17 |
| 13510798882111494 |   18 | from CA 96 |
|                 7 |   19 | from MA    |
| 14636698788954118 |   20 | from CA 20 |
+-------------------+------+------------+

As shown in the example above, automatically-generated AUTO_INCREMENT values can differ depending on which aggregator you run the inserts on. Of course, if you ran some inserts on one aggregator and some inserts on another aggregator, you would get different automatically generated values. Also note that automatically-generated values and explicitly-set values can collide in sharded tables.

AUTO_INCREMENT in Reference Tables

The AUTO_INCREMENT value for a reference table is tracked by the master aggregator. It is guaranteed that the next AUTO_INCREMENT value will always be greater than any value previously seen in this column. These generated values are usually sequential, but not always. Contrarily to the behavior for sharded tables, explicitly setting a value in an INSERT or UPDATE statement will not create a collision with future automatically generated values.

The next example shows some queries using AUTO_INCREMENT fields on reference tables.

SQL

CREATE REFERENCE TABLE ct_ref_1(id INT AUTO_INCREMENT PRIMARY KEY);
INSERT INTO ct_ref_1 VALUES();
INSERT INTO ct_ref_1 VALUES(5);
INSERT INTO ct_ref_1 VALUES();

SQL

SELECT id FROM ct_ref_1 ORDER BY id;

+----+
| id |
+----+
|  1 |
|  5 |
|  6 |
+----+

SQL

UPDATE ct_ref_1 SET id = 9 WHERE id = 5;
INSERT INTO ct_ref_1();

SQL

SELECT id FROM ct_ref_1 ORDER BY id;

+----+
| id |
+----+
|  1 |
|  6 |
|  9 |
| 10 |
+----+

SQL

DELETE FROM ct_ref_1;
INSERT INTO ct_ref_1 VALUES();

SQL

SELECT id FROM ct_ref_1 ORDER BY id;

+----+
| id |
+----+
| 11 |
+----+

Setting AUTO_INCREMENT Starting Values

It is possible to override the starting AUTO_INCREMENT value for reference tables by setting the AUTO_INCREMENT option on a CREATE TABLE statement. The starting value cannot be set to 0.

The following example shows how to set the AUTO_INCREMENT start value during table creation:

SQL

CREATE REFERENCE TABLE ct_ref_2 (id int AUTO_INCREMENT PRIMARY KEY) AUTO_INCREMENT = 7;
INSERT INTO ct_ref_2 VALUES (), ();

SQL

SELECT * FROM ct_ref_2;

+----+
| id |
+----+
|  7 |
|  8 |
+----+

This syntax has no effect on sharded tables. It will not return an error, for compatibility with external tools, but it will explicitly present a warning and no operation will be done.

AUTO_INCREMENT During Replication

When replicating data between clusters, the secondary cluster has all the replicated AUTO_INCREMENT values from the primary cluster. When you failover to a secondary cluster, SingleStore synchronizes the secondary cluster by looking for the maximum value in the range of AUTO_INCREMENT values on every aggregator.

SQL

CREATE TABLE IF NOT EXISTS my_MemSQL_table (id INT PRIMARY KEY AUTO_INCREMENT, v VARCHAR(10) NOT NULL);

AUTO_INCREMENT Data Type Requirements in Client Applications

When reading AUTO_INCREMENT values into client applications, ensure that the application-side data type can store the entire range of values generated by an AUTO_INCREMENT column. For example, in sharded tables, AUTO_INCREMENT can only be used on a BIGINT column and generated identifiers typically use the full 64-bit range. If an AUTO_INCREMENT column in a sharded table is declared with a smaller integer type, it is implicitly converted to BIGINT. Values that are truncated or rounded when read into a smaller or floating-point type may fail to correctly match the originally generated AUTO_INCREMENT value and can consequently lead to the following:

duplicate key errors when they are written back to the database, or
client-side overflow errors or other loss of information in the client application.

If your application uses ORMs, frameworks, or runtime libraries that abstract underlying SQL types, verify that the mapped type used for AUTO_INCREMENT identifiers can hold the generated values.

Warning

Although AUTO_INCREMENT values may initially fall in the range of a smaller integer type, they can eventually grow to require the full 64-bit range. Therefore, applications must use a data type that supports 64-bit integer values to prevent overflow-related errors in production environments.

Do not use the following types in client applications to store auto-increment values:

32-bit integer types
Double-precision floating-points types
number type in JavaScript or TypeScript

The following table specifies the recommended type for common programming languages:

Language	Recommended Data Type
JavaScript	`bigint` (the `BigInt` primitive type)
Python	`int`
C/C++	`int64_t` (from `<stdint.h>`)
C#/.NET	`long` (`System.Int64`)
Go	`int64`
Rust	`i64`
Java	`long`
Other languages	Signed integer type with at least 64 bits of precision.

If the application code or framework does not support 64-bit identifiers or if smaller identifier types are preferred, use the following:

Generate identifiers in the application logic: Generate primary key values within the application using a scheme compatible with the application’s preferred identifier type. In this approach, AUTO_INCREMENT is not used for the primary key, and the application is responsible for ensuring uniqueness, handling concurrency, and preventing identifier conflicts.

SingleStore recommends using 64-bit integer types for AUTO_INCREMENT columns and in client applications to avoid precision loss, identifier range conflicts, and duplicate key errors.

AUTO_INCREMENT Best Practices

AUTO_INCREMENT On Sharded Tables

On sharded tables, define the AUTO_INCREMENT column as BIGINT. Each aggregator maintains and allocates values independently; therefore, expect large, non-sequential values and substantial gaps, especially following either of the following operations:

Syncing auto-increment values after a manual insert or update
Node restart

Note

AUTO_INCREMENT gaps are expected on sharded tables and do not indicate data loss.

Ensure AUTO_INCREMENT Data Type Requirements in Client Applications are met.

Manual Inserts, Updates, or Deletes

Avoid manual inserts or updates to the AUTO_INCREMENT column. If values are manually inserted or updated in the AUTO_INCREMENT column, run AGGREGATOR SYNC AUTO_INCREMENT to sync the auto-increment values across all the aggregators and reset the counter to a value greater than the current maximum.

SQL

AGGREGATOR SYNC AUTO_INCREMENT [ON <database>[.<table>]] ALL;

This command is not required for delete operations because SingleStore does not reuse auto-increment values by default. Do not manually reuse auto-increment values after a delete operation. Gaps are expected and reusing values may lead to duplicate key errors.

AUTO_INCREMENT On Reference Tables

AUTO_INCREMENT values on reference tables are always strictly increasing. Following any manual inserts or updates to the AUTO_INCREMENT column in a reference table, subsequent automatically generated values are always greater than any existing value in the AUTO_INCREMENT column. Therefore, manually syncing the auto-increment values is not required on reference tables.

Schema and Data Type Design

Use AUTO_INCREMENT columns as surrogate keys, and enforce business semantics in a separate column or a separate unique key. Use a separate column for the following:
- Encoding semantic dimensions (e.g., tenant, region, type)
- Multi-tenant or domain-scoped IDs
Use BIGINT (or BIGINT UNSIGNED) as the AUTO_INCREMENT column data type for high-volume tables.

Semantics and Application Behavior

Because AUTO_INCREMENT values can have gaps due to rollbacks, failed inserts, etc., the sequence isn't contiguous. Design your application code to accept and accommodate this.
Avoid using AUTO_INCREMENT columns for ordering, especially with multi‑threaded ingestion or multi‑region architectures. If “most recent” or “first” semantics are required, order by a DATETIME[(6)] column or another dedicated ordering column, not by the AUTO_INCREMENT value.

Capacity Planning and Monitoring Usage

For high-volume ingest/update tables, plan AUTO_INCREMENT capacity based on expected ingest rates and identifier consumption over time.

Monitor how quickly the values are consumed and ensure there is sufficient capacity in the identifier space. Use a larger data type if necessary. Optionally, create alerts (e.g. with a scheduled notebook or other scheduled job) for defined thresholds to plan schema or application changes, for example, 70% of maximum supported value of the column data type.