# Performing Upserts

An upsert inserts a row `r` into a table if the table does not already contain an existing row with the same primary key or unique key as `r`. Otherwise, the upsert updates the existing row.

An upsert can be performed by specifying the `ON DUPLICATE KEY UPDATE` clause of the `INSERT` statement. The following applies for an upsert:

* If a row is to be inserted that would result in a duplicate value in a `PRIMARY KEY` or `UNIQUE KEY`, SingleStore Helios will instead perform an `UPDATE` of the old row.
* An upsert can only be done against a `PRIMARY KEY` or `UNIQUE KEY` column.
* `ON DUPLICATE KEY UPDATE` cannot update unique key columns in columnstore tables. For rowstore tables, `ON DUPLICATE KEY UPDATE` is allowed as long as the updated column isn't part of the table's explicitly defined shard key or primary key if there's no explicitly defined shard key.
* An upsert conflict occurs when there are multiple rows that could potentially be updated, because there are multiple unique keys defined on the table. When such a conflict occurs, only one row is updated. There is no guarantee as to which one.
* The affected row count will be 0 if no existing row is changed, 1 if a new row is inserted, and 2 if an existing row is updated. In Example 1 below, the `ON DUPLICATE KEY UPDATE` clause updates the existing row instead of inserting a new row to avoid duplicate entry in a `PRIMARY KEY` field, and the query result shows that 2 rows were affected.

## Examples

The following examples use the `cust` table, which is defined as a columnstore table as follows:

```sql
CREATE TABLE cust(name VARCHAR(32), id INT(11), orders INT(11), SORT KEY(id), UNIQUE KEY(id) USING HASH, SHARD KEY(id));
```

```sql
DESC cust;

```

```output

+--------+-------------+------+------+---------+-------+
| Field  | Type        | Null | Key  | Default | Extra |
+--------+-------------+------+------+---------+-------+
| name   | varchar(32) | YES  |      | NULL    |       |
| id     | int(11)     | YES  | PRI  | NULL    |       |
| orders | int(11)     | YES  |      | NULL    |       |
+--------+-------------+------+------+---------+-------+
```

Add three rows of sample data for the examples:

```sql
INSERT INTO cust (name, id, orders) VALUES ("Chris",7214,2), ("Adam",3412,5), ("Elen",8301,4);
```

You could also define `cust` as a rowstore table, as follows, and the examples would return the same results.

```sql
CREATE ROWSTORE TABLE cust(name VARCHAR(32), id INT(11) PRIMARY KEY, orders INT(11));
```

## Example 1: Basic Example of ON DUPLICATE KEY UPDATE

The following example demonstrates the use of `INSERT` with `ON DUPLICATE KEY UPDATE`.

```sql
SELECT * FROM cust;

```

```output

+-------+------+--------+
| NAME  | ID   | ORDERS |
+-------+------+--------+
| Chris | 7214 |      2 |
| Elen  | 8301 |      4 |
| Adam  | 3412 |      5 |
+-------+------+--------+
```

```sql
INSERT INTO cust (ID, ORDERS) VALUES (7214, 3)
ON DUPLICATE KEY UPDATE ORDERS=3;

```

```output

Query OK, 2 rows affected
```

```sql
SELECT * FROM cust;

```

```output

+-------+------+--------+
| NAME  | ID   | ORDERS |
+-------+------+--------+
| Chris | 7214 |      3 |
| Elen  | 8301 |      4 |
| Adam  | 3412 |      5 |
+-------+------+--------+
```

## Example 2: Using the VALUES() Function

The `VALUES()` function can be used to refer to column values from the `INSERT` clause of the `INSERT ... ON DUPLICATE KEY UPDATE` statement. For example:

```sql
SELECT * FROM cust;

```

```output

+-------+------+--------+
| NAME  | ID   | ORDERS |
+-------+------+--------+
| Chris | 7214 |      2 |
| Adam  | 3412 |      5 |
| Elen  | 8301 |      4 |
+-------+------+--------+
```

```sql
INSERT INTO cust (ID, ORDERS) VALUES (7214, 4)
  ON DUPLICATE KEY UPDATE ORDERS = VALUES(ORDERS) + ORDERS;

```

```output

Query OK, 2 rows affected
```

```sql
SELECT * FROM cust;

```

```output

+-------+------+--------+
| NAME  | ID   | ORDERS |
+-------+------+--------+
| Chris | 7214 |      6 |
| Adam  | 3412 |      5 |
| Elen  | 8301 |      4 |
+-------+------+--------+
```

Here, the `VALUES()` function refers to the value that would have been inserted, if there were no duplicate-key conflicts, and should refer to a column name of the destination table.

## Example 3: Using SELECT with ON DUPLICATE KEY UPDATE

The following code block demonstrates the use of `SELECT` with `ON DUPLICATE KEY UPDATE`:

```sql
SELECT * FROM cust_new;

```

```output

+------+------+--------+
| NAME | ID   | ORDERS |
+------+------+--------+
| Bill |   21 |      5 |
| Gwen | 7214 |      3 |
| Sam  |   22 |      2 |
+------+------+--------+
```

The `cust_new` table is defined using the same table definition as `cust`, which is shown in the beginning of the examples.

```sql
SELECT * FROM cust;

```

```output

+-------+------+--------+
| NAME  | ID   | ORDERS |
+-------+------+--------+
| Chris | 7214 |      6 |
| Elen  | 8301 |      4 |
| Adam  | 3412 |      5 |
+-------+------+--------+
```

```sql
INSERT INTO cust (NAME, ID, ORDERS) SELECT * FROM cust_new ON DUPLICATE KEY UPDATE NAME = VALUES(NAME), ORDERS =  VALUES (ORDERS);

```

```output

Query OK, 4 rows affected
Records: 3  Duplicates: 1  Warnings: 0
```

```sql
SELECT * FROM cust;

```

```output

+------+------+--------+
| NAME | ID   | ORDERS |
+------+------+--------+
| Bill |   21 |      5 |
| Gwen | 7214 |      3 |
| Elen | 8301 |      4 |
| Sam  |   22 |      2 |
| Adam | 3412 |      5 |
+------+------+--------+
```

Here, the `VALUES()` function refers to the value that would have been inserted, if there were no duplicate-key conflicts, and should refer to a column name of the destination table.

## Example 4: Updating Values through a Two-step Process

The above examples are functionally equivalent to this common two-step operation.

**Note:** While this works, it is much less performant than the single-steps above.

```sql
SELECT * FROM cust;

```

```output

+-------+------+--------+
| NAME  | ID   | ORDERS |
+-------+------+--------+
| Chris | 7214 |      6 |
| Elen  | 8301 |      4 |
| Adam  | 3412 |      5 |
+-------+------+--------+

```

```sql
SELECT * FROM cust_new;

```

```output

+------+------+--------+
| NAME | ID   | ORDERS |
+------+------+--------+
| Bill |   21 |      5 |
| Gwen | 7214 |      3 |
| Sam  |   22 |      2 |
+------+------+--------+

```

The `cust_new` table is defined using the same table definition as `cust`, which is shown in the beginning of the examples.

Copy all the unique values from the `cust_new` to `cust` table.

```sql
INSERT IGNORE INTO cust SELECT * FROM cust_new;

SELECT * FROM cust;

```

```output

+-------+------+--------+
| NAME  | ID   | ORDERS |
+-------+------+--------+
| Bill  |   21 |      5 |
| Chris | 7214 |      6 |
| Elen  | 8301 |      4 |
| Sam   |   22 |      2 |
| Adam  | 3412 |      5 |
+-------+------+--------+

```

Now, perform an update to copy all the non-unique (duplicate) values.

```sql
UPDATE cust
JOIN cust_new ON ( cust_new.ID=cust.ID )
SET cust.NAME = cust_new.NAME,
    cust.ORDERS = cust_new.ORDERS;

SELECT * FROM cust;

```

```output

+------+------+--------+
| NAME | ID   | ORDERS |
+------+------+--------+
| Bill |   21 |      5 |
| Gwen | 7214 |      3 |
| Elen | 8301 |      4 |
| Sam  |   22 |      2 |
| Adam | 3412 |      5 |
+------+------+--------+

```

## Example 5: Using ON DUPLICATE KEY UPDATE with a Pipeline

The following example demonstrates an upsert on a Pipeline.

Assume the directory `/order_files` has one file `orders1.txt`, which contains the following data:

```
Chris,7214,6
Elen,8301,4
Adam,3412,5
Rachel,9125,2
Susan,8301,7
George,3412,9
```

Create a Pipeline with an `ON DUPLICATE KEY UPDATE` clause:

```sql
CREATE PIPELINE p AS LOAD DATA FS '/order_files/orders*.txt' 
INTO TABLE cust FIELDS TERMINATED BY ',' ON DUPLICATE KEY UPDATE orders = 3;
```

Start the Pipeline:

```sql
START PIPELINE p;
```

As the Pipeline ingests the data from `orders1.txt` into the `cust` table, it encounters the fifth and sixth records in the file, which contain the duplicate keys `8301` and `3412`. The second and third records containing those duplicate keys (which have already been imported into `cust`), are updated with `3`, because `orders = 3` is specified in the `ON DUPLICATE KEY UPDATE` clause.

```sql
SELECT * FROM cust ORDER BY name;

```

```output

+--------+------+--------+
| name   | id   | orders |
+--------+------+--------+
| Adam   | 3412 |      3 |
| Chris  | 7214 |      6 |
| Elen   | 8301 |      3 |
| Rachel | 9125 |      2 |
+--------+------+--------+
```

Example 5 works because the new value for `orders` is specified directly (`orders = 3`). If you try to update the value with an expression (such as a value from the input file), you will see an error like this:

```
Feature 'UPDATE value depends on tables other than that being updated.' is not supported by MemSQL.
```

In order to update existing values with a new value specified in the pipeline, use the `VALUES()` clause. So if the orders.txt file looks instead like this:

```
Chris,7214,6
Elen,8301,99
Adam,3412,99
Rachel,9125,2
Susan,8301,99
George,3412,9
```

And we update the ON DUPLICATE KEY UPDATE statement:

```sql
CREATE PIPELINE p2 AS LOAD DATA FS '/order_files/orders*.txt' 
INTO TABLE cust FIELDS TERMINATED BY ',' ON DUPLICATE KEY UPDATE orders = VALUES(orders);

START PIPELINE p2;
```

The updated results:

```
SELECT * FROM cust ORDER BY name;

```

```output

+--------+------+--------+
| name   | id   | orders |
+--------+------+--------+
| Adam   | 3412 |      9 |
| Chris  | 7214 |      6 |
| Elen   | 8301 |     99 |
| Rachel | 9125 |      2 |
+--------+------+--------+
```

## Example 6: Using `ON DUPLICATE KEY UPDATE` with a Multi-Column Key

This example uses the `product` table, which is defined as follows, with the multi-column unique key `(id_1,id_2)`.

```sql
CREATE TABLE product(name VARCHAR(32), id_1 INT(11), id_2 INT(11), quantity INT(11), SORT KEY (id_1,id_2), UNIQUE KEY(id_1,id_2) USING HASH, SHARD KEY(id_1,id_2));
```

Assume the table contains the following data:

```
+-------------+------+------+----------+
| name        | id_1 | id_2 | quantity |
+-------------+------+------+----------+
| blue pen    | 2792 |    3 |      200 |
| red pen     | 2792 |    5 |      250 |
| white paper | 4624 |    1 |      100 |
| brown paper | 4624 |    8 |      175 |
+-------------+------+------+----------+
```

Attempt to insert a record with the quantity `325`. Because a record containing the key `(2792,5)` is already in the table, the existing record is updated with `325`:

```sql
INSERT INTO product(name,id_1,id_2,quantity) VALUES ('red pen',2792,5,325) ON DUPLICATE KEY UPDATE quantity = 325;
```

Attempt to insert a record with the quantity `125`. Because there is no existing record containing the key `(4624,7)`, the insert succeeds.

```sql
INSERT INTO product(name,id_1,id_2,quantity) VALUES ('yellow paper',4624,7,125) ON DUPLICATE KEY UPDATE quantity = 125;

```

```output

+--------------+------+------+----------+
| name         | id_1 | id_2 | quantity |
+--------------+------+------+----------+
| blue pen     | 2792 |    3 |      200 |
| red pen      | 2792 |    5 |      325 |
| white paper  | 4624 |    1 |      100 |
| yellow paper | 4624 |    7 |      125 |
| brown paper  | 4624 |    8 |      175 |
+--------------+------+------+----------+
```

## Example 7: Using `ON DUPLICATE KEY UPDATE` with JSON data from Kafka

The following example demonstrates the use of `ON DUPLICATE KEY UPDATE` with input data in the JSON format from Kafka.

Define a table where the data will be inserted:

```sql
CREATE TABLE users (
    user_id INT PRIMARY KEY,
    user_name VARCHAR(100),
    user_email VARCHAR(100)
);
```

Sample JSON  data input from Kafka can be:

```
{
    "user_id": 1,
    "user_name": "John Doe",
    "user_email": "johndoe@example.com"
}
```

Create a pipeline to ingest this data from Kafka:

```sql
CREATE PIPELINE user_pipeline AS
  LOAD DATA KAFKA 'kafka_server/my-topic'
  INTO TABLE users
  FORMAT JSON 
    (user_id <- user_id,
     user_name <- user_name,
     user_email <- user_email)
  ON DUPLICATE KEY UPDATE
     user_name = VALUES(user_name), 
     user_email = VALUES(user_email);
```

Run the pipeline USER\_PIPELINE:

```sql
START PIPELINE user_pipeline;
```

View the contents of the table USERS:

```
SELECT * FROM users;

```

```output

+----------+-----------+--------------------+
| user_id  | User_name | user_email         |
+----------+-----------+--------------------+
|    1     | John Doe. | johndoe@example.com|
+----------+-----------+--------------------+ 
```

***

Modified at: December 16, 2025

Source: [/cloud/query-data/advanced-query-topics/performing-upserts/](https://docs.singlestore.com/cloud/query-data/advanced-query-topics/performing-upserts/)

(An index of the documentation is available at /llms.txt)
