# Multi-Value Hash Index (BSON)

Multi-value hash index (or "multi-value index") maps values to the row IDs containing the value. Multi-value index is useful when you want to search for elements within composite data structures like arrays, BSON documents, etc.

Multi-value index stores pairs consisting of a value and a list of row IDs in which the value occurs. Each value is stored only once, so a multi-value index remains compact even when the same value appears multiple times. The performance of queries that search for specific words in documents benefits if those keywords are indexed using multi-value index.

SingleStore's multi-value index is fundamentally a hash index and is built on the same effective technology as the regular columnar hash indices.

BSON type includes extending `BSON_MATCH_ANY` with equality filters inside to be optimized using multi-value index. The following query shapes benefit from multi-value index creation:

* `BSON_MATCH_ANY(MATCH_PARAM_BSON() = 'value', 'path')`
* `BSON_MATCH_ANY(MATCH_PARAM_BSON() IN ( 'value1', 'value2',  'value3'), 'path')`

## Syntax

## For SQL Queries

Adding multi-value index to new tables:

```sql
CREATE TABLE <tablename> 
  (<col1> BSON NULL, 
   MULTI VALUE INDEX(col1) INDEX_OPTIONS='<options>';
```

Adding multi-value index to existing tables:

```sql
ALTER TABLE <tableName> 
  ADD MULTI VALUE INDEX (col1) INDEX_OPTIONS='<options>';
```

where `'<options>'` is a JSON object to specify the tokenizer and other metadata related to it.

```
'{"tokenizer":"MATCH_ANY", "path":[<path_array>]}'
```

where `path_array` is a comma-separated list of values that specify the path to the field that needs to be indexed. If an empty path array, for example, `"PATH":[]`, is specified, then top-level elements in the array are indexed (such as \[1,2,3]). Given a BSON column and an optional path, the multi-value index extracts all values at that path.

## For MongoDB® Queries

```mongodb
db.<collection>.createIndex({"<path_name>":"multi_value"})
```

where `<path_name>` is the path to a property in the BSON document.

## Remarks

* Currently, multi-value index is supported only on BSON type columns.
* Multi-value index is case-sensitive (for BSON types).
* Multi-value index activates when matched using the following operators: `=`, `<=>`, and `IN()`.

  The following table lists the expressions that support multi-value index for each of these operators:
  | Operator  | Supported Expression                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
  | --------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
  | `=`,`<=>` | <ul> <li>Constant literals, for example strings, <code>NULL</code>, <code>DOUBLE</code>, <code>INT</code>, etc.</li> <li>Typecast operators <code>:></code> and <code>!:></code>, for example, <code>100001:>BSON</code></li> <li>User-defined variables (UDVs), for example: <pre><code class="language-sql">SET @obj = '{"city": "New York"}':>BSON;  SELECT a:>JSON FROM t  WHERE BSON_MATCH_ANY(MATCH_PARAM_BSON()=@obj, a, 'city'); </code></pre> </li> <li>Procedural SQL variables (stored procedure arguments and local variables)</li> <li>Nested constant (<strong>deterministic</strong>) built-in expressions, for example: <pre><code>... BSON_BUILD_ARRAY(BSON_SET_BSON('{}', 'a', @bson), '{"a":4}':>JSON) </code></pre> </li> <li>A combination of all of the above</li> </ul> |
  | `IN()`    | <ul> <li>Hexadecimal literal of type X'...' in non-parameterized <code>IN</code> lists, for example: <pre><code class="language-sql">SELECT ... BSON_MATCH_ANY(MATCH_PARAM_BSON() IN ( X'D204000010', X'A186010010'), product_ids, 'product_type') </code></pre> </li> <li>Parameterized <code>IN</code> lists with: <ul> <li>Literals of the same types</li> <li>Single argument built-in expressions of the same shape and literal types in the list, for example, <code>IN(HEX('a'), HEX('b'))</code></li> </ul></li> </ul>                                                                                                                                                                                                                                                                 |

## Unsupported Features or Expressions

Multi-value index is not activated when used in the following:

* Calls to user-defined functions (UDFs)
* References to other unindexed (not a multi-value index) fields on the right-hand side of the expression
* Non-deterministic built-in functions, for example `RAND()`
* Aggregate and window functions
* `MATCH_PARAM_<type>` expressions other than BSON type on the right-hand side, because it's evaluated at runtime
* `BSON_MATCH_ANY` predicate with `MATCH_ELEMENTS` option
* `!=` and other comparison operators (excluding `=` and `<=>`)
* `NOT IN` lists

## Examples

## Example 1 - Using SQL

Consider the following table named **orders**. Note that the multi-value index is added to the **product\_ids** column:

```sql
CREATE TABLE orders(  
  id BIGINT PRIMARY KEY,   
  created DATETIME(6),  
  product_ids BSON,  
  MULTI VALUE INDEX (product_ids) INDEX_OPTIONS='{"TOKENIZER":"MATCH_ANY","PATH":[]}');

INSERT INTO orders VALUES
  (1, '2025-03-03 12:34:56.000001', '100001':>BSON),
  (2, '2025-03-03 12:34:56.000002', '100002':>BSON),
  (3, '2025-03-03 12:34:56.000003', '100003':>BSON),
  (4, '2025-03-03 12:34:56.000004', '100004':>BSON);
```

Optimize the table:

```sql
OPTIMIZE TABLE orders FULL;
```

Perform the lookup:

```sql
SELECT id, created, product_ids:>JSON AS product_ids
  FROM orders 
  WHERE BSON_MATCH_ANY(MATCH_PARAM_BSON() = 100001:>BSON, product_ids);

```

```output

+----+----------------------------+-------------+
| id | created                    | product_ids |
+----+----------------------------+-------------+
|  1 | 2025-03-03 12:34:56.000001 | 100001      |
+----+----------------------------+-------------+
```

The `product_ids` column is cast to JSON for clarity.

The query execution benefits from the multi-value index, note the `ColumnStoreFilter ... multi-value hash index` in the output:

```sql
EXPLAIN SELECT id, created, product_ids:>JSON AS product_ids
  FROM orders 
  WHERE BSON_MATCH_ANY(MATCH_PARAM_BSON() = 100001:>BSON, product_ids); 

```

```output

+------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                |
+------------------------------------------------------------------------------------------------------------------------+
| Gather partitions:all alias:remote_0 parallelism_level:segment                                                         |
| Project [orders.id, orders.created, (orders.product_ids:>JSON COLLATE utf8mb4_bin NULL) AS product_ids]                |
| ColumnStoreFilter [BSON_MATCH_ANY(MATCH_PARAM_BSON() = (100001:>bson NULL),orders.product_ids) multi-value hash index] |
| ColumnStoreScan dbTest.orders, SORT KEY __UNORDERED () table_type:sharded_columnstore                                  |
+------------------------------------------------------------------------------------------------------------------------+

```

## Example 2 - Using Multi-Value Index at Scale

The following example shows how a large data set can benefit from multi-value index at scale. You add a multi-value index and then query the indexed column or path.&#x20;

Create a table **bookings**, and add multi-value index to the **product\_ids** column:

```sql
CREATE TABLE bookings(  
  id BIGINT PRIMARY KEY,   
  created DATETIME(6),  
  product_ids BSON,  
  MULTI VALUE INDEX (product_ids) INDEX_OPTIONS='{"TOKENIZER":"MATCH_ANY","PATH":[]}'
);

-- Add 1Million rows, each with 5 products sampled from 100000 product ids

DELIMITER //

DO DECLARE    
   arr ARRAY(RECORD(id BIGINT, created DATETIME(6), product_ids BSON)) = 
   CREATE_ARRAY(1000000);    
     product_ids BSON;    
     n BIGINT;
BEGIN    
   FOR i IN 0..999999 LOOP        
       product_ids = CONCAT('[',CONCAT_WS(',', (RAND()*100000):>INT, 
                     (RAND()*100000):>INT, (RAND()*100000):>INT, 
                     (RAND()*100000):>INT, (RAND()*100000):>INT),']');        
                        arr[i] = ROW(i, NOW(), product_ids);    
   END LOOP;    
   n = INSERT_ALL('bookings', arr);
END //

DELIMITER ;
```

Optimize the table:

```sql
OPTIMIZE TABLE bookings FULL;
```

Perform the lookup:

```sql
SELECT id, created, product_ids:>JSON AS product_ids 
  FROM bookings 
  WHERE BSON_MATCH_ANY(MATCH_PARAM_BSON() = 94567:>BSON, product_ids);

```

```output

+--------+----------------------------+---------------------------------+
| id     | created                    | product_ids                     |
+--------+----------------------------+---------------------------------+
| 719886 | 2025-03-03 12:50:49.000000 | [77807,55647,59828,9374,94567]  |
| 500322 | 2025-03-03 12:50:49.000000 | [94567,58653,77465,55399,16804] |
| 207632 | 2025-03-03 12:50:48.000000 | [63421,19400,16959,65091,94567] |
| 943351 | 2025-03-03 12:50:50.000000 | [4616,41546,31047,94567,68472]  |
| 363185 | 2025-03-03 12:50:48.000000 | [50258,88567,99311,94567,2009]  |
| 803463 | 2025-03-03 12:50:49.000000 | [22582,64484,15540,44440,94567] |
| 698599 | 2025-03-03 12:50:49.000000 | [84654,32124,94567,26840,90261] |
| 105530 | 2025-03-03 12:50:47.000000 | [57273,21771,31151,4171,94567]  |
| 781873 | 2025-03-03 12:50:49.000000 | [2862,12649,71063,90341,94567]  |
--- Output truncated ---
+--------+----------------------------+---------------------------------+
```

The  `product_ids` column is cast to JSON for clarity.

The query execution benefits from the multi-value index (the `ColumnStoreFilter ... multi-value hash index` in the output indicates that the query used multi-value index):

```sql
EXPLAIN SELECT id, created, product_ids:>JSON AS product_ids 
  FROM bookings 
  WHERE BSON_MATCH_ANY(MATCH_PARAM_BSON() = 94567:>BSON, product_ids);

```

```output

+-------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                 |
+-------------------------------------------------------------------------------------------------------------------------+
| Gather partitions:all alias:remote_0 parallelism_level:segment                                                          |
| Project [bookings.id, bookings.created, (bookings.product_ids:>JSON COLLATE utf8mb4_bin NULL) AS product_ids]           |
| ColumnStoreFilter [BSON_MATCH_ANY(MATCH_PARAM_BSON() = (94567:>bson NULL),bookings.product_ids) multi-value hash index] |
| ColumnStoreScan dbTest.bookings, SORT KEY __UNORDERED () table_type:sharded_columnstore                                 |
+-------------------------------------------------------------------------------------------------------------------------+

```

***

Modified at: January 29, 2026

Source: [/db/v9.1/reference/sql-reference/bson-functions/multi-value-hash-index-bson/](https://docs.singlestore.com/db/v9.1/reference/sql-reference/bson-functions/multi-value-hash-index-bson/)

(An index of the documentation is available at /llms.txt)
