Warning
SingleStore 9.0 gives you the opportunity to preview, evaluate, and provide feedback on new and upcoming features prior to their general availability. In the interim, SingleStore 8.9 is recommended for production workloads, which can later be upgraded to SingleStore 9.0.
Multi-Value Hash Index (JSON)
On this page
Multi-Value Hash index optimizes lookups within JSON objects and arrays.
For example, if an orders table has a JSON column details with the following data:
{"product_id":[{"id":1001,"type":"flour"},{"id":1002,"type":"yeast"}]}
An index created on the details column at the path ["product_
To index values at the top level in an array, specify an empty path array, such as, "PATH":[]
.[1,2,3]
, you can accelerate searches for values within the array.
The multi-value hash index stores pairs of values along with the list of row IDs where that value occurs.
This index is fundamentally a hash index, leveraging the same high-performance technology as the regular columnar hash indexes.
Syntax
-- Adding the multi-value index to a new table --CREATE TABLE <tablename>(<col> JSON NULL,MULTI VALUE INDEX(col) INDEX_OPTIONS='<options>';-- Adding the multi-value index an to existing table column --ALTER TABLE <tableName>ADD MULTI VALUE INDEX (col) INDEX_OPTIONS='<options>';
where <options>
is a JSON object to specify the tokenizer and other metadata related to it.
'{"TOKENIZER":"MATCH_ANY", "PATH":[<path_array>]}'
where <path_
is a comma-separated list of values that specify the path to the field that needs to be indexed."PATH":[]
, is specified, then the top-level elements in the array are indexed (such as [1,2,3]).
Using the Multi-Value Hash Index
The JSON_
built-in function is used to compare values inside a JSON array and to extract values from a JSON object.null
) within a JSON document.
Note
Currently, multi-value hash Index for JSON supports matching only via the MATCH_
expression.
You can also match scalar values (such as strings or doubles) inside the JSON object.
-
Using the
TO_
built-in function.JSON('<str>') -
Manually specifying a quoted constant string literal, e.
g. , .
.. . = '"address"'
Note: Typecasting to JSON using :>JSON
does not automatically wrap an unquoted string into a valid JSON string.2.
.
Supported Query Shapes
JSON_MATCH_ANY(MATCH_PARAM_JSON() = <value>, <key_path>)
JSON_MATCH_ANY(MATCH_PARAM_JSON() <=> <value>, <key_path>)
JSON_MATCH_ANY(MATCH_PARAM_JSON() IN (<value1>, <value2>..), '<key_path>')
<value>
can be a constant literal (such as a string, integer, float), a typecast operator (:>
, !:>
), a user-defined variable (UDV), a deterministic built-in expression (most built-in functions, except for non-deterministic ones like UUID()
or RAND()
), or a combination of these.
In general, the lookup value must remain constant during query execution.
The index is only used for queries that match the exact <key_
defined during index creation.
To check whether the index was used for your query filter, run an EXPLAIN
query - the phrase multi-value hash index in the predicate under ColumnStoreFilter indicates that the index was used.
EXPLAIN SELECT col FROM tWHERE JSON_MATCH_ANY(MATCH_PARAM_JSON() = TO_JSON('type'), col, 'key1');
+--------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
|+-------------------------------------------------------------------------------------------------------------+
| Project [t.col] |
| ColumnStoreFilter [JSON_MATCH_ANY(MATCH_PARAM_JSON() = TO_JSON('type'),t.col,'key1') multi-value hash index] |
| ColumnStoreScan db.t, SORT KEY __UNORDERED () table_type:sharded_columnstore |
|+-------------------------------------------------------------------------------------------------------------+
Remarks
-
Multi-value hash index is case-sensitive for JSON types.
-
Multi-value hash index for JSON activates when matched using the following operators:
=
,<=>
, andIN()
. -
The following table lists the expressions that support Multi-Value Hash Index for each of these operators:
Operator
Supported Expression
=
,<=>
-
Constant literals, for example strings,
NULL
,DOUBLE
,INT
, etc. -
Typecast operators
:>
and!:>
, for example,100001:>JSON
. -
User-defined variables (UDVs), for example:
SET @obj = '{"city": "New York"}':>JSON;SELECT a:>JSON FROM tWHERE JSON_MATCH_ANY(MATCH_PARAM_JSON()=@obj, a, 'city'); -
Procedural SQL variables (stored procedure arguments and local variables).
-
Nested constant (deterministic) built-in expressions, for example:
JSON_BUILD_ARRAY(JSON_SET_JSON('{}', 'a', @json), '{"a":4}':>JSON)
-
A combination of all of the above.
IN()
-
Parameterized
IN
lists with:-
Literals of the same types
-
Single argument built-in expressions of the same shape and literal types in the list, for example,
IN(HEX('a'), HEX('b'))
.
-
-
Limitations
-
Only runtime constant expressions are supported for matching (i.
e. UDV, builtins, typecasts, PSQL variables are supported; UDFs and column fields are not supported). -
Using
OR
inside aJSON_
predicate (for example,MATCH_ ANY JSON_
disables index usage.MATCH_ ANY(MATCH_ PARAM_ JSON() = 12 OR MATCH_ PARAM_ JSON() = 1, a, 'a')) Instead, use either an IN
list or multipleJSON_
filters.MATCH_ ANY -
Only parameterized
IN
list is supported. -
Indexing multiple key paths for the same JSON field requires separate indexes, for example, to index both a::k1 and a::k2, you must create two indexes.
-
This index is supported only for columnstore tables.
-
Multi-value index for JSON does not support the following:
-
References to other unindexed fields on the right-hand side of an expression.
-
Non-deterministic built-in functions, for example
RAND()
. -
JSON_
predicate withMATCH_ ANY MATCH_
option.ELEMENTS -
!=
and other comparison operators (excluding=
and<=>
). -
NOT IN
lists.
-
Examples
Example 1- Searching users
by city
Consider the following table definition:
CREATE TABLE users (
id BIGINT PRIMARY KEY AUTO_INCREMENT,
created DATETIME(6) DEFAULT NOW(6),
data JSON
);
User information is stored in the data column as JSON, structured as follows:
** row-1
{
"id": 1,
"name": "Alice",
"addresses": [
{ "street": "123 Main St", "city": "New York", "zip": "10001" },
{ "street": "456 Oak Ave", "city": "Los Angeles", "zip": "90001" }
]
}
** row-2
{
"id": 2,
"name": "Bob",
"addresses": [
{ "street": "789 Pine Ln", "city": "Chicago", "zip": "60601" }
]
}
** row-3
{
"id": 3,
"name": "Charlie",
"addresses": [
{ "street": "101 Elm St", "city": "New York", "zip": "10002" },
{ "street": "222 Maple Dr", "city": "Houston", "zip": "77001" }
]
}
Adding a multi-value Hash Index to the path "addresses, city" can significantly improve lookup performance.
ALTER TABLE users ADD MULTI VALUE INDEX (data)
INDEX_OPTIONS='{"TOKENIZER":"MATCH ANY","PATH":["addresses", "city"]}';
Find all users
who live in Houston using the JSON_
built-in function.
SELECT id, data::name FROM usersWHERE JSON_MATCH_ANY(MATCH_PARAM_JSON()='"Houston"', data, 'addresses', 'city');
+----+------------+
| id | data::name |
+----+------------+
| 3 | "Charlie" |
+----+------------+
To verify that the multi-value hash index was used, run the EXPLAIN
query as:
EXPLAIN SELECT id, data::name FROM usersWHERE JSON_MATCH_ANY(MATCH_PARAM_JSON()='"Houston"', data, 'addresses', 'city');
+-----------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+-----------------------------------------------------------------------------------------------------------------------------+
| Gather partitions:all alias:remote_0 parallelism_level:segment |
| Project [users.id, JSON_EXTRACT_JSON(users.data,'name') AS `data::name`] |
| ColumnStoreFilter [JSON_MATCH_ANY(MATCH_PARAM_JSON() = '\"Houston\"',users.data,'addresses','city') multi-value hash index] |
| ColumnStoreScan db1.users, SORT KEY __UNORDERED () table_type:sharded_columnstore |
+-----------------------------------------------------------------------------------------------------------------------------+
Example 2 - Multiple indexes
In this example, two separate multi-value hash indexes are used to optimize common lookup patterns on a JSON column for the following two fields:
-
An integer field eventId
-
A string array tags
These indexes help speed up queries that filter by eventID and check for specific tags without scanning the entire table.
-
Create a table.
CREATE TABLE events (id BIGINT PRIMARY KEY,created DATETIME(6),payload JSON); -
Add multi-value hash indexes via
ALTER TABLE
.ALTER TABLE eventsADD MULTI VALUE INDEX idx_event_id (payload)INDEX_OPTIONS = '{"TOKENIZER": "MATCH ANY", "PATH": ["eventId"]}';ALTER TABLE eventsADD MULTI VALUE INDEX idx_tags (payload)INDEX_OPTIONS = '{"TOKENIZER": "MATCH ANY", "PATH": ["tags"]}'; -
Insert data into the table.
INSERT INTO events VALUES(1, NOW(6), JSON_BUILD_OBJECT('eventId', 101, 'tags', JSON_BUILD_ARRAY('sports', 'outdoor', 'fun'))),(2, NOW(6), JSON_BUILD_OBJECT('eventId', 202, 'tags', JSON_BUILD_ARRAY('music', 'concert'))),(3, NOW(6), JSON_BUILD_OBJECT('eventId', 101, 'tags', JSON_BUILD_ARRAY('sports', 'indoor'))); -
Query using both the indexed fields:
SELECT *FROM eventsWHERE JSON_MATCH_ANY(MATCH_PARAM_JSON() = 101:>JSON,payload, 'eventId')AND JSON_MATCH_ANY(MATCH_PARAM_JSON() = TO_JSON('sports'),payload, 'tags');+----+----------------------------+---------------------------------------------------+ |id | created | payload | +----+----------------------------+---------------------------------------------------+ | 1 | 2025-04-29 15:36:52.797432 | {"eventId":101,"tags":["sports","outdoor","fun"]} | | 3 | 2025-04-29 15:36:52.797432 | {"eventId":101,"tags":["sports","indoor"]} | +----+----------------------------+---------------------------------------------------+
-
Verify index usage from the query
EXPLAIN
:EXPLAIN SELECT *FROM eventsWHERE JSON_MATCH_ANY(MATCH_PARAM_JSON() = 101:>JSON,payload, 'eventId')AND JSON_MATCH_ANY(MATCH_PARAM_JSON() = TO_JSON('sports'),payload, 'tags');+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | EXPLAIN | +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Gather partitions:all alias:remote_0 parallelism_level:segment | | Project [events.id, events.created, events.payload] | | ColumnStoreFilter [JSON_MATCH_ANY(MATCH_PARAM_JSON() = (101:>JSON COLLATE utf8mb4_bin NULL),events.payload,'eventId') multi-value hash index AND JSON_MATCH_ANY(MATCH_PARAM_JSON() = TO_JSON('sports'),events.payload,'tags') multi-value hash index] | | ColumnStoreScan db1.events, SORT KEY __UNORDERED () table_type:sharded_columnstore | +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Last modified: June 13, 2025