# Using JSON

## About JSON in SingleStore

SingleStore exposes a Javascript Object Notation (JSON) (An open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays.) column type that implements the [JSON standard](http://http://www.ietf.org/rfc/rfc4627.txt).

You can define columns in SingleStore tables using the [JSON Type](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-types/json-type.md). Analytics on these JSON columns is very efficient as SingleStore automatically columnarizes JSON data. A schema is inferred from JSON keys and the data is split into columns by key path and stored in an encoded [Parquet-like format](https://docs.singlestore.com/db/v9.1/create-a-database/columnstore/columnstore-seekability-using-json/#section-idm413515222379322.md).

The JSON data is stored as if you had created a schema with separate columns for every field. As a result, queries on JSON columns read only the parts of a JSON object that are relevant to a query and therefore have excellent performance.

SingleStore provides a set of JSON functions for extracting, searching, analyzing, and modifying JSON data, including:

* [JSON\_EXTRACT\_\<type>](https://docs.singlestore.com/db/v9.1/reference/sql-reference/json-functions/json-extract-type.md), to extract values out of JSON documents at specified keypaths.
* Shorthand syntax, `::` operators, to extract values out of JSON documents. The `::` operators are convenient aliases for the `JSON_EXTRACT_<type>` functions and follow the same rules.

  * JSON shorthand syntax is a path that uses `::` as a separator.
* [JSON\_MATCH\_ANY](https://docs.singlestore.com/db/v9.1/reference/sql-reference/json-functions/json-match-any.md), to check for the existence of values in a JSON document or array based on a path and a filter.
* [JSON\_TO\_ARRAY](https://docs.singlestore.com/db/v9.1/reference/sql-reference/json-functions/json-to-array.md), to convert a JSON array to a SingleStore array.

  * `JSON_TO_ARRAY` can be used in combination with [TABLE](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/table.md) to operate on values of a JSON array as SQL rows. This functionality is similar to `UNNEST` in other database systems.

JSON columns can be searched using SingleStore's [Java Lucene](https://lucene.apache.org/core/)-compatible [full-text search](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/working-with-full-text-search.md) and indexed with the [Multi-Value Hash Index (JSON)](https://docs.singlestore.com/db/v9.1/create-a-database/multi-value-hash-index-json.md). In addition, JSON columns can be indexed using computed columns; refer to [Indexing Data in JSON Columns](https://docs.singlestore.com/#UUID-2ccafa6b-ead7-016d-adca-79fb8d0bdade.md) for more information.

SingleStore has a native BSON data type, plus [SingleStore Kai](https://docs.singlestore.com/db/v9.1/reference/#singlestore-kai.md), a MongoDB®-compatible API. The BSON data is stored in columns just like JSON data, and can be indexed with the [Multi-Value Hash Index (BSON)](https://docs.singlestore.com/db/v9.1/reference/sql-reference/bson-functions/multi-value-hash-index-bson.md) and computed columns.

An alternative to using the `JSON` type is to map JSON fields to individual columns and use SQL queries to access the JSON data. Refer to [Load JSON Files with LOAD DATA](https://docs.singlestore.com/db/v9.1/load-data/load-data-from-files/load-data-from-json-files/load-json-files-with-load-data.md) for information about how to map `JSON` fields to SingleStore columns during the loading process.

## JSON Data Encodings

The default encoding for `JSON` columns is `SeekableJson`. As described above, this encoding is very efficient as it columnarizes the JSON. A schema is inferred from JSON keys and the data is split into columns by keypath and stored in an encoded Parquet-like format. Refer to [Columnstore Seekability using JSON](https://docs.singlestore.com/db/v9.1/create-a-database/columnstore/columnstore-seekability-using-json.md) for more information.

`JSON` columns may also be encoded with the [string encodings](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-types/encoding-string-data-in-columnstore-tables.md) `SeekableLZ4` and `SeekableStringRunLength`. These encodings compress data, but do not support efficient searches or extraction. SingleStore recommends using `SeekableLZ4` and `SeekableStringRunLength` only for columns on which JSON values will be extracted as a whole.

**Note**: The information on this page is intended for use with `JSON` columns created with `SeekableJson` encoding or its precursor `JSONParquet`. Using functions described on this page on `JSON` columns with a string encoding will result in poor performance.

## Examples

The following table is used in the examples. Note the `printings` array is intended to indicate the number of copies of the book printed in each printing; `printings` data is in millions of books and is not accurate.

```sql
CREATE TABLE books_json (id INT, books JSON);

INSERT INTO books_json VALUES
(1, '{
       "title": "Onyx Storm",
       "author": "Rebecca Yarros",
       "details": {
         "publisher": "Entangled:Red Tower Books", 
         "numpages": 544,
         "publication date": "January 21, 2025",
         "printings": [2,1.3],
         "series": "The Empyrean" 
       }
     }'
),
(2, '{
       "title": "The Maid",
       "author": "Nita Prose",
       "details": {
         "publisher": "Ballantine Books",
         "numpages": 385,
         "publication date": "January 4, 2022",
         "printings": [0.5,0.75,1.2]
       }
     }'
),
(3, '{
       "title": "The Last Letter",
       "author": "Rebecca Yarros",
       "details": {
         "publisher": "Entangled:Amara",
         "numpages": 432,
         "publication date": "February 26, 2019",
         "printings": [0.25,0.5,0.5]
       }
     }'
);

```

## Example 1: Extract Values from JSON Using the `::` Operators

The `::`, `::$`, and `::%` operators can be used to extract fields, strings, and SQL doubles from JSON documents. In the example below:

* `books::title` extracts the `title` field,
* `books::$title` extracts the `title` field as a SQL string,
* `books::details::%numpages` extracts the `numpages` field as a double, and
* `` books::details::printings::%`0` `` extracts the 0th element of the `printings` array.

Backticks (\`) are required around numeric keys, as shown with `` `0` `` above

Refer to [Using the ::$ and ::% Operators](https://docs.singlestore.com/#UUID-36dc6fd4-161c-cbc2-4a87-556e9d33d4dd.md) for details.

```sql
SELECT id,
       books::title AS title,
       books::$title AS title_string,
       books::details::%numpages AS numpages,
       books::details::printings::%`0` AS first_printing
FROM books_json
ORDER BY id;


```

```output

+------+-------------------+-----------------+----------+----------------+
| id   | title             | title_string    | numpages | first_printing |
+------+-------------------+-----------------+----------+----------------+
|    1 | "Onyx Storm"      | Onyx Storm      |      544 |              2 |
|    2 | "The Maid"        | The Maid        |      385 |            0.5 |
|    3 | "The Last Letter" | The Last Letter |      432 |           0.25 |
+------+-------------------+-----------------+----------+----------------+

```

## Example 2: Extract Values from JSON using `JSON_EXTRACT_`

The [JSON\_EXTRACT\_](https://docs.singlestore.com/db/v9.1/reference/sql-reference/json-functions/json-extract-type.md) functions can be used to extract values from a JSON document in addition to the path expression syntax shown in Example 1. The `JSON_EXTRACT_` functions can be used when you want to use variables or expressions in the keypath. The keypath in `JSON_EXTRACT_` functions is a comma-separated list of object keys or zero-indexed array positions.

Below is a query similar to the query in Example 1, expressed using `JSON_EXTRACT_` functions.

```sql
SELECT id,
      JSON_EXTRACT_JSON(books,'title') AS title,
      JSON_EXTRACT_STRING(books,'title') AS title_string,
      JSON_EXTRACT_BIGINT(books,'details','numpages') AS numpages,
      JSON_EXTRACT_DOUBLE(books,'details','printings',1-1) AS first_printing
FROM books_json
ORDER BY id;

```

```output

+------+-------------------+-----------------+----------+----------------+
| id   | title             | title_string    | numpages | first_printing |
+------+-------------------+-----------------+----------+----------------+
|    1 | "Onyx Storm"      | Onyx Storm      |      544 |              2 |
|    2 | "The Maid"        | The Maid        |      385 |            0.5 |
|    3 | "The Last Letter" | The Last Letter |      432 |           0.25 |
+------+-------------------+-----------------+----------+----------------+

```

Note that the expression `1-1` is used to extract the value in position 0 in the `printings` array. Such expressions are supported in `JSON_EXTRACT_` functions, but not when using the `::` operators.

A `JSON_EXTRACT_` on a nested key will only scan the column for that key, not the entire JSON document. For example, `JSON_EXTRACT_JSON(jsondata,'details','title') AS title`, will only scan the column for the `title` key.

## Example 3: Find Existence of Values and Paths using JSON\_MATCH\_ANY

The [JSON\_MATCH\_ANY](https://docs.singlestore.com/db/v9.1/reference/sql-reference/json-functions/json-match-any.md) and [JSON\_MATCH\_ANY\_EXISTS](https://docs.singlestore.com/db/v9.1/reference/sql-reference/json-functions/json-match-any.md) functions can be used to find the existence of values matching predicates and paths in JSON documents.

The `JSON_MATCH_ANY` function returns true if a value exists in the JSON at the filter path for which the filter predicate evaluates to true. The `JSON_MATCH_ANY_EXISTS`  function returns true if there is a value (possibly null) in the JSON at the filter path.

The following is an example of using `JSON_MATCH_ANY` to find books that are part of `The Empyrean` series.

```sql
SELECT id, books::$title AS title, books::details::$series AS series 
FROM books_json
WHERE JSON_MATCH_ANY(books::?details.series, MATCH_PARAM_STRING_STRICT() = "The Empyrean");

```

```output

+------+------------+--------------+
| id   | title      | series       |
+------+------------+--------------+
|    1 | Onyx Storm | The Empyrean |
+------+------------+--------------+

```

## Example 4: Aggregate Elements in a JSON Array Value using REDUCE

The [REDUCE](https://docs.singlestore.com/db/v9.1/reference/sql-reference/json-functions/reduce.md) function can be used to aggregate elements in a JSON array value. SingleStore recommends using the `REDUCE` function when aggregating elements within a JSON array in a row.

Refer to [Example 5](https://docs.singlestore.com/#section-idm234948461588865.md) for an example of aggregating JSON array values across multiple rows which takes advantage of [JSON Array Performance Enhancements](https://docs.singlestore.com/#section-idm323419580718356.md).

The `REDUCE` function applies an expression to each element of an array and returns a single value. In the following example, `REDUCE` is used to sum the elements of the printings array for each row. The result is the total number of copies of each book that have been printed.

```sql
SELECT books::$title,
  REDUCE(
     0 :> double,
     JSON_TO_ARRAY(books_json.books::details::printings),
     REDUCE_ACC() + REDUCE_VALUE()
   ) AS total_printed_by_book
FROM books_json;

```

```output

+-----------------+-----------------------+
| books::$title   | total_printed_by_book |
+-----------------+-----------------------+
| The Maid        |                  2.45 |
| The Last Letter |                  1.25 |
| Onyx Storm      |                   3.3 |
+-----------------+-----------------------+

```

In this example:

1. The `JSON_TO_ARRAY` function converts the array in the JSON field `jsondata::books::details::printings` to a SQL array.

2. The `REDUCE` function takes as input

   1. An initial value: `0:>double`.

   2. A SQL array: the result of `JSON_TO_ARRAY`.

   3. An accumulator expression: `(REDUCE_ACC() + REDUCE_VALUE())`, which specifies that the array elements should be summed.

## Example 5: Use the `::` Operators in UPDATE Statements

You can use the `::` operators for JSON keypaths in [UPDATE](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/update.md) queries. The `::` operators are simply convenient aliases for the [JSON\_SET\_\<type>](https://docs.singlestore.com/db/v9.1/reference/sql-reference/json-functions/json-set-type.md) functions and follow the same rules.

The following query updates the number of pages in the book, `Onyx Storm`.

```sql
UPDATE books_json 
SET books::details::%numpages = 545
WHERE books::$title = "Onyx Storm";

```

To add a printing for the book `Onyx Storm`, use [JSON\_ARRAY\_PUSH\_\<type>](https://docs.singlestore.com/db/v9.1/reference/sql-reference/json-functions/json-array-push-type.md) as follows.

```sql
UPDATE books_json 
SET books::details::printings = JSON_ARRAY_PUSH_DOUBLE(books::details::printings, 1.5)
WHERE books::$title = "Onyx Storm";
```

To add a series to the book `The Maid`.

```sql
UPDATE books_json 
SET books::details::$series = 'Molly the Maid'
WHERE books::$title = "The Maid";

```

View the resulting updates.

```sql
SELECT JSON_PRETTY(books) 
FROM books_json
WHERE books::$title = "The Maid";

```

```output

+---------------------------------------------+
| {
  "author": "Nita Prose",
  "details": {
    "numpages": 385,
    "printings": [
      0.5,
      0.75,
      1.2
    ],
    "publication date": "January 4, 2022",
    "publisher": "Ballantine Books",
    "series": "Molly the Maid"
  },
  "title": "The Maid"
}                       |
+---------------------------------------------+

```

## Example 6: Convert an Array (List) of JSON Objects to a Table

Use the `JSON_TO_ARRAY` function and the `TABLE` built-in function to convert a list of JSON objects to a table.

Create a table with a column to hold a JSON value and insert data into that table.

```sql
CREATE TABLE json_list_example (json_list JSON);

INSERT INTO json_list_example values( 
'[
      {
       "title": "Onyx Storm",
       "author": "Rebecca Yarros",
 	 "numpages": 544
}, 
     {
      "title": "The Last Letter",
      "author": "Rebecca Yarros",
      "numpages": 432
     }
]');
```

In the following query, the syntax `JOIN TABLE(JSON_TO_ARRAY(json_list))` converts the objects in the `json_list` column to a table, that is, each object in the JSON list is turned into a row in the table.

* The `JSON_TO_ARRAY` function converts the JSON array to a SingleStore `ARRAY`.
* The `TABLE` function converts a SingleStore `ARRAY` to a column named `table_col` that contains one row for each array entry.
* The `JOIN` clause is required when using the `TABLE` function on an existing SingleStore table.

```sql
SELECT json_list_as_table.table_col AS books_col
FROM json_list_example JOIN TABLE(JSON_TO_ARRAY(json_list)) AS json_list_as_table;

```

```output

+----------------------------------------------------------------------+
| books_col                                                            |
+----------------------------------------------------------------------+
| {"author":"Rebecca Yarros","numpages":544,"title":"Onyx Storm"}      |
| {"author":"Rebecca Yarros","numpages":432,"title":"The Last Letter"} |
+----------------------------------------------------------------------+
```

The `JSON_AGG` function can be used to combine the rows in the table into a single row and re-create the original JSON array.

```sql
WITH books_table AS (
	SELECT json_list_as_table.table_col AS books_col
            FROM json_list_example JOIN TABLE(JSON_TO_ARRAY(json_list)) AS json_list_as_table
)
SELECT JSON_AGG(books_col) FROM books_table;

```

```output

+----------------------------------------------------------------------------------------------------------------------------------------+
| JSON_AGG(books_col)                                                                                                                    |
+----------------------------------------------------------------------------------------------------------------------------------------+
| [{"author":"Rebecca Yarros","numpages":544,"title":"Onyx Storm"},{"author":"Rebecca Yarros","numpages":432,"title":"The Last Letter"}] |
+----------------------------------------------------------------------------------------------------------------------------------------+
```

## Example 7: Aggregate Elements in JSON Arrays Across Rows using TABLE and JSON\_TO\_ARRAY

Use the [TABLE](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/table.md) and [JSON\_TO\_ARRAY](https://docs.singlestore.com/db/v9.1/reference/sql-reference/json-functions/json-to-array.md) functions to aggregate values from JSON arrays across rows. Aggregations over results of the `JOIN TABLE(JSON_TO_ARRAY())` operation are optimized to reduce the amount of data read. Queries will retrieve only the parts of the JSON column relevant to the query instead of fetching the entire JSON object from disk. Refer to [JSON Array Performance](https://docs.singlestore.com/#section-idm323419580718356.md) for details.

The following example uses the `TABLE` and `JSON_TO_ARRAY` functions to sum the elements in the printings array across all rows for each author.

```sql
SELECT books::$author AS author, SUM(JSON_EXTRACT_DOUBLE(i.table_col)) AS total_printed_by_author
FROM books_json 
JOIN TABLE(
JSON_TO_ARRAY(books_json.books::details::printings)) AS i
GROUP BY author; 

```

```output

+----------------+-------------------------+
| author         | total_printed_by_author |
+----------------+-------------------------+
| Rebecca Yarros |                    4.55 |
| Nita Prose     |                    2.45 |
+----------------+-------------------------+

```

In this example:

1. The `JSON_TO_ARRAY` function converts the array in the JSON field `books_json.books::details::printings` to a SQL array.

2. The `TABLE` function converts this array into a table.

3. The `JOIN` clause `FROM books_json JOIN TABLE(JSON_TO_ARRAY(books_json.books::details::printings)) AS i` creates a table that contains:

   1. For each row in the `books_json` table:

      1. One row for each element in the `printings` array in that row.

      2. This row contains the values from the columns in the `books_json` table and a column, `table_col`, which contains the array value.

4. Finally, the `SELECT` and `GROUP BY` aggregate over the table created in step 3.

In this example, which aggregates over multiple JSON documents, SingleStore only reads the `author` and `printings` values to compute the result. Additional JSON values, such as `title`, `publisher`, and `numpages`, will not be read making the query execution very efficient.

**Note**: The `JSON_EXTRACT_DOUBLE()` function in the `SELECT` clause is necessary to use the `JSON_TO_ARRAY` optimization. If that function is removed, the query will execute, but will not use the optimization.

## Example 8: Optimized Queries on JSON Integer and Boolean Values

When a JSON column is stored using SeekableJSON encoding (the default for JSON columns in columnstore tables), queries that use `JSON_EXTRACT_BIGINT(...)` on integer or boolean key paths operate directly on encoded columnstore values.

As a result, filters, `GROUP BY`, and aggregate functions on those key paths use the same encoded-data optimizations as standard columnstore columns. This approach can significantly improve performance.

This optimization applies automatically when all of the following conditions are met:

* The JSON column uses SeekableJSON encoding.
* The key path contains at most one value (or `NULL`) per row.
* The query accesses the key path using `JSON_EXTRACT_BIGINT(...)`.
* The query does not select the full JSON document from the same column.

This example shows how SingleStore optimizes queries on JSON integer and boolean values stored in columnstore tables. When a JSON column uses the default SeekableJSON encoding, queries that use integer or boolean key paths with `JSON_EXTRACT_BIGINT(...)` can operate directly on encoded values. As a result, filters, `GROUP BY`, and aggregate functions run without decoding the full JSON document.

The following setup creates a columnstore table with JSON data:

```sql
-- Create a columnstore table with JSON data

CREATE TABLE events (
    event_id BIGINT,
    event_data JSON
);

-- Insert sample data

INSERT INTO events VALUES
(1, '{"user_id": 1001, "is_premium": true, "score": 85, "tags": ["new", "active"]}'),
(2, '{"user_id": 1002, "is_premium": false, "score": 92, "tags": ["active"]}'),
(3, '{"user_id": 1003, "is_premium": true, "score": 78, "tags": ["new"]}');

```

The following query filters and aggregates using encoded operations:

```sql
SELECT
    JSON_EXTRACT_BIGINT(event_data, 'is_premium') AS is_premium,
    AVG(JSON_EXTRACT_BIGINT(event_data, 'score')) AS avg_score,
    COUNT(*) AS event_count
FROM events
WHERE JSON_EXTRACT_BIGINT(event_data, 'score') > 80
GROUP BY is_premium;

```

```output

+------------+-----------+-------------+
| is_premium | avg_score | event_count |
+------------+-----------+-------------+
|          0 |      92.0 |           1 |
|          1 |      85.0 |           1 |
+------------+-----------+-------------+

```

In this example:

* The `WHERE` predicate on score operates on encoded integer values.
* `GROUP BY is_premium` and `AVG(score)` also operate on encoded data.
* The engine decodes only rows that satisfy the filter, not the entire JSON document.

The following query demonstrates boolean filtering. Boolean values return as integers (`true = 1`, `false = 0`) and participate in encoded operations:

```sql
SELECT
    JSON_EXTRACT_BIGINT(event_data, 'user_id') AS user_id
FROM events
WHERE JSON_EXTRACT_BIGINT(event_data, 'is_premium') = 1; 

```

```output

+---------+
| user_id |
+---------+
|    1001 |
|    1003 |
+---------+

```

## Managing Collections of Metadata

JSON is useful for managing a collection of diverse data, represented as name-value pairs, that might otherwise be cumbersome to refactor into a formalized key-value table, or that might be stored in a table that is sparsely populated. For example, suppose an organization had an asset management application using SingleStore to track all the information about its physical assets. The asset data is diverse—what’s relevant for a desk differs from what’s important for a server machine or a company car. All assets might have common attributes, such as asset tag ID, asset type, asset name, and description. Each type of asset might have unique attributes, such as size and weight dimensions, hostname and IP address, or gas mileage.

Instead of creating a highly granular table to manage all data as key-value pairs, this organization could simply create a SingleStore table using a JSON column to efficiently manage the unique attribute data. With this design:

* Each asset gets a row in the table.
* Attributes that are common to all assets have their own column in the table. These columns allow you to query on common features and quickly narrow down the final result set as much as possible (for example, filtering by asset type).
* The various, remaining attributes associated with each asset are stored in a JSON column (which might be named something like `property_bag`). For example, the JSON column for an office desk asset could include JSON data such as size, weight, and number of drawers. The JSON data for a server machine could include rack location, number of cores, and MAC address.

## DDL: Defining Tables with JSON Columns

Any SingleStore table can contain one or more columns of data type `JSON`. A JSON column can optionally be suffixed with `NOT NULL`.

## Comparing JSON and LONGTEXT Columns

A JSON column is analogous to a `LONGTEXT` column in the following ways:

* JSON columns can store arbitrarily large JSON values in a normalized text representation.
* JSON columns have the same storage requirement, as if the JSON value were stored in a text column.

The primary difference is that JSON data is stored in a normalized format, which makes many operations faster than if the data were stored manually in a text column. The following is an example of non-normalized data, which is valid JSON but is relatively difficult to parse:

```json
'{ "b\u000a": 1,"a": 2 ,"a":3 } '

```

Normalized data, on the other hand, is easier to parse because duplicate keys are merged, the data is sorted by keys, and extraneous whitespace is removed.

```json
'{"a":3,"b\n":1}'

```

SingleStore recommends storing JSON data in `JSON` columns and not in `LONGTEXT` columns. `JSON` columns validate the JSON values and provide [Unicode Support](https://docs.singlestore.com/#UUID-e72c5528-d96b-6095-ab75-bf514f7f247a.md). If storage space and memory use is a concern, and search and extraction are not required on the column, use a string encoding on the column. Refer to [Columnstore Seekability using JSON](https://docs.singlestore.com/db/v9.1/create-a-database/columnstore/columnstore-seekability-using-json.md) for more information.

## Defining JSON Columns

Defining a JSON column in a SingleStore table is as simple as specifying the JSON data type in the `CREATE TABLE` command:

```sql
CREATE TABLE assets (
   tag_id BIGINT PRIMARY KEY,
   name TEXT NOT NULL,
   description TEXT,
   properties JSON NOT NULL);

```

## JSON Collation

The collation for a JSON column can be set explicitly using a `COLLATE` clause. When the collation is not set explicitly, the collation is controlled by the `json_collation` engine variable.

Refer to [Sync Variables Lists](https://docs.singlestore.com/db/v9.1/reference/configuration-reference/engine-variables/list-of-engine-variables/#sync-variables-lists.md), [Character Set and Collation Override](https://docs.singlestore.com/db/v9.1/reference/sql-reference/character-encoding/character-set-and-collation-override.md), [Unicode Support](https://docs.singlestore.com/#UUID-e72c5528-d96b-6095-ab75-bf514f7f247a.md), and [Character Encodings and Collation](https://docs.singlestore.com/#UUID-00e2ca08-8dc7-bc24-f268-b6bfde552e6a.md) for more information.

## Indexing Data in JSON Columns

JSON columns are not indexed directly - they are indexed using computed columns. For the fastest performance, you should not use JSON built-ins or `::` notation in your filters. Instead, create a computed column that includes the JSON column in the computation, and then use the computed column for the index. In this way, the index gets updated only when the relevant JSON data is updated in a row.

```sql
CREATE TABLE assets (
   tag_id BIGINT PRIMARY KEY,
   name TEXT NOT NULL,
   description TEXT,
   properties JSON NOT NULL,
   weight AS properties::%weight PERSISTED DOUBLE,
   license_plate AS properties::$license_plate PERSISTED LONGTEXT,
   KEY(license_plate), KEY(weight));

```

JSON computed columns that are indexed will be utilized by the optimizer more efficiently. Queries that use indexed computed columns as filters or sorts, will perform faster by avoiding expression evaluation, and by seeking into or searching the indexes rather than scanning tables. In the following examples, we will use the `::` notation in the filtering and sorting clauses to illustrate how the optimizer matches the computed columns.

```sql
EXPLAIN SELECT * FROM assets WHERE properties::$license_plate = "VGB116";

```

```output

+------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                          |
+------------------------------------------------------------------------------------------------------------------+
| Gather partitions:all alias:remote_0                                                                             |
| Project [assets.tag_id, assets.name, assets.description, assets.properties, assets.weight, assets.license_plate] |
| ColumnStoreFilter [assets.license_plate = 'VGB116' index]                                                       |
| ColumnStoreScan test1.assets, KEY __UNORDERED () USING CLUSTERED COLUMNSTORE table_type:sharded_columnstore      |
+------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.00 sec)

```

```sql
EXPLAIN SELECT * FROM assets ORDER BY properties::%weight;

```

```output

+------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                          |
+------------------------------------------------------------------------------------------------------------------+
| GatherMerge [remote_0.weight] partitions:all alias:remote_0                                                      |
| Project [assets.tag_id, assets.name, assets.description, assets.properties, assets.weight, assets.license_plate] |
| Sort [assets.weight]                                                                                             |
| ColumnStoreScan test1.assets, KEY __UNORDERED () USING CLUSTERED COLUMNSTORE table_type:sharded_columnstore      |
+------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.00 sec)
```

## DML: Accessing Data in JSON Columns

This section describes how to insert and update data in a SingleStore table with one or more JSON columns.

## Inserting Data into a JSON Column

When inserting a row in a table, specifying JSON data is straightforward. For example: Given a table `t` defined as `CREATE TABLE test_table(col_a TEXT, col_b JSON);`, you can insert a row into `test_table` as follows:

```sql
INSERT INTO test_table(col_a,col_b) VALUES ('hello','{"x":"goodbye","y":"goodnight"}');

```

## Columnstore Tables Having JSON Columns with Null Values or Empty Arrays

By default, SingleStore preserves columnstore JSON `NULL` values and empty arrays. To change this behavior and disable this setting, set the `preserve_original_colstore_json` global variable to `OFF`. This flag preserves the original columnstore JSON for any new data that is loaded; it does not update the existing data. By default, this variable is set to `AUTO` (same as `ON`).

## Behavior When `preserve_original_colstore_json` is Set to `OFF`

When you store a JSON column in a columnstore table (and `preserve_original_colstore_json` is set to `OFF`), then null values and empty arrays in the JSON object are handled as follows when the object is written to the columnstore:

* Name/value pairs with the value `NULL` are normally removed from the JSON object.
* Name/value pairs containing empty arrays are normally removed from the JSON object.
* If the JSON object has only the value `NULL` or `[]`, the value is replaced with `NULL`.

Example commands you can use to store data are `INSERT`, `UPDATE`, and `LOAD DATA`.

An example `INSERT` scenario follows. Consider a table that is defined as:

```sql
CREATE TABLE json_empty_values_table(a INT, b JSON, SORT KEY (a));

```

Insert five rows into the table:

```sql
INSERT INTO json_empty_values_table VALUES (1, '{"v":null}');
INSERT INTO json_empty_values_table VALUES (2, '{"w":[]}');
INSERT INTO json_empty_values_table VALUES (3, '{"x":"foo","y":null,"z":[]}');
INSERT INTO json_empty_values_table VALUES (4, 'null');
INSERT INTO json_empty_values_table VALUES (5, '[]');

```

Manually flush the inserted data to the columnstore:

```sql
OPTIMIZE TABLE json_empty_values_table FLUSH;

```

Query the table:

```sql
SELECT * FROM json_empty_values_table ORDER BY a;

```

```output

+------+-----------------------------+
| a    | b                           |
+------+-----------------------------+
|    1 | {"v":null}                  |
|    2 | {"w":[]}                    |
|    3 | {"x":"foo","y":null,"z":[]} |
|    4 | null                        |
|    5 | []                          |
+------+-----------------------------+

```

## Accessing Fields in a JSON Object

To access a field of a JSON object stored in a column, use the name of the column suffixed with `::keyName`. For example, if column data contains `{"bits":[true,false]}`, then the expression `data::bits` evaluates to the JSON value `[true,false]`.

## Converting a JSON String Value into a SQL String

The `::` operator, when applied to a string column of a JSON object, returns the value of the string enclosed in quotes. Use the `::$keyname` operator to return the value of the string without the enclosing quotes.

For example, consider a table `TestJSON `that contains a JSON column `data`.

```sql
INSERT INTO TestJSON VALUES ('{"first":"hello"}');
```

Retrieve the value of the `data `column using `::` and `::$` operators:

```sql
SELECT data::first, data::$first FROM TestJSON;

```

```output

+-------------+--------------+
| data::first | data::$first |
+-------------+--------------+
| "hello"     | hello        |
+-------------+--------------+
```

In this example, the `data `JSON column contains two string values.

```sql
INSERT INTO TestJSON VALUES ('{"first":"hello", "second":"world"}');
```

To retrieve the concatenated SQL string, use the `::$` operator:

```sql
SELECT CONCAT(data::$first, ' ', data::$second) FROM TestJSON;

```

```output

+------------------------------------------+
| CONCAT(data::$first, ' ', data::$second) |
+------------------------------------------+
| hello world                              |
+------------------------------------------+
```

## Converting a JSON Number or Boolean Value into a SQL DOUBLE

To transparently convert a JSON number or Boolean value into a SQL `DOUBLE`, use the name of the column suffixed with `::%keyname`. For example, if column data contains `{"valid":true,"value":3.14}`, then `data::%valid` is 1 and `data::%value` is 3.14.

## Using the ::$ and ::% Operators

If either the `::$` or `::%` access operator is used on a JSON field that is not of the appropriate type, the JSON result is quietly converted to a string or a double based on the rules defined in [JSON\_EXTRACT\_\<type>](https://docs.singlestore.com/db/v9.1/reference/sql-reference/json-functions/json-extract-type.md).

The `::` access operators are simply convenient aliases for the `JSON_EXTRACT_<type>` built-in functions, and they follow all the same rules. However, the `::` operators do not work on the outputs of User Defined Functions (UDFs), such as `SELECT udf_name(1)::key`, and expressions. The `JSON_EXTRACT_<type>` functions must be used instead.

In addition, The `::` operators  require that numeric keys be specified with backticks (\`).

The following `SELECT` statement returns the 2nd (zero-indexed) array element of the array `{"a":[1,2,3,4]}`. A syntax error will be returned if the backticks (\`) are not included.

```sql
SELECT json, json::a::`2` 
FROM (SELECT '{"a":[1,2,3,4]}' AS json) sub;

```

```output

+-----------------+--------------+
| json            | json::a::`2` |
+-----------------+--------------+
| {"a":[1,2,3,4]} | 3            |
+-----------------+--------------+

```

In addition, backticks can be used for non-numeric key names and array indexes, which is useful if the key name or index contains a space.

The following query is valid and will return the same output as the query above.

```sql
SELECT json, json::`a`::`2` 
FROM (SELECT '{"a":[1,2,3,4]}' AS json) sub;

```

```output

+-----------------+----------------+
| json            | json::`a`::`2` |
+-----------------+----------------+
| {"a":[1,2,3,4]} | 3              |
+-----------------+----------------+

```

The following is the same logical query except the array index 2 has been replaced by the expression `1+1` and the `JSON_EXTRACT_JSON` function is used.

While this example uses a simple expression `(1+1)`, more functions such as `DAYOFWEEK` and more complex expressions can be used.

Refer to [JSON\_EXTRACT\_\<type>](https://docs.singlestore.com/db/v9.1/reference/sql-reference/json-functions/json-extract-type.md) for details.

```sql
SELECT json, JSON_EXTRACT_JSON(json, "a", 1+1) 
FROM (SELECT '{"a":[1,2,3,4]}' AS json) sub;

```

```output

+-----------------+--------------+
| json            | json::a::`2` |
+-----------------+--------------+
| {"a":[1,2,3,4]} | 3            |
+-----------------+--------------+
```

## Accessing Nested JSON Objects

To access nested JSON objects, chain the colon-colon operator to form a `keypath`. For example, `data::address::street` means the `street` field of the `address` field of the `data` column.

> **📝 Note**: If one of the keys in the keypath is not present in the nested object, then the `entire` colon-colon expression yields SQL `NULL`.

## Working with Nested Arrays in a JSON Column

Consider a table defined as:

```sql
CREATE TABLE json_tab (`id` INT(11) DEFAULT NULL,`jsondata` JSON COLLATE utf8_bin);
```

Insert values as follows:

```sql
INSERT INTO json_tab VALUES 
( 8765 ,' {"city":"SFO","sports_teams":[{"sport_name":"football","teams":  [{"club_name":"Raiders"},{"club_name":"49ers"}]},
{"sport_name":"baseball","teams" : [{"club_name":"As"},{"club_name":"SF Giants"}]}]}') ; 

INSERT INTO json_tab VALUES 
( 9876,'{"city":"NY","sports_teams" : [{ "sport_name":"football","teams" : [{ "club_name":"Jets"},{"club_name":"Giants"}]},
{"sport_name":"baseball","teams" : [ {"club_name":"Mets"},{"club_name":"Yankees"}]},
{"sport_name":"basketball","teams" : [{"club_name":"Nets"},{"club_name":"Knicks"}]}]}');
```

Query the table:

```sql
WITH t AS(
SELECT id, jsondata::city city , table_col AS sports_clubs FROM json_tab JOIN TABLE(JSON_TO_ARRAY(jsondata::sports_teams))),

t1 AS(
SELECT t.id, t.city, t.sports_clubs::sport_name sport, table_col AS clubs FROM t JOIN TABLE(JSON_TO_ARRAY(t.sports_clubs::teams)))

SELECT t1.id, t1.city,t1.sport,t1.clubs::club_name club_name FROM t1;

```

```output

+------+-------+--------------+-------------+
| id   | city  | sport        | club_name   |
+------+-------+--------------+-------------+
| 9876 | "NY"  | "football"   | "Jets"      |
| 9876 | "NY"  | "football"   | "Giants"    |
| 9876 | "NY"  | "baseball"   | "Mets"      |
| 9876 | "NY"  | "baseball"   | "Yankees"   |
| 9876 | "NY"  | "basketball" | "Nets"      |
| 9876 | "NY"  | "basketball" | "Knicks"    |
| 8765 | "SFO" | "football"   | "Raiders"   |
| 8765 | "SFO" | "football"   | "49ers"     |
| 8765 | "SFO" | "baseball"   | "As"        |
| 8765 | "SFO" | "baseball"   | "SF Giants" |
+------+-------+--------------+-------------+
```

You can also further filter the results by applying conditions. For example, to find the city with the "Yankees" club, run the following query:

```sql
WITH t AS
(SELECT id, jsondata::city city , table_col AS sports_clubs FROM json_tab JOIN TABLE(JSON_TO_ARRAY(jsondata::sports_teams))),

t1 AS
(SELECT t.id, t.city, t.sports_clubs::sport_name sport, table_col AS clubs FROM t JOIN TABLE(JSON_TO_ARRAY(t.sports_clubs::teams))) 

SELECT t1.id, t1.city,t1.sport,t1.clubs::club_name club_name FROM t1 WHERE t1.clubs::$club_name = 'Yankees';

```

```output

+------+------+------------+-----------+
| id   | city | sport      | club_name |
+------+------+------------+-----------+
| 9876 | "NY" | "baseball" | "Yankees" |
+------+------+------------+-----------+
```

## Nested JSON Ingest

Working with a nested JSON ingest requires an additional escape as the value being inserted is interpreted as a SQL string first. Therefore, the backslash (\\) symbol needs an escape in addition to the JSON escape (‘\’):

```sql
CREATE TABLE test_json(col_a json);

INSERT INTO test_json VALUES ('{"addParams": "{\\"Emp_Id\\":\\"1487\\", 
  \\"Emp_LastName\\":\\"Stephens\\",\\"Emp_FirstName\\":\\"Mark\\",\\"Dept\\":\\"Support\\"}"}');

```

```sql
SELECT * FROM test_json;

```

```output

+------------------------------------------------------------------------------------+
| col_a                                                                              |
+------------------------------------------------------------------------------------+
| {"addParams":"{\"Emp_Id\":\"1487\",                                                |
|   \"Emp_LastName\":\"Stephens\",\"Emp_FirstName\":\"Mark\",\"Dept\":\"Support\"}"} |
+------------------------------------------------------------------------------------+  
```

## Using Colon-Colon Notation in UPDATE Queries

You can use the colon-colon notation for JSON keypaths in [UPDATE](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/update.md) queries. For example, the following two `UPDATE` queries perform the same operation:

```sql
UPDATE users SET userdata::name::$first = 'Alex';
UPDATE users SET userdata = JSON_SET_STRING(userdata, 'name', 'first', 'Alex');

```

In fact, these access operators are simply convenient aliases for the `JSON_SET_<type>` built-in function (see [JSON\_SET\_\<type>](https://docs.singlestore.com/db/v9.1/reference/sql-reference/json-functions/json-set-type.md)), and they follow all the same rules.

## Field Name Syntax for JSON Access and UPDATE Queries

In both the JSON access and update contexts, each field name must either:

* follow the syntax for a valid SQL identifier, or
* be escaped with backticks in the same manner as a SQL identifier

For example, the following two [SELECT](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select.md) queries perform the same operation:

```sql
SELECT ticker_symbol FROM stocks WHERE statistics::%`P/E` > 1.5;
SELECT ticker_symbol FROM stocks WHERE JSON_EXTRACT_DOUBLE(statistics, 'P/E') > 1.5;

```

When using the `JSON_EXTRACT_<type>` form of the query (see [JSON\_EXTRACT\_\<type>](https://docs.singlestore.com/db/v9.1/reference/sql-reference/json-functions/json-extract-type.md) ), there is no constraint on the contents of the keystring. A JSON object can contain any string as a key, even `""`, or a string with `"embedded\u0000nulls"`. For more information on extracting JSON data, see [JSON LOAD DATA](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/load-data/#UUID-390757f0-40b4-4d47-acec-d7c9a4c84ceb.md).

## JSON Performance

The following example will demonstrate how to improve JSON seekability by utilizing sub-segments.

Begin by turning off or disabling the variable:

```sql
SET GLOBAL use_seekable_json = OFF;
```

Create a database and change to the newly created database..

```sql
CREATE DATABASE orders;

USE orders;
```

Create a table with id as an integer and set as the primary key

```sql
CREATE TABLE orders(id INT PRIMARY KEY not null, order_doc json, sort KEY(id));
```

Create a stored procedure to generate documents until it has reached a specific number nearest to the power of two > = n.

```sql
DELIMITER //

CREATE or REPLACE PROCEDURE gen_orders(n INT) AS
DECLARE
  i INT;
  c INT;
  m INT;
BEGIN
  DELETE FROM orders;
  INSERT orders VALUES (1,
     '{ "id" : 1,
        "time" : "2022-11-17 23:03:54",
        "lineitems" : [{ "line_num" : 1,
                         "SKU" : 688968,
                         "description" : "Apple iPhone 14 Plus",
                         "qty" : 1},
                        { "line_num" : 2,
                         "SKU" : 6514052,
                         "description" : "Energizer - MAX AA Batteries (8 Pack), Double A Alkaline Batteries",
                         "qty" : 3},
                        { "line_num" : 3,
                         "SKU" : 6457697,
                         "description" : "Star Wars - Galactic Snackin Grogu",
                         "qty" : 1}] }'
        );

  i = 1;
  WHILE i < n LOOP
    SELECT MAX(id) INTO m FROM orders;
    INSERT INTO orders
    SELECT id + m, new_order_doc(order_doc, id + m)
    FROM orders;

    SELECT COUNT(*) INTO c FROM orders;
    i += c;
 END LOOP;
END
//
DELIMITER ;

```

Create a function that will change the input order\_doc into a new one with random data, and a new id.

```sql
DELIMITER //
CREATE or REPLACE FUNCTION new_order_doc(order_doc json, new_id INT) RETURNS json
AS
DECLARE
  j json;
  lineitems json;
  l1 json;
  l2 json;
  l3 json;
BEGIN
  j = json_set_double(order_doc, "id", new_id);
  j = json_set_string(j, "time", now());
  lineitems = json_extract_json(j, "lineitems");
  l1 = json_extract_json(lineitems, 0);
  l2 = json_extract_json(lineitems, 1);
  l3 = json_extract_json(lineitems, 2);
  lineitems = json_array_push_json('[]', mutate_lineitem(l1));
  lineitems = json_array_push_json(lineitems, mutate_lineitem(l2));
  lineitems = json_array_push_json(lineitems, mutate_lineitem(l3));
  j = json_set_json(j, "lineitems", lineitems);
  RETURN j;
END
//
DELIMITER ;

```

Create a function to use the mutate\_lineitem.

```sql
DELIMITER //
CREATE OR REPLACE FUNCTION mutate_lineitem(item json) RETURNS json
AS
DECLARE
  j json;
BEGIN
  j = json_set_double(item, "SKU", floor(rand()*10*1000*1000));
  j = json_set_string(j, "description", uuid());
  j = json_set_double(j, "qty", ceil(3*rand()));
  RETURN j;
END
//
DELIMITER ;
```

Generate eight million rows of test data.

```sql
CALL gen_orders(8*1024*1024);
```

Create a stored procedure that will measure the average lookup time.

```sql
DELIMITER //

CREATE OR REPLACE PROCEDURE get_avg_time(n INT) AS
DECLARE
  _id INT;
  _od json;
  _oid INT;
  m INT;
  st DATETIME(6);
  et DATETIME(6);
BEGIN
  SELECT MAX(id) INTO m FROM orders;
  st = now(6);
  FOR i in 1..n LOOP
    _oid = CEILING(m*RAND());

    SELECT id, order_doc
    INTO _id, _od
    FROM orders
    WHERE id = _oid;
  END LOOP;
  et = now(6);

  echo SELECT (timestampdiff(MICROSECOND, st, et)/1000000.0)/n AS avg_time;
END
//
DELIMITER ;
```

Before obtaining the average query time make sure the table has been optimized and all the data is in columnstore format.

```sql
OPTIMIZE TABLE orders FULL;
```

Obtain average query time.

```sql
CALL get_avg_time(100);


```

```output

+------------+
| avg_time   |
+------------+
| 1.73598285 |
+------------+
1 row in set (2 min 53.61 sec)
```

To measure the new query time, enable to variable.

```sql
SET GLOBAL use_seekable_json = ON
```

Create a second table using the same format as the first table.

```sql
CREATE TABLE orders2 LIKE orders;
```

Use the same data from the first table and make sure to optimize the new table.

```sql
INSERT INTO orders2 SELECT * FROM orders;
```

```sql
OPTIMIZE TABLE orders2 FULL;
```

Update the stored procedure `get_avg_time` by changing *orders* to *orders2* in both places.

```sql
DELIMITER //

CREATE OR REPLACE PROCEDURE get_avg_time(n INT) AS
DECLARE
  _id INT;
  _od json;
  _oid INT;
  m INT;
  st DATETIME(6);
  et DATETIME(6);
BEGIN
  SELECT MAX(id) INTO m FROM orders2;
  st = now(6);
  FOR i in 1..n LOOP
    _oid = CEILING(m*RAND());

    SELECT id, order_doc
    INTO _id, _od
    FROM orders2
    WHERE id = _oid;
  END LOOP;
  et = now(6);

  echo SELECT (timestampdiff(MICROSECOND, st, et)/1000000.0)/n AS avg_time;
END
//
DELIMITER ;
```

Run the `get_avg_time` stored procedure again with the changed table name.

```sql
CALL get_avg_time(100);


```

```output

+------------+
| avg_time   |
+------------+
| 0.03216373 |
+------------+
1 row in set (3.47 sec)

```

## JSON Array Performance

Aggregations over array objects using the `JOIN TABLE(JSON_TO_ARRAY())` operation are optimized in SingleStore to run efficiently on large batches of rows in parallel. The optimization also reduces the amount of data read from the JSON column by retrieving only the necessary data instead of fetching the entire JSON object.

The `JSON_TO_ARRAY` optimization requires the column created by the `TABLE` function (`table_col`) be wrapped in a `JSON_EXTRACT_<type>()` function. A `JSON_EXTRACT_<type>()` function can be used to directly extract a value as shown in Example 7 in the Introduction, or the shorthand (`::`) syntax can be used when there are keys to extract, as is shown in the following examples.

The following example uses the [TPC-H](https://www.tpc.org/tpch/) data. [Load this TPC-H data](https://docs.singlestore.com/db/v9.1/introduction/sample-data/load-tpc-h-data-into-singlestore.md) into your cluster, and run the following queries to modify the data for the current use case:

```sql
ALTER TABLE orders ADD COLUMN lineitems_json JSON;

CREATE TABLE new_table (
 l_orderkey bigint(11) NOT NULL,
 lineitems JSON
);

INSERT INTO new_table
SELECT l_orderkey,
      JSON_AGG(
          JSON_BUILD_OBJECT(
              'l_partkey', l_partkey,
              'l_suppkey', l_suppkey,
              'l_linenumber', l_linenumber,
              'l_quantity', l_quantity,
              'l_extendedprice', l_extendedprice,
              'l_discount', l_discount,
              'l_tax', l_tax,
              'l_returnflag', l_returnflag,
              'l_linestatus', l_linestatus,
              'l_shipdate', l_shipdate,
              'l_commitdate', l_commitdate,
              'l_receiptdate', l_receiptdate,
              'l_shipinstruct', l_shipinstruct,
              'l_shipmode', l_shipmode,
              'l_comment', l_comment
          )
      ) as lineitems
FROM lineitem
GROUP BY l_orderkey;

UPDATE orders o
JOIN new_table t ON o.o_orderkey = t.l_orderkey
SET o.lineitems_json = t.lineitems;
```

Here are a few sample query structures that benefit from the optimization.

* The following query calculates the total number of line items for each shipping mode by aggregating and grouping on the JSON array columns:
  ```sql
  SELECT t.table_col::$l_shipmode, sum(t.table_col::%l_quantity) as quantity
  FROM orders
  JOIN TABLE(JSON_TO_ARRAY(lineitems_json)) t
  GROUP BY t.table_col::$l_shipmode;

  ```
  ```output

  +--------------------------+-----------+
  | t.table_col::$l_shipmode | quantity  |
  +--------------------------+-----------+
  | MAIL                     | 218541962 |
  | FOB                      | 218469194 |
  | RAIL                     | 218553230 |
  | AIR                      | 218525124 |
  | REG AIR                  | 218617001 |
  | SHIP                     | 218576988 |
  | TRUCK                    | 218454537 |
  +--------------------------+-----------+
  ```
* The following query calculates the total number of line items for each order priority:
  ```sql
  SELECT o_orderpriority as priority, sum(t.table_col::$l_quantity) as quantity
  FROM orders JOIN TABLE(JSON_TO_ARRAY(lineitems_json)) t
  GROUP BY o_orderpriority;

  ```
  ```output

  +-----------------+-----------+
  | priority        | quantity  |
  +-----------------+-----------+
  | 3-MEDIUM        | 305731841 |
  | 5-LOW           | 305656541 |
  | 2-HIGH          | 306112515 |
  | 1-URGENT        | 306300507 |
  | 4-NOT SPECIFIED | 305936632 |
  +-----------------+-----------+
  ```
* The following query performs an intricate aggregation operation over multiple fields:
  ```sql
  SELECT
     t.table_col::$l_returnflag as r,
     t.table_col::$l_linestatus as s,
     sum(t.table_col::%l_quantity) as sum_qty,
     sum(t.table_col::%l_extendedprice) as sum_base_price,
     sum(t.table_col::%l_extendedprice * (1 - t.table_col::%l_discount)) as sum_disc_price,
     avg(t.table_col::%l_quantity) as avg_qty
  FROM orders
  JOIN TABLE(JSON_TO_ARRAY(lineitems_json)) t
  GROUP by r, s

  ```
  ```output

  +---+---+-----------+--------------------+--------------------+--------------------+
  | r | s | sum_qty   | sum_base_price     | sum_disc_price     | avg_qty            |
  +---+---+-----------+--------------------+--------------------+--------------------+
  | A | F | 377518399 |  566065727797.2633 | 537759104278.07465 | 25.500975103007097 |
  | R | F | 377732830 |  566431054976.0034 |  538110922664.7574 |  25.50838478968014 |
  | N | F |   9851614 | 14767438399.169971 | 14028805792.211407 | 25.522448302840946 |
  | N | O | 764635193 | 1146548935600.9531 | 1089215873201.9531 | 25.498214140059787 |
  +---+---+-----------+--------------------+--------------------+--------------------+
  ```

## Limitation

* The optimization is only applied to the first dimension of an array.

## Implementation Considerations

This section describes some unique behaviors in SingleStore’s implementation of the JSON standard (RFC 4627).

## Infinities and NaNs

SingleStore does not attempt to support entities such as the `nan` (“not a number”) entity, as in `{"value":nan}`. Although the JavaScript language supports `nan` and positive and negative infinities, neither the JSON standard nor SQL’s `DOUBLE` type provide any way to represent such non-finite values. If your application requires these special values, you might try using `null`, `0.0`, or strings (such as `"nan"` and `"inf"`) in their place.

## Unicode Support

JSON strings can contain any of the 65536 Unicode characters in the Basic Multilingual Plane (BMP), either directly as themselves or escaped with JSON’s `\uABCD` notation. JSON normalization converts valid `\uABCD` escape sequences and surrogate-pair sequences into UTF‑8 byte sequences in the stored JSON value.

In JSON, Unicode characters outside the BMP cannot be represented with a single `\uABCD` escape sequence because their codepoints exceed the 16-bit range. Instead, they are encoded using a pair of `\uABCD` escape sequences known as a surrogate pair. A surrogate pair consists of a high surrogate (in the range `\uD800` to `\uDBFF`) followed by a low surrogate (in the range `\uDC00` to `\uDFFF`).

For example, the winking face emoji  (U+1F609) is represented as the surrogate pair `\uD83D\uDE09`.

In SingleStore, a backslash (`\`) literal in a string must be escaped with a backslash. Therefore, pattern strings containing backslashes will have two backslash characters (`\\`).

Here is an example of how to use escaped notation for characters in the BMP:

```sql
SELECT '{"a":"\\u00F9"}' :> JSON;

```

```output

+---------------------------+
| '{"a":"\\u00F9"}' :> JSON |
+---------------------------+
| {"a":"ù"}                 |
+---------------------------+
```

A surrogate-pair example for a non‑BMP character (the winking face emoji ):

```
SELECT '{"msg":"Hello \\uD83D\\uDE09"}' :> JSON;

```

```output

+----------------------------------------+
|'{"msg":"Hello \\uD83D\\uDE09"}' :> JSON|
+----------------------------------------+
| {"msg":"Hello "}                     |
+----------------------------------------+
```

Like MySQL,When using the `utf8mb4` character set, SingleStore supports characters outside the Basic Multilingual Plane (characters whose codepoints are in the range 0x10000 to 0x10FFFF) in JSON strings. This includes some uncommon Chinese characters and symbols such as emojis.These non-BMP characters can be inserted either directly or using UTF‑16 surrogate pair notation.

The `json_collation` engine variable must be set to `utf8mb4_bin` for these characters to work. Refer to [Sync Variables Lists](https://docs.singlestore.com/db/v9.1/reference/configuration-reference/engine-variables/list-of-engine-variables/#sync-variables-lists-2.md) for more information.

## Character Encodings and Collation

`Collation` provides a set of rules to use when comparing characters in a character set. By default, JSON columns use the collation specified in the `json_collation` engine variable ([Sync Variables Lists](https://docs.singlestore.com/db/v9.1/reference/configuration-reference/engine-variables/list-of-engine-variables/#sync-variables-lists-3.md)). The collation of a JSON column affects the following:

* Output of `SELECT DISTINCT`, `ORDER BY`, and other queries that compare entire values.
* Identification of duplicate keys inside a single JSON object during normalization. For example, whether the string `{"Schlüssel":1,"Schluessel":2}` is normalized to `{"Schluessel":2}`.
* Sorting of keys inside a single JSON object. For example, whether the string `{"Schlüssel":1,"Schluff":2}` is normalized to `{"Schluff":2,"Schlüssel":1}`.

The default collation generally provides the desired behavior. However, you can override the default at the table or column level or by using the `json_collation` engine variable.

> **📝 Note**: SingleStore recommends that you use the `utf8mb4_bin` collation for all JSON columns.

## Table-level Override

In the following example, for the `users` table, both `userdata` and `lastname` use the table collation (which has been specified as `utf8_unicode_ci`).

```sql
CREATE TABLE users (
   uid INT AUTO_INCREMENT PRIMARY KEY,
   userdata JSON,
   lastname AS data::name::$last PERSISTED TEXT) COLLATE=utf8_unicode_ci;

```

## Column-level Override

In the following example, for the `orders` table:

* The `data` column uses `utf8_unicode_ci`.
* The `productdetails` column uses `utf8_bin`.
* Both the `productname` column and the `comments` column use `utf8_general_ci`, which is SingleStore’s default database collation

```sql
CREATE TABLE orders (
    oid INT AUTO_INCREMENT PRIMARY KEY,
    data JSON COLLATE utf8_unicode_ci,
    productname AS data::product::$name PERSISTED TEXT,
    productdetails AS data::product::$details PERSISTED TEXT COLLATE utf8_bin,
    comments VARCHAR(256));

```

## Expression-level Override

The `:>` operator allows you to specify a collation for different expressions in a query.

```sql
:> text collate <colation_name>

```

The following example uses a binary collation (`utf8_bin`) for the first expression and a case-insensitive collation (`utf8_general_ci`) for the second expression.

```sql
SELECT *
FROM sets
WHERE sets.json_field::$x :> text collate utf8_bin = "string1"
AND sets.json_field::$y :> text collate utf8_general_ci = "string2";

```

## Maximum JSON Value Size

Under the hood, JSON data is stored in `LONGBLOB` columns. While the DDL will allow you to specify a length of up to 4GB, there is an internal limit applied when assigning a value to a string or JSON field that caps the maximum size of a single value to `max_allowed_packet`. This is 100MB by default and can be set to up to 1GB.

## Performance Impact on Parquet Encoding for JSON

When using seekable JSON/JSON Parquet encoding a major impact on performance is if a schema is dense or sparse.

A node in a tree is considered to be dense if it occurs in most JSON rows. A schema is said to be dense if most of the nodes in the entire schema are dense. Otherwise, the schema is considered to be sparse.

Consider the following JSON data table:

| JSON Data (dense)       |
| ----------------------- |
| `{"a":1, "b":1}`        |
| `{“a”:2, “c”:{“d”: 1}}` |
| `{“c”:{“d”:2, “e”:3}}`  |

The schema will infer the JSON rows as follows:

![](https://images.contentstack.io/v3/assets/bltac01ee6daa3a1e14/blt4c1dfd3f6ed174a9/6a2c425d9b3a1023b5a5e0f5/Dense_Schema-HvoMbx.png)

Using the image above the leaves of the tree become internal columns in the JSON encoding. This is an example of a dense schema.

| a    | b    | c::d | c::e |
| ---- | ---- | ---- | ---- |
| 1    | 1    | NULL | NULL |
| 2    | NULL | 1    | NULL |
| NULL | NULL | 2    | 3    |

Using the following JSON data table:

| JSON Data (sparse) |
| ------------------ |
| `{“a”: 1}`         |
| `{“b”: 1}`         |
| `{“c”: 1}`         |
| `{“d”: 1}`         |
| `{“e”: 1}`         |

The JSON rows will be encoded as follows:

| a    | b    | c    | d    | e    |
| ---- | ---- | ---- | ---- | ---- |
| 1    | NULL | NULL | NULL | NULL |
| NULL | 1    | NULL | NULL | NULL |
| NULL | NULL | 1    | NULL | NULL |
| NULL | NULL | NULL | 1    | NULL |
| NULL | NULL | NULL | NULL | 1    |

The preceding table represents a sparse schema caused by the NULLs not being part of the original JSON strings. This results in poor performance since the NULLs will need to be counted which increases the execution time and memory usage.

To prevent an overly sparse schema from being inferred, SingleStore uses a method where if the average of a key’s children is too low (<1%) in relation to the key itself, the key is stored as a string without inferring its children.

## Related Topics

* See: [JSON\_AGG](https://docs.singlestore.com/db/v9.1/reference/sql-reference/json-functions/json-agg/#section-idm4605787636892833501997993222.md)
* [JSON\_COLUMN\_SCHEMA](https://docs.singlestore.com/db/v9.1/reference/information-schema-reference/segment-column-and-merge-status/json-column-schema.md)
* Training: [Working with JSON](https://training.singlestore.com/learn/course/internal/view/elearning/632/working-with-json-data)

***

Modified at: March 2, 2026

Source: [/db/v9.1/create-a-database/using-json/](https://docs.singlestore.com/db/v9.1/create-a-database/using-json/)

(An index of the documentation is available at /llms.txt)
